Socioeconomic Factors and Academic Performance: An Explainable and Predictive Analysis of ENEM in Rio Grande do Norte
DOI:
https://doi.org/10.1590/SciELOPreprints.14701
Keywords:
Machine Learning, Explainable Artificial Intelligence (XAI), Educational Inequality, ENEM, Rio Grande do Norte
Abstract
The National High School Exam (ENEM) is a central instrument for access to higher education in Brazil, but its results reflect profound inequalities. Understanding the factors that determine performance, especially at a regional level, is crucial. However, predictive Machine Learning (ML) models, while accurate, are often “black boxes,” limiting their value for policymaking. This study addresses this gap by applying ML (Random Forest) and Explainable Artificial Intelligence (XAI) techniques, via SHAP, to analyze the performance of 51,091 students from Rio Grande do Norte (RN) in the 2022 ENEM. The results confirm that socioeconomic factors are the most influential predictors. The SHAP analysis quantified how key variables, notably parental education, family income, and school type (public vs. private), create an “opportunity gap” and polarize the results. By unpacking the predictive model, this work provides robust regional evidence on the determinants of educational inequality, offering a basis for developing equity policies that focus on mitigating the impact of socioeconomic origin on students’ academic futures.
Copyright (c) 2026 Rodrigo Tertulino, Ricardo Almeida, Laércio Alencar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Data statement
The research data is contained in the manuscript.


