Preprint / Version 1

Socioeconomic Factors and Academic Performance: An Explainable and Predictive Analysis of ENEM in Rio Grande do Norte

DOI:

https://doi.org/10.1590/SciELOPreprints.14701

Keywords:

Machine Learning, Explainable Artificial Intelligence (XAI), Educational Inequality, ENEM, Rio Grande do Norte

Abstract

The National High School Exam (ENEM) is a central instrument for access to higher education in Brazil, but its results reflect profound inequalities. Understanding the factors that determine performance, especially at a regional level, is crucial. However, predictive Machine Learning (ML) models, while accurate, are often “black boxes,” limiting their value for policymaking. This study addresses this gap by applying ML (Random Forest) and Explainable Artificial Intelligence (XAI) techniques, via SHAP, to analyze the performance of 51,091 students from Rio Grande do Norte (RN) in the 2022 ENEM. The results confirm that socioeconomic factors are the most influential predictors. The SHAP analysis quantified how key variables, notably parental education, family income, and school type (public vs. private), create an “opportunity gap” and polarize the results. By unpacking the predictive model, this work provides robust regional evidence on the determinants of educational inequality, offering a basis for developing equity policies that focus on mitigating the impact of socioeconomic origin on students’ academic futures.
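
As a rough illustration of what a SHAP-style attribution measures, the sketch below computes exact Shapley values for a tiny, hypothetical scoring function with three inputs named after the variables highlighted in the abstract. Everything in it is an assumption made for illustration: `toy_model`, its coefficients, the feature encodings, and the single all-zero baseline are invented and are not the study's Random Forest, data, or results. SHAP libraries estimate the same quantity against a background dataset and use faster algorithms for tree ensembles; this brute-force version only shows the underlying idea.

```python
from itertools import combinations
from math import factorial


def toy_model(x):
    """Hypothetical score function (NOT the study's model; coefficients are invented)."""
    score = 450.0
    score += 20.0 * x["parental_education"]
    score += 15.0 * x["family_income"]
    score += 40.0 * x["private_school"]
    # Invented interaction term: income matters more for private-school students.
    score += 10.0 * x["family_income"] * x["private_school"]
    return score


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `instance` relative to a single `baseline` point."""
    features = list(instance)
    n = len(features)

    def value(coalition):
        # Features in the coalition take the instance's value; the rest stay at baseline.
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in features}
        return model(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


# Hypothetical encodings: ordinal levels for education and income, 0/1 for school type.
baseline = {"parental_education": 0, "family_income": 0, "private_school": 0}
student = {"parental_education": 2, "family_income": 3, "private_school": 1}

phi = shapley_values(toy_model, student, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.1f}")
```

The contributions add up to the difference between the student's predicted score and the baseline prediction, which is the property that lets an attribution of this kind split a single prediction among its input variables. The interaction term is shared equally between the two features involved.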

Author Biography

Rodrigo Tertulino, Instituto Federal do Rio Grande do Norte

PhD in Informatics Engineering from the University of Coimbra, Portugal. Master's degree in Computer Science from UERN/UFERSA. Holds a bachelor's degree in Information Systems and an MBA in Business Management from UNP. Currently a full-time Professor of Computer Networks at IFRN. Research areas: Networks and Distributed Systems (performance evaluation of networked systems, network management); Software Engineering (agile methods and their integration with traditional approaches, object-oriented software development, including refactoring and frameworks); Security (web application security and penetration testing). Currently conducts research on privacy and security in healthcare systems (EHR).

Posted

01/06/2026

How to Cite

Socioeconomic Factors and Academic Performance: An Explainable and Predictive Analysis of ENEM in Rio Grande do Norte. (2026). In SciELO Preprints. https://doi.org/10.1590/SciELOPreprints.14701

Section

Exact and Earth Sciences

Data statement

  • The research data is contained in the manuscript