Preprint / Version 1

Classification Models Analysis for Stroke Prediction

DOI:

https://doi.org/10.1590/SciELOPreprints.7182

Keywords:

machine learning, stroke prediction, classification models, data preprocessing, Random Forest, Support Vector Machine, healthcare, feature importance analysis, classification metrics, confusion matrices, public health

Abstract

This study applies machine learning to the prediction of stroke occurrences, a critical task in healthcare with the potential to save lives and reduce the impact of this life-altering medical event. Using the "Healthcare Stroke Data" dataset, we trained two classification models, Random Forest and Support Vector Machine (SVM), to predict stroke likelihood. Our analysis covers data preprocessing, model training, and evaluation using classification metrics and confusion matrices. The study reveals trade-offs among accuracy, recall, precision, and F1 score in both models. While the Random Forest achieves higher accuracy, the SVM achieves higher recall, a crucial factor in healthcare. Precision remains a challenge for both models, which highlights the need for further refinement. Additionally, we conducted a feature importance analysis, which emphasizes the pivotal role of age, BMI, and glucose levels in stroke prediction. This work illustrates the potential of machine learning in healthcare and contributes to ongoing efforts to improve stroke prediction and prevention.
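
The trade-off among accuracy, precision, recall, and F1 score mentioned in the abstract can be made concrete with a small worked example. The sketch below computes the four metrics from a binary confusion matrix. The `ConfusionMatrix` class and both sets of counts are hypothetical and chosen only for illustration; they are not results from the study and do not correspond to either of its models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix; the positive class is 'stroke'."""
    tp: int  # stroke cases correctly flagged
    fp: int  # non-stroke cases incorrectly flagged
    fn: int  # stroke cases missed
    tn: int  # non-stroke cases correctly cleared

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts (NOT from the study): 1,000 patients, 50 with a stroke.
# A conservative classifier rarely predicts "stroke".
conservative = ConfusionMatrix(tp=5, fp=5, fn=45, tn=945)
# A sensitive classifier flags many more patients.
sensitive = ConfusionMatrix(tp=40, fp=260, fn=10, tn=690)

for name, cm in [("conservative", conservative), ("sensitive", sensitive)]:
    print(
        f"{name}: accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
        f"recall={cm.recall():.3f} f1={cm.f1():.3f}"
    )
```

With these illustrative counts, the conservative classifier reaches 95% accuracy while detecting only 10% of stroke cases, whereas the sensitive one detects 80% of them at 73% accuracy and low precision. On an imbalanced dataset, accuracy alone can therefore hide poor recall.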

Posted

27/10/2023

How to Cite

Classification Models Analysis for Stroke Prediction. (2023). In SciELO Preprints. https://doi.org/10.1590/SciELOPreprints.7182

Series

Engineering
