Classification Models Analysis for Stroke Prediction
DOI:
https://doi.org/10.1590/SciELOPreprints.7182
Keywords:
machine learning, stroke prediction, classification models, data preprocessing, Random Forest, Support Vector Machine, healthcare, feature importance analysis, classification metrics, confusion matrices, public health
Abstract
This study applies machine learning to the prediction of stroke occurrence, a critical healthcare task with the potential to save lives and reduce the impact of a life-altering medical event. Using the "Healthcare Stroke Data" dataset, we trained two classification models, Random Forest and Support Vector Machine (SVM), to estimate stroke likelihood. Our analysis covers data preprocessing, model training, and evaluation using classification metrics and confusion matrices. The results reveal trade-offs among accuracy, recall, precision, and F1 score in both models. While the Random Forest achieves higher accuracy, the SVM achieves higher recall, which is crucial in healthcare because a missed stroke case (a false negative) is typically costlier than a false alarm. Both models struggle with precision, which highlights the need for further refinement. We also conducted a feature importance analysis, which emphasizes the pivotal role of age, BMI, and glucose levels in stroke prediction. This work illustrates the potential of machine learning in healthcare and contributes to ongoing efforts to improve stroke prediction and prevention.
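The trade-off between accuracy and recall described in the abstract can be illustrated with a minimal sketch that derives the four metrics from a confusion matrix. The `ConfusionMatrix` class and all counts below are hypothetical and chosen only for illustration; they are not the study's results, and the two example models are not the paper's Random Forest or SVM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Hypothetical helper: binary confusion matrix, stroke = positive class."""
    tp: int  # stroke cases correctly flagged
    fp: int  # non-stroke cases wrongly flagged as stroke
    fn: int  # stroke cases missed
    tn: int  # non-stroke cases correctly cleared

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        actual = self.tp + self.fn
        return self.tp / actual if actual else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts (assumed, not from the paper): 1,000 patients, 50 strokes.
conservative = ConfusionMatrix(tp=5, fp=5, fn=45, tn=945)    # rarely predicts stroke
sensitive = ConfusionMatrix(tp=40, fp=260, fn=10, tn=690)    # flags many patients

for name, cm in [("conservative", conservative), ("sensitive", sensitive)]:
    print(
        f"{name}: accuracy={cm.accuracy():.3f} precision={cm.precision():.3f} "
        f"recall={cm.recall():.3f} f1={cm.f1():.3f}"
    )
```

With these assumed counts, the conservative model has the higher accuracy (0.950 versus 0.730) but finds only 10% of stroke cases, while the sensitive model finds 80% of them at the cost of many false alarms. On an imbalanced dataset, accuracy alone can therefore hide poor detection of the rare class.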
Copyright (c) 2023 Dheiver Francisco Santos

This work is licensed under a Creative Commons Attribution 4.0 International License.
Data statement
- The research data is available in one or more data repository(ies)


