Early Prediction of Stroke Risk Using Machine Learning Approaches and Imbalanced Data

DOI:

https://doi.org/10.56286/1vf19469

Keywords:

Decision Tree, Imbalanced Data, KNN, LDA, Naïve Bayes, Machine Learning Models

Abstract

Classifying medical datasets with machine learning algorithms can help physicians make accurate diagnoses and choose suitable treatment. Stroke, for instance, is a serious disease that affects many patients every year, and analyzing its symptoms in advance could save patients’ lives. The warning signs of stroke can be used as attributes, or predictors, for machine learning models. This study evaluates the performance of four machine learning models for classifying a stroke dataset. Specifically, Decision Tree, Naïve Bayes, K-Nearest Neighbor (KNN), and Linear Discriminant Analysis (LDA) models were trained on 11 attributes collected from 5110 patients to predict stroke risk. KNN outperformed the other three models, achieving an accuracy of 90%. The study also balanced the data before validating the models in order to obtain more accurate classification. Cross-validation was used to guard against over-fitting and under-fitting during training.
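The role of class balancing can be illustrated with a minimal sketch. The abstract does not say which balancing method was used, so random oversampling is swapped in here purely as an example. The 5% positive rate, the placeholder features, and the helper names (`random_oversample`, `accuracy`, `recall`) are assumptions for illustration, not details from the study.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    is as frequent as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive cases that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)

# Hypothetical data: 5110 records as in the abstract; the number of
# positive (stroke) cases is an ASSUMPTION chosen to be roughly 5%.
n_total = 5110
n_pos = 255
labels = [1] * n_pos + [0] * (n_total - n_pos)
rows = [[float(i)] for i in range(n_total)]  # placeholder feature vectors

# A trivial baseline that always predicts "no stroke".
always_negative = [0] * n_total
baseline_accuracy = accuracy(labels, always_negative)
baseline_recall = recall(labels, always_negative)

# Balance the classes by oversampling the minority class.
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)

print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"baseline recall:   {baseline_recall:.3f}")
print(f"class counts after oversampling: {dict(balanced_counts)}")
```

Under these assumed numbers, the always-negative baseline scores high accuracy while detecting no stroke cases, which is why accuracy on imbalanced data is usually read alongside recall. When oversampling is combined with cross-validation, it is generally applied to the training folds only, so that duplicated records do not leak into the validation fold.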

Published

2025-03-22

Section

Articles

How to Cite

[1]
“Early Prediction of Stroke Risk Using Machine Learning Approaches and Imbalanced Data”, NTU-JET, vol. 4, no. 1, Mar. 2025, doi: 10.56286/1vf19469.
