Cardiovascular Disease Prediction Using A Custom Voting Ensemble Machine Learning Approach

Authors

  • K. Akilandeswari
  • N. Muthumani

DOI:

https://doi.org/10.64252/af7kre80

Keywords:

Cardiovascular Disease, Machine Learning, Ensemble Learning, Voting Classifier, Logistic Regression, Random Forest, XGBoost, Medical Prediction.

Abstract

Cardiovascular disease (CVD) continues to pose a significant threat to global health: it accounts for a high proportion of deaths and increases the burden on health care systems. Early recognition and accurate prediction of CVD are needed to help clinicians intervene in time and plan preventive care. In this research we describe a novel methodology for CVD prediction based on a customized Voting Ensemble machine learning model, which applies ensemble learning principles by combining several classifiers to improve diagnostic accuracy. The main impetus for the research was the increasing demand, driven by the complexity of today's health care systems, for sophisticated, analytics-enabled decision-support tools. Many researchers have applied various machine learning models to heart disease prediction, but any single model is constrained by the limitations inherent in classifiers that rely on mixed clinical data. To overcome this problem, the research proposes a soft voting ensemble of three popular and interpretable machine learning models (Logistic Regression, Random Forest, and XGBoost) that offers interpretability, robustness, and predictive quality. The model was trained on the Cleveland Heart Disease dataset, which contains 297 patient records with 13 clinical features, including age, sex, chest pain type, total cholesterol, resting blood pressure, and resting ECG results. The data were cleaned, standardized, and feature scaled, and the model was then evaluated with 80 percent of the data used for training and 20 percent held out for testing. The purpose of the project is to refine the prediction of heart disease risk by creating a precise and effective ensemble-based predictive modelling system.
The research aims to produce a machine learning algorithm, based on clinical data collected from patients seen in standard daily practice, that is technically viable and clinically usable for physicians identifying patients who should be assessed for the risk of developing cardiovascular disease.
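The pipeline described in the abstract (standardization, an 80/20 split, and a soft voting ensemble of three classifiers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are a synthetic stand-in with the same shape as the Cleveland dataset (297 rows, 13 features), scikit-learn's GradientBoostingClassifier is swapped in for XGBoost so that the sketch needs only one library, and all hyperparameters and the random seed are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# ASSUMPTION: synthetic placeholder data, NOT the Cleveland dataset.
# Only the shape (297 records, 13 features, binary label) mirrors the abstract.
X, y = make_classification(
    n_samples=297, n_features=13, n_informative=8, random_state=0
)

# 80 percent for training, 20 percent held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Soft voting averages the class probabilities of the base models
# and predicts the class with the highest mean probability.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        # ASSUMPTION: stand-in for XGBoost (xgboost.XGBClassifier).
        ("gb", GradientBoostingClassifier(random_state=0)),
    ],
    voting="soft",
)

# Standardize features inside the pipeline so that the scaler is
# fitted on the training split only.
ensemble = make_pipeline(StandardScaler(), voter)
ensemble.fit(X_train, y_train)

proba = ensemble.predict_proba(X_test)  # averaged class probabilities
pred = ensemble.predict(X_test)         # class with the highest mean probability
print("held-out accuracy on synthetic data:", (pred == y_test).mean())
```

Placing the scaler inside the pipeline is a design choice made here to avoid leaking test-set statistics into training; the abstract does not state how the authors ordered scaling and splitting. The printed accuracy describes the synthetic data only and says nothing about performance on real patient records.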

Published

2025-08-20

Issue

Section

Articles

How to Cite

Cardiovascular Disease Prediction Using A Custom Voting Ensemble Machine Learning Approach. (2025). International Journal of Environmental Sciences, 407-415. https://doi.org/10.64252/af7kre80