A Comparative Study of Machine Learning Techniques in Heart Disease Detection
DOI: https://doi.org/10.5281/zenodo.4515583
Keywords: CVD, Machine Learning, Feature Selection, Grid Search, K-Fold Validation, SVM, Decision Tree, Random Forest, KNN, Multi-Layer Perceptron, Gaussian Naive-Bayes, Bagging
Abstract
Cardiovascular diseases (CVDs) are the world's leading cause of mortality, accounting for an estimated 31% of all deaths worldwide. Of the 17.9 million deaths per year due to CVDs, three-fourths occur because no systems are in place to predict the occurrence of a heart attack and warn the patient or doctor to take appropriate action. Data from clinical reports and examination reports prepared by doctors are available for prediction through ERP models. Data science and reliable AI-powered algorithms can be used to develop medical devices that predict such CVD incidents. This paper implements seven common classifiers that are computationally inexpensive and easy to implement, and compares their performance metrics. Two feature selection techniques are implemented, and Grid Search is used for hyper-parameter tuning. The classifiers are then evaluated with k-fold cross-validation, which yields classification metrics such as accuracy, F1-score, recall, and precision. The study finds that the combination of the Random Forest classifier and the SelectKBest feature selector achieves the highest accuracy (89.706%) and precision (89.655%).
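The workflow summarised in the abstract (feature selection, Grid Search tuning, and k-fold evaluation) can be sketched with scikit-learn roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic and generated with make_classification, the parameter grid and the choice of five stratified folds are arbitrary, and f_classif is assumed as the SelectKBest scoring function, since the abstract specifies none of these.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (assumption): NOT the dataset used in the paper.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Feature selection and classifier chained so that selection is re-fit
# inside every cross-validation fold (avoids leaking held-out data).
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", RandomForestClassifier(random_state=0)),
])

# Illustrative grid (assumption): the paper's actual grid is not given here.
param_grid = {
    "select__k": [5, 8, 13],
    "clf__n_estimators": [25, 50],
    "clf__max_depth": [None, 5],
}

metrics = ["accuracy", "precision", "recall", "f1"]
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Grid Search scores every parameter combination with k-fold
# cross-validation; the best combination is chosen by mean accuracy.
search = GridSearchCV(
    pipeline, param_grid, cv=cv, scoring=metrics, refit="accuracy"
)
search.fit(X, y)

# Mean cross-validated metrics of the best parameter combination.
best = search.best_index_
summary = {m: search.cv_results_[f"mean_test_{m}"][best] for m in metrics}

print("best parameters:", search.best_params_)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Swapping RandomForestClassifier for another estimator (for example SVC or KNeighborsClassifier, with a matching grid) gives the kind of side-by-side comparison the abstract describes. Note that metrics read from the same folds used for tuning tend to be optimistic; a separate held-out set or nested cross-validation gives a less biased estimate.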
License
Copyright (c) 2021 Perspectives in Communication, Embedded-systems and Signal-processing - PiCES
This work is licensed under a Creative Commons Attribution 4.0 International License.