A Comparative Study of Machine Learning Techniques in Heart Disease Detection

Authors

  • Praneeth Kumar T, Department of CSE, BNM Institute of Technology, Bangalore, India
  • Sriram Praveen V A, Department of CSE, RV College of Engineering, Bangalore, India
  • Rohan Maheshwari, Department of CSE, RV College of Engineering, Bangalore, India
  • Sahana D Gowda, Department of CSE, BNM Institute of Technology, Bangalore, India

DOI:

https://doi.org/10.5281/zenodo.4515583

Keywords:

CVD, Machine Learning, Feature Selection, Grid Search, K-Fold Validation, SVM, Decision Tree, Random Forest, KNN, Multi-Layer Perceptron, Gaussian Naive-Bayes, Bagging

Abstract

Cardiovascular diseases (CVDs) are the world's leading cause of mortality, accounting for an estimated 31% of all deaths worldwide. Of the 17.9 million deaths attributed to CVDs each year, three-fourths occur because no system is in place to predict a heart attack and warn the patient or doctor to take appropriate action. Data from clinical reports and examination reports prepared by doctors are available for prediction through ERP models. Data science and reliable AI-powered algorithms can be used to develop medical devices that predict such CVD incidents. In this paper, seven common classifiers that are computationally inexpensive and easy to implement are implemented, and their performance metrics are compared. Two feature selection techniques are applied, and Grid Search is used for hyper-parameter tuning. The classifiers are then evaluated using k-fold cross-validation, which yields classification metrics such as accuracy, F1-score, recall, and precision. The study shows that the combination of the Random Forest classifier and the SelectKBest feature selector achieves the highest accuracy, 89.706%, with a precision of 89.655%.
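
The workflow summarized in the abstract (feature selection, Grid Search tuning, k-fold evaluation) can be sketched with scikit-learn as follows. This is only an illustrative sketch, not the authors' code: the synthetic data from `make_classification`, the `f_classif` score function, the parameter grid, the fold counts, and the nested cross-validation layout are all assumptions made here, since the abstract does not specify them.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline

# Synthetic stand-in for a tabular heart-disease dataset (assumption:
# 300 patients, 13 numeric features, binary label).
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Feature selection and classifier are chained so that selection is refit
# inside every training fold, which avoids leaking test-fold information.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),   # score function is assumed
    ("clf", RandomForestClassifier(random_state=0)),
])

# Hypothetical search space; the paper's actual grid is not given in the abstract.
param_grid = {
    "select__k": [5, 13],
    "clf__n_estimators": [50, 100],
    "clf__max_depth": [None, 5],
}

# Inner folds tune hyper-parameters, outer folds estimate performance
# (nested cross-validation is a design choice of this sketch).
inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

search = GridSearchCV(pipeline, param_grid, cv=inner_cv, scoring="accuracy")

metric_names = ["accuracy", "f1", "recall", "precision"]
scores = cross_validate(search, X, y, cv=outer_cv, scoring=metric_names)

# Average each metric over the outer folds.
mean_scores = {name: scores[f"test_{name}"].mean() for name in metric_names}
for name, value in mean_scores.items():
    print(f"{name}: {value:.3f}")
```

Swapping `RandomForestClassifier` for any of the other classifiers named in the keywords only requires changing the `clf` step and the `clf__` entries of the grid; the numbers printed on synthetic data are not comparable to the results reported in the paper.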

Published

2021-02-07

How to Cite

[1]
P. Kumar T, S. P. V A, R. Maheshwari, and S. D. Gowda, “A Comparative Study of Machine Learning Techniques in Heart Disease Detection”, pices, vol. 4, no. 10, pp. 264-272, Feb. 2021.