Optimal Prediction in the Diagnosis of Existing Heart Diseases using Machine Learning: Outlier Data Strategies
Subject Areas : International Journal of Decision Intelligence
Omid Rahmani 1, Seyyed Amir Mahdi Ghoreishi Zadeh 2, Mostafa Setak 3
1 - M.Sc. Student in Engineering, Industrial Engineering Majoring in Healthcare Systems, K. N. Toosi University, Tehran, Iran
2 - M.Sc. Student in Industrial Engineering Majoring in Macro Engineering Systems, K. N. Toosi University, Tehran, Iran
3 - Associate Professor, Department of Industrial Engineering, Economic and Social Systems, K. N. Toosi, Tehran, Iran
Keywords: Decision tree, Heart disease, Naïve Bayes' Classifier, Support Vector Classifier, Winsorized and Logarithmic transformation methods, Wrapper and Embedded methods
Abstract :
Heart disease is a prevalent and life-threatening condition that poses significant challenges to healthcare systems worldwide. Accurate and timely diagnosis is crucial for effective treatment and patient management. In recent years, machine learning algorithms have emerged as powerful tools for identifying individuals at risk of heart disease. This article highlights the importance of heart disease diagnosis and explores the potential of machine learning algorithms to improve diagnostic accuracy. We developed a model for predicting heart disease on the Cleveland patient dataset. The novelty of this research lies in identifying and handling outliers using Winsorization and logarithmic transformation. We also used wrapper and embedded feature selection methods to determine the most important features for diagnosing heart disease. In addition to the usual features, exercise-induced angina and the number of major vessels were found to be important. We then compared the performance of four machine learning algorithms (KNN, Naïve Bayes' Classifier, Decision Tree, and Support Vector Classifier) to determine the best algorithm for predicting heart disease. The Decision Tree algorithm performed best, with an accuracy of 97.95%.
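The two outlier strategies named in the abstract can be illustrated with a minimal sketch. The abstract does not state the percentile cut-offs, the features treated, or the exact form of the logarithm, so the 5th/95th percentile bounds, the log(1 + x) variant, the function names, and the sample cholesterol values below are all assumptions chosen for illustration and are not taken from the study.

```python
import math


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("empty input")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def winsorize(values, lower_q=5.0, upper_q=95.0):
    """Clip values outside the [lower_q, upper_q] percentiles to those bounds.

    The 5/95 cut-offs are an assumed default, not the paper's setting.
    """
    ordered = sorted(values)
    lo_bound = percentile(ordered, lower_q)
    hi_bound = percentile(ordered, upper_q)
    return [min(max(v, lo_bound), hi_bound) for v in values]


def log_transform(values):
    """Compress a right-skewed, non-negative feature with log(1 + x)."""
    if any(v < 0 for v in values):
        raise ValueError("log(1 + x) sketch expects non-negative values")
    return [math.log1p(v) for v in values]


# Hypothetical cholesterol-like readings with one extreme value.
chol = [210, 225, 198, 240, 233, 564, 205, 219, 247, 201]

chol_winsorized = winsorize(chol)   # extreme values pulled to the bounds
chol_logged = log_transform(chol)   # order preserved, spread compressed
```

Winsorization keeps every record and only caps extreme values, whereas the logarithmic transformation rescales the whole feature. Which of the two suits a given feature depends on its distribution, and the abstract does not say how that choice was made.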