A Machine Learning Model for the Prediction of Heart Attack Risk in High-Risk Patients Utilizing Real-World Data


  • Ridwan B. Marqas Department of Computer Science, College of Science, Nawroz University, Duhok, Iraq.
  • Abdulazeez Mousa Department of Computer Science, College of Science, University, Duhok, Iraq
  • Fatih Özyurt Department of Software Engineering, College of Engineering, Firat University, Elazig, Turkey
  • Rojhat Salih Department of Computer Science, College of Science, University, Duhok, Iraq




Heart disease is a major global public health concern that affects a vast number of people worldwide. Early identification of patients at risk of heart attack can significantly reduce mortality. In this study, we used machine learning methods to develop a model that predicts the likelihood of a heart attack. To build the model, we collected a real-world dataset of patient features, including demographic information, medical history, and lifestyle factors. We pre-processed the data to eliminate missing values and standardized the features so that they share a common scale across the dataset. We also applied feature engineering techniques to identify the factors that contribute most to the development of heart attacks. We evaluated several machine learning algorithms, such as logistic regression, decision trees, and random forest, and compared them using standard metrics: accuracy, precision, recall, F1-score, the Matthews correlation coefficient, the ROC curve, and AUC. The resulting model produced highly accurate predictions of heart attack risk, and our results demonstrate that machine learning algorithms can effectively predict heart attacks and identify high-risk patients. The model can be integrated into electronic health records to support prompt identification and intervention by healthcare providers. Our study has limitations, however, including the need for validation on a larger and more diverse dataset and the difficulty of interpreting the model. Future research may incorporate additional data sources, more advanced machine learning techniques, and improved model interpretability. Our heart attack prediction model has significant potential as a tool for healthcare practitioners to identify high-risk patients and reduce heart attack rates.
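The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = heart attack, 0 = no heart attack) and uses a small made-up set of labels and scores, not data from the study. The function names `confusion_counts`, `classification_metrics`, and `roc_auc` are hypothetical and are not taken from the authors' implementation.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 and Matthews correlation coefficient."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only, not results from the paper).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(classification_metrics(toy_true, toy_pred))
print(roc_auc(toy_true, toy_scores))
```

The Matthews correlation coefficient uses all four cells of the confusion matrix, so it is generally more informative than accuracy alone when the classes are imbalanced, which is common in heart attack datasets.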







How to Cite

Marqas, R. B., Mousa, A., Özyurt, F., & Salih, R. (2023). A Machine Learning Model for the Prediction of Heart Attack Risk in High-Risk Patients Utilizing Real-World Data. Academic Journal of Nawroz University, 12(4), 286–301. https://doi.org/10.25007/ajnu.v12n4a1974



