Comparison of Naive Bayes and Decision Tree Methods in Breast Cancer Classification

Authors

DOI:

https://doi.org/10.37034/medinftech.v3i4.112

Keywords:

Breast Cancer, Decision Tree, Early Diagnosis, Machine Learning, Naive Bayes

Abstract

Early diagnosis of breast cancer is a critical factor in improving recovery rates and reducing cancer-related mortality. This study compares the performance of two machine learning algorithms widely used in medical data classification, Naive Bayes and Decision Tree, in detecting breast cancer using the Breast Cancer Wisconsin (Diagnostic) dataset. The dataset consists of 569 samples with 30 numerical features and one target label. The methodology includes data preprocessing, model training, and performance evaluation using six metrics: accuracy, precision, recall, F1-score, AUC, and MCC. Naive Bayes achieved higher overall performance, with 96.5% accuracy, 97.6% precision, 93.0% recall, 95.2% F1-score, 0.997 AUC, and 0.925 MCC, compared to Decision Tree with 93.9% accuracy, 90.9% precision, 93.0% recall, 92.0% F1-score, 0.936 AUC, and 0.87 MCC. The two models tie on recall, so the gap comes from Naive Bayes producing fewer false positives and ranking cases better (higher AUC). Confusion matrix and ROC curve analyses support these results, particularly in minimizing classification errors. While Decision Tree offers better interpretability, Naive Bayes may be more suitable for early breast cancer detection under similar dataset conditions. Future studies could explore ensemble approaches to combine the strengths of both methods.
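As a reading aid, the sketch below shows how five of the six reported metrics follow from a binary confusion matrix, and how AUC can be computed from ranking scores. The abstract does not report confusion-matrix counts or the test-set size, so the counts used here are hypothetical: they assume a 114-sample hold-out set (20% of 569) and were chosen only because they reproduce the reported percentages. The function names and the toy scores in the AUC example are likewise illustrative and are not taken from the paper.

```python
from math import sqrt


def _ratio(num, den):
    """Divide safely, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def confusion_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, F1 and MCC from binary confusion-matrix counts."""
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        # Matthews correlation coefficient: uses all four cells of the matrix.
        "mcc": _ratio(
            tp * tn - fp * fn,
            sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
        ),
    }


def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This O(n^2) form is meant for
    illustration, not for large datasets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# ASSUMPTION: hypothetical counts (malignant = positive class), not from the paper.
nb = confusion_metrics(tp=40, fp=1, fn=3, tn=70)  # Naive Bayes
dt = confusion_metrics(tp=40, fp=4, fn=3, tn=67)  # Decision Tree

for name, m in (("Naive Bayes", nb), ("Decision Tree", dt)):
    print(name, {k: round(v, 3) for k, v in m.items()})

# Toy scores (hypothetical) to show the AUC computation on four samples.
toy_auc = auc_from_scores([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print("toy AUC:", toy_auc)
```

Under these assumed counts, both models miss the same three malignant cases (equal recall), and the difference in accuracy, precision, F1-score, and MCC comes entirely from one false positive for Naive Bayes versus four for Decision Tree. AUC cannot be recovered from a confusion matrix, which is why it is computed from scores separately.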

Published

2025-12-31

How to Cite

[1] D. N. Sulistyowati, S. Hadianti, and N. A. Mayangky, “Comparison of Naive Bayes and Decision Tree Methods in Breast Cancer Classification,” MEDINFTech, vol. 3, no. 4, pp. 143–148, Dec. 2025.
