Hepatitis Prediction Using K-NN, Naive Bayes, Support Vector Machine, Multilayer Perceptron and Random Forest, Gradient Boosting, K-Means

Authors

  • Heru Dwi Saputra, Universitas Nusa Mandiri
  • Ade Irfan Efendi Efendi, Universitas Nusa Mandiri
  • Edwin Rudini, Universitas Nusa Mandiri
  • Dwiza Riana, Universitas Nusa Mandiri
  • Alya Shafira Hewiz, Universitas Airlangga

DOI:

https://doi.org/10.37034/medinftech.v1i4.21

Keywords:

Measuring Accuracy, Precision, Recall, ROC, Best Score

Abstract

Hepatitis is a serious disease that causes inflammation of the liver and is responsible for deaths throughout the world. Detecting this life-threatening disease early can save many lives. In this paper, we predict hepatitis using data mining techniques and propose a feasible approach to improving the performance of the prediction models. We handle missing values in the dataset by replacing them with the mean value. Nine algorithms were applied to the hepatitis dataset to calculate prediction accuracy. We measure accuracy, precision, recall, ROC, and best score, and compare these results with those obtained using random search hyperparameter tuning. The aim is to find the combination of hyperparameters that most improves each machine learning model, which in turn supports a comparison of the performance of the classification models.
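The pipeline described in the abstract (mean imputation, classification, metric calculation, and random search over hyperparameters) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the abstract does not say which library or settings were used, so the function names, the toy records, the choice of k-NN as the example classifier, and the single train/validation split are all assumptions made for illustration. ROC is omitted for brevity.

```python
import math
import random


def impute_mean(rows):
    """Replace None entries with the mean of the observed values in that column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]


def knn_predict(train_X, train_y, x, k):
    """Predict a 0/1 label by majority vote among the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def scores(y_true, y_pred):
    """Compute accuracy, precision, and recall for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def random_search(train_X, train_y, val_X, val_y, k_choices, n_iter, seed=0):
    """Sample k at random n_iter times and keep the value with the best accuracy."""
    rng = random.Random(seed)
    best_k, best_acc = None, -1.0
    for _ in range(n_iter):
        k = rng.choice(k_choices)
        preds = [knn_predict(train_X, train_y, x, k) for x in val_X]
        acc = scores(val_y, preds)["accuracy"]
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k, best_acc


# Hypothetical toy records (two numeric features); None marks a missing value.
raw = [[30.0, 0.7], [45.0, None], [50.0, 3.9], [38.0, 1.0], [61.0, 4.2], [52.0, None]]
labels = [0, 0, 1, 0, 1, 1]

# For brevity the mean is computed over all rows; in practice it should be
# computed on the training split only, to avoid leaking validation data.
X = impute_mean(raw)
best_k, best_acc = random_search(X[:4], labels[:4], X[4:], labels[4:], [1, 3], n_iter=5)
print("best k:", best_k, "validation accuracy:", best_acc)
```

Random search samples candidate settings instead of trying every combination, which keeps the cost fixed at `n_iter` evaluations regardless of how large the search space is. The "best score" here is simply the highest validation accuracy seen during the search.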

Published

2023-12-31

How to Cite

[1]
H. Dwi Saputra, A. I. E. Efendi, E. Rudini, D. Riana, and A. S. Hewiz, “Hepatitis Prediction Using K-NN, Naive Bayes, Support Vector Machine, Multilayer Perceptron and Random Forest, Gradient Boosting, K-Means”, MEDINFTech, vol. 1, no. 4, pp. 96–100, Dec. 2023.

Section

Articles