Optimizing Lung Cancer Prediction Using Evaluating Classification Methods and Sampling Techniques

Dika Putri Metalica; Fahmi B Marasabessy

doi:10.37034/medinftech.v1i1.4

Authors

Dika Putri Metalica Universitas Nusa Mandiri Jakarta
Fahmi B Marasabessy Universitas Nusa Mandiri Jakarta

DOI:

https://doi.org/10.37034/medinftech.v1i1.4

Keywords:

Lung Cancer, Classification, Sampling Techniques, GBoost, Level Based

Abstract

Lung cancer is an extremely aggressive type of cancer and one of the leading causes of death globally. This study aims to improve the detection and prediction of lung cancer by evaluating different combinations of classification methods and sampling techniques. The research uses a dataset comprising 1,000 patients and 24 attributes. The primary goal is to compare the effectiveness of three classification methods (Logistic Regression, AdaBoost, and Gradient Boosting) in conjunction with three sampling techniques (Random Over-Sampling, Random Under-Sampling, and SMOTE by Level Considering) for predicting lung cancer. The assessment metrics are accuracy, precision, recall, and F1-score. The experimental findings show that Gradient Boosting (GBoost) attains 100% accuracy, precision, recall, and F1-score when identifying lung cancer instances within the dataset, which indicates that GBoost predicts lung cancer occurrence accurately on this data. The findings are intended to contribute to the development of more effective diagnostic and predictive methods for lung cancer.
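The sampling techniques and assessment metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (random_over_sample, random_under_sample, binary_metrics), the toy data, and the two-class setting are assumptions made for illustration, and the classifiers and SMOTE are left out. It uses only the Python standard library.

```python
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """Keep a random subset of each class so that every class has
    as many rows as the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        rows = [x for x, lab in zip(X, y) if lab == label]
        for x in rng.sample(rows, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1-score for one positive class
    (a binary simplification chosen for this sketch)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical imbalanced toy data: 3 positive rows, 7 negative rows.
    X = [[i] for i in range(10)]
    y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    print(Counter(random_over_sample(X, y)[1]))
    print(Counter(random_under_sample(X, y)[1]))
    print(binary_metrics(y, [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]))
```

In practice, resampling is usually applied to the training split only, so that duplicated rows do not also appear in the test data and inflate the reported scores.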
