Optimizing Lung Cancer Prediction Using Evaluating Classification Methods and Sampling Techniques
DOI:
https://doi.org/10.37034/medinftech.v1i1.4
Keywords:
Lung Cancer, Classification, Sampling Techniques, Gboost, Level Based
Abstract
Lung cancer is a highly aggressive cancer and one of the leading causes of death globally. This study aims to improve the detection and prediction of lung cancer by evaluating different classification and sampling approaches. The research uses a dataset of 1000 patients and 24 attributes. The primary goal is to compare the effectiveness of classification methods such as Logistic Regression, AdaBoost, and Gradient Boosting, combined with sampling techniques such as Random Over-Sampling, Random Under-Sampling, and SMOTE by Level Considering, for predicting lung cancer. The assessment metrics are accuracy, precision, recall, and F1-score. The experimental findings show that Gradient Boosting (GBoost) attains 100% accuracy, precision, recall, and F1-score when identifying lung cancer instances within the dataset, which indicates that GBoost predicts lung cancer occurrence accurately on this dataset. These findings are intended to contribute to the development of more effective diagnostic and predictive methods for lung cancer.
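The sampling and scoring steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it uses only the Python standard library, a toy dataset, and hypothetical class labels ("Low", "Medium", "High") assumed from the "Level Based" keyword. The function names are illustrative, and SMOTE is omitted because it synthesizes new points rather than resampling existing rows.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows under their class label, preserving first-seen order."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_under_sample(rows, labels, seed=0):
    """Randomly drop rows until every class matches the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(group) for group in groups.values())
    out_rows, out_labels = [], []
    for label, group in groups.items():
        out_rows.extend(rng.sample(group, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy imbalanced data (hypothetical): 6 "High", 3 "Low", 1 "Medium".
rows = [[i] for i in range(10)]
labels = ["High"] * 6 + ["Low"] * 3 + ["Medium"] * 1

over_rows, over_labels = random_over_sample(rows, labels)
under_rows, under_labels = random_under_sample(rows, labels)
print(Counter(over_labels))   # every class now has 6 rows
print(Counter(under_labels))  # every class now has 1 row

# Score a small set of made-up predictions for the "High" class.
print(precision_recall_f1(["High", "High", "Low", "Low"],
                          ["High", "Low", "High", "Low"], "High"))
```

Resampling is usually applied to the training split only, so that duplicated rows do not also appear in the data used to compute the metrics.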
Copyright (c) 2023 Journal Medical Informatics Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.