Identification of Cell Images in Pap Smear Using GLCM and Classification Methods in Machine Learning
DOI:
https://doi.org/10.37034/medinftech.v3i3.85
Keywords:
Cervical Cancer, Gray-Level Co-occurrence Matrix, Machine Learning, Pap Smear Classification, Texture Feature Extraction
Abstract
Early detection of cervical cancer is critical for improving patient outcomes, and accurate classification of Pap smear images supports clinical decision-making. This study aimed to improve cervical cancer diagnosis by classifying Pap smear images using texture features. A dataset of 250 images across five classes underwent preprocessing, including grayscale conversion and noise removal. Texture features such as contrast, dissimilarity, homogeneity, energy, correlation, and Angular Second Moment (ASM) were extracted using the Gray-Level Co-occurrence Matrix (GLCM). These features were then used to train and evaluate four machine learning algorithms: Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Neural Networks (NN). The Decision Tree model achieved the highest accuracy, 95%, outperforming Neural Networks, which reached 74%. The ensemble methods RF and GB showed robust performance across classes. These results demonstrate the effectiveness of GLCM-based feature extraction combined with Decision Tree classification for accurate and reliable Pap smear image analysis. This approach offers valuable insights for enhancing clinical decision support in cervical cancer diagnosis.
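As an illustration of the texture features named in the abstract, the following minimal sketch computes a GLCM and the six features with NumPy. It is not the authors' implementation: the function names `glcm` and `texture_features`, the single horizontal offset, the symmetric normalization, and the definition of energy as the square root of ASM (the convention used by scikit-image) are all assumptions made for the example.

```python
import numpy as np


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one offset (dx, dy >= 0).

    Assumes `image` is a 2D array of integer gray levels in [0, levels).
    """
    image = np.asarray(image).astype(np.intp)
    rows, cols = image.shape
    # Pair every pixel with its neighbor dy rows down and dx columns right.
    ref = image[:rows - dy, :cols - dx]
    nbr = image[dy:, dx:]
    counts = np.zeros((levels, levels), dtype=float)
    np.add.at(counts, (ref.ravel(), nbr.ravel()), 1.0)
    counts += counts.T  # count each pair in both directions
    return counts / counts.sum()


def texture_features(p):
    """Six texture features from a normalized GLCM `p`."""
    i, j = np.indices(p.shape)
    diff = i - j
    asm = float(np.sum(p ** 2))
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum((i - mu_i) ** 2 * p))
    sd_j = np.sqrt(np.sum((j - mu_j) ** 2 * p))
    if sd_i > 0 and sd_j > 0:
        corr = float(np.sum((i - mu_i) * (j - mu_j) * p) / (sd_i * sd_j))
    else:
        corr = 1.0  # constant image: correlation set to 1 by convention (assumption)
    return {
        "contrast": float(np.sum(p * diff ** 2)),
        "dissimilarity": float(np.sum(p * np.abs(diff))),
        "homogeneity": float(np.sum(p / (1.0 + diff ** 2))),
        "ASM": asm,
        "energy": float(np.sqrt(asm)),  # assumed convention: sqrt(ASM)
        "correlation": corr,
    }


# Toy 4-level image, used only to exercise the functions.
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = texture_features(glcm(toy, levels=4))
print(features)
```

In a pipeline like the one described, one such feature vector per preprocessed grayscale image would form a row of the training matrix passed to the classifiers. Averaging features over several offsets and angles is a common variation, though the abstract does not state which offsets were used.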