Predictive Modeling of Osteoporosis Risk Factors using XGBoost and Bagging Ensemble Technique

I Irmawati; Eka Herdit Juningsih; Y Yanto

doi:10.37034/medinftech.v2i1.27

Authors

I Irmawati Universitas Bina Sarana Informatika
Eka Herdit Juningsih Universitas Bina Sarana Informatika
Y Yanto Universitas Bina Sarana Informatika

Keywords:

Bagging, Ensemble Technique, Prediction, Osteoporosis Risk Assessment, XGBoost

Abstract

This study presents a predictive modeling framework for osteoporosis risk assessment that uses two ensemble techniques, XGBoost and Bagging. The dataset covers health factors that influence osteoporosis development, including demographic details, lifestyle choices, medical history, and bone health indicators, and the aim is to identify individuals at risk accurately. It consists of 1958 samples, evenly distributed between osteoporosis-positive and osteoporosis-negative cases. Features and labels are separated, and the data are then split into training and testing sets. XGBoost, a gradient boosting algorithm, serves as the base estimator within a Bagging ensemble to improve predictive accuracy and generalization. The model is trained on the training set and evaluated with cross-validation to check robustness and guard against overfitting. The classification report shows an overall accuracy of 88% on the test set, and the precision and recall scores indicate strong predictive capability, particularly in correctly identifying osteoporosis-positive cases. Using XGBoost inside a Bagging ensemble combines the strengths of both algorithms to improve model performance. This research contributes to osteoporosis management and prevention strategies by providing a tool for early risk assessment. Combining machine learning techniques with comprehensive health data supports personalized healthcare, enabling targeted interventions and better resource allocation. Ultimately, the study aims to improve patient outcomes and reduce the burden of osteoporosis-related morbidity and mortality.
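The pipeline outlined in the abstract (feature/label separation, train/test split, XGBoost as the base estimator of a Bagging ensemble, cross-validation, classification report) might be sketched roughly as follows. This is an illustrative sketch, not the authors' code. The synthetic data from `make_classification` only stands in for the 1958-sample dataset, and the feature count, split ratio, fold count, and every hyperparameter are assumptions that the abstract does not state. If the `xgboost` package is not installed, the sketch falls back to scikit-learn's `GradientBoostingClassifier`, which is a different gradient boosting implementation.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split

try:
    from xgboost import XGBClassifier

    def make_base():
        # Assumed hyperparameters; the abstract does not report any.
        return XGBClassifier(n_estimators=50, max_depth=3, learning_rate=0.1)
except ImportError:
    def make_base():
        # Fallback only: scikit-learn gradient boosting, not XGBoost.
        return GradientBoostingClassifier(n_estimators=50, max_depth=3)

# Synthetic stand-in for the real data: 1958 samples, roughly balanced
# classes. The number of features (15) is an assumption.
X, y = make_classification(
    n_samples=1958, n_features=15, n_informative=6,
    weights=[0.5, 0.5], random_state=42,
)

# Features (X) and labels (y) are already separate; split into train/test.
# The 80/20 stratified split is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Bagging ensemble with a gradient boosting model as the base estimator.
# The base estimator is passed positionally because its keyword name
# differs across scikit-learn versions.
model = BaggingClassifier(make_base(), n_estimators=10, random_state=42)

# 5-fold cross-validation on the training set (fold count is an assumption).
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))

# Fit on the full training set and evaluate on the held-out test set.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Test accuracy: %.3f" % test_accuracy)
print(classification_report(y_test, y_pred))
```

Each bagged member is trained on a bootstrap sample of the training set, so the ensemble averages over several boosted models. The scores printed here describe the synthetic data only and say nothing about the 88% accuracy reported in the study.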
