Prediction and Segmentation of Heart Disease Boosting-Based Machine Learning Algorithms

Authors

  • Ashok Kumar
  • Deepika Dhamija
  • Vikrant Chole
  • Jhankar Moolchandani
  • Rahul Kumar
  • Umang Garg

DOI:

https://doi.org/10.52783/jns.v14.2043

Keywords:

Lung Cancer prediction, Gradient Boost, XGBoost, AdaBoost, CatBoost, Cross-validation, Data Balancing

Abstract

Recent advances in imaging and sequencing technologies have driven significant progress in clinical research on lung cancer. However, the amount of information that a human can properly digest and use is limited. Lung cancer has been characterized in detail by integrating and analyzing this vast and complex body of data from a variety of perspectives, and machine learning-based technologies are essential to this process. This study tests multiple boosting models on a lung cancer dataset for lung cancer prediction. The aim of this work is to determine which cross-validation methods and boosting algorithms give the best performance in lung disease prediction. The effectiveness of the method is evaluated using several performance metrics: recall, accuracy, precision, F-score, ROC AUC score, and cross-validation score. The well-known Lung Cancer Dataset is used to test a number of boosting-based machine learning classification techniques, namely Gradient Boost (GB), Extreme Gradient Boosting (XGBoost, XGB), Adaptive Boosting (AdaBoost), Categorical Boosting (CatBoost), and Light Gradient Boosting Machine (LightGBM, LGBM), under several K-fold cross-validation techniques. The impact of ADASYN as a data balancing approach on the precision of lung cancer prediction is investigated through hybrid combinations of cross-validation and boosting procedures. This study presents a hybrid approach that could accurately predict the incidence of lung cancer. It found that combining cross-validation with data balancing and boosting-based ML models produced more accurate lung cancer predictions.
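The evaluation loop described in the abstract (K-fold cross-validation scored with accuracy, precision, recall, and F-score) can be illustrated with a minimal standard-library Python sketch. This is not the authors' implementation: the function names `stratified_kfold`, `binary_metrics`, `cross_validate`, and `majority_baseline` are hypothetical, stratified splitting is assumed as one possible K-fold variant, and a trivial majority-class predictor stands in for a boosted model.

```python
import random
from collections import Counter, defaultdict


def stratified_kfold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs, spreading each class evenly over k folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        # Deal the indices of each class round-robin across the folds.
        for j, i in enumerate(idxs):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from confusion counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def majority_baseline(train_x, train_y, test_x):
    """Hypothetical stand-in for a boosted model: always predict the majority class."""
    majority = Counter(train_y).most_common(1)[0][0]
    return [majority for _ in test_x]


def cross_validate(features, labels, fit_predict, k=5, seed=0):
    """Score fit_predict on each fold and return the list of per-fold metric dicts."""
    scores = []
    for train, test in stratified_kfold(labels, k, seed):
        preds = fit_predict(
            [features[i] for i in train],
            [labels[i] for i in train],
            [features[i] for i in test],
        )
        scores.append(binary_metrics([labels[i] for i in test], preds))
    return scores
```

Any classifier that follows the `fit_predict(train_x, train_y, test_x)` convention assumed here could replace `majority_baseline`. If a balancing step such as ADASYN is added, applying it to the training indices of each fold only would keep synthetic samples out of the test fold.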

Published

2025-03-11

How to Cite

1.
Kumar A, Dhamija D, Chole V, Moolchandani J, Kumar R, Garg U. Prediction and Segmentation of Heart Disease Boosting-Based Machine Learning Algorithms. J Neonatal Surg [Internet]. 2025Mar.11 [cited 2025Mar.20];14(5S):324-3. Available from: https://jneonatalsurg.com/index.php/jns/article/view/2043