Enhancing Healthcare Outcomes Through Big Data Analytics Using Advanced Data Mining and Classification Techniques

Authors

  • Sreevidya N R
  • L. Sudha

Keywords:

Big Data Analytics, Healthcare, Random Forest, Patient Readmission, Classification, Predictive Modeling.

Abstract


Big data analytics is transforming healthcare by applying data mining and classification techniques to large volumes of medical data, which supports better decision-making, diagnosis, and patient outcomes. However, existing methods often struggle with accuracy, scalability, and high-dimensional healthcare datasets, which limits predictive performance and leads to inefficient clinical interventions. To address these challenges, this study proposes a framework called Random Forest Classification-based Predicting Patient Readmission Risk (RFC-PPRR). The framework uses the ensemble learning capability of Random Forest to process large electronic health record (EHR) datasets and predict the risk of patient readmission within 30 days of discharge. It enables hospitals to identify high-risk patients early, so that preventive measures can be put in place and resources allocated more effectively. In an experimental evaluation on real-world healthcare datasets, the framework showed higher prediction accuracy, fewer false positives, and better model interpretability than traditional classification methods. These results suggest that RFC-PPRR can help reduce avoidable readmissions and improve patient care strategies.
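
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind a Random Forest readmission classifier, not the RFC-PPRR framework itself. It assumes hypothetical features (prior admissions, length of stay, age), synthetic labels generated by a made-up rule, and depth-1 trees (decision stumps) in place of full decision trees to keep the example short. It illustrates bootstrap sampling, random feature subsets, vote-based risk scores, and the accuracy and false positive rate mentioned above.

```python
import random
from collections import Counter


def make_synthetic_patients(n, rng):
    """Hypothetical toy records: (prior_admissions, length_of_stay, age) -> readmitted."""
    rows, labels = [], []
    for _ in range(n):
        prior, stay, age = rng.randint(0, 5), rng.randint(1, 14), rng.randint(20, 90)
        # Assumed rule for the toy data only; not taken from the paper.
        score = 0.5 * prior + 0.15 * stay + rng.gauss(0, 0.5)
        rows.append((prior, stay, age))
        labels.append(1 if score > 2.3 else 0)
    return rows, labels


def gini(labels):
    """Gini impurity of a list of binary labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels, default):
    """Most common label, or the default when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def fit_stump(rows, labels, features):
    """Choose the (feature, threshold) split with the lowest weighted Gini impurity."""
    default = majority(labels, 0)
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            cost = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or cost < best[0]:
                best = (cost, f, t, majority(left, default), majority(right, default))
    return best[1:]  # (feature, threshold, left_label, right_label)


def fit_forest(rows, labels, n_trees, rng):
    """Train each stump on a bootstrap sample using a random subset of features."""
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), 2)        # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict_risk(forest, row):
    """Risk score: fraction of trees that vote 'readmitted'."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return sum(votes) / len(votes)


def evaluate(forest, rows, labels, threshold=0.5):
    """Return (accuracy, false positive rate) at the given risk threshold."""
    tp = fp = tn = fn = 0
    for row, y in zip(rows, labels):
        pred = 1 if predict_risk(forest, row) >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    accuracy = (tp + tn) / len(labels)
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fpr


rng = random.Random(0)
train_rows, train_labels = make_synthetic_patients(400, rng)
test_rows, test_labels = make_synthetic_patients(200, rng)
forest = fit_forest(train_rows, train_labels, n_trees=25, rng=rng)
accuracy, false_positive_rate = evaluate(forest, test_rows, test_labels)
print(f"accuracy={accuracy:.2f} false_positive_rate={false_positive_rate:.2f}")
```

In practice a library implementation with full-depth trees would normally be used on real EHR data; the stumps here only keep the bagging and voting logic easy to follow.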

Published

2025-10-21

How to Cite

N R S, Sudha L. Enhancing Healthcare Outcomes Through Big Data Analytics Using Advanced Data Mining and Classification Techniques. J Neonatal Surg [Internet]. 2025 Oct. 21 [cited 2026 Apr. 14];14(32S):9012-27. Available from: https://jneonatalsurg.com/index.php/jns/article/view/9387