Enhancing Healthcare Outcomes Through Big Data Analytics Using Advanced Data Mining and Classification Techniques
Keywords:
Big Data Analytics, Healthcare, Random Forest, Patient Readmission, Classification, Predictive Modeling.
Abstract
Big data analytics is transforming the healthcare industry by enabling deeper insights through data mining and classification techniques. By analyzing large volumes of medical data, these technologies improve decision-making, diagnosis, and overall healthcare outcomes. However, existing methods often struggle with accuracy, scalability, and the handling of high-dimensional healthcare datasets, which limits predictive performance and leads to inefficient clinical interventions. To address these challenges, this study proposes a framework called Random Forest Classification-based Predicting Patient Readmission Risk (RFC-PPRR). The framework uses the ensemble learning capability of Random Forest to process large electronic health record (EHR) datasets and predict the risk that a patient is readmitted within 30 days of discharge. The proposed method enables hospitals to identify high-risk patients early, which supports preventive measures and more effective resource allocation. An experimental evaluation on real-world healthcare datasets demonstrated improved prediction accuracy, fewer false positives, and enhanced model interpretability compared to traditional classification methods. These outcomes suggest that RFC-PPRR can contribute significantly to reducing avoidable readmissions and optimizing patient care strategies.
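The abstract does not give implementation details, so the following is only a minimal sketch of how a Random Forest readmission classifier of this kind might be set up, assuming scikit-learn and NumPy are available. The data are synthetic, and the feature names, the label rule, and the hyperparameters are all assumptions made for illustration; none of them are taken from the RFC-PPRR study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for EHR features; the column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
FEATURES = ["age", "length_of_stay", "prior_admissions", "num_medications"]
X = np.column_stack([
    rng.integers(18, 95, n),   # age in years
    rng.integers(1, 30, n),    # length of stay in days
    rng.poisson(1.0, n),       # admissions in the previous year
    rng.integers(0, 25, n),    # medications at discharge
]).astype(float)

# Assumed label rule (illustration only): readmission risk rises with
# prior admissions and length of stay.
logit = -2.5 + 0.8 * X[:, 2] + 0.08 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(int)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Ensemble of decision trees; class_weight="balanced" reweights classes
# inversely to their frequency in the training data.
model = RandomForestClassifier(
    n_estimators=200,
    min_samples_leaf=5,
    class_weight="balanced",
    random_state=0,
)
model.fit(X_train, y_train)

# Predicted probability of the positive class (readmitted within 30 days).
risk = model.predict_proba(X_test)[:, 1]
pred = (risk >= 0.5).astype(int)

# Accuracy and false positive rate from the confusion matrix.
tn, fp, fn, tp = confusion_matrix(y_test, pred, labels=[0, 1]).ravel()
accuracy = (tp + tn) / len(y_test)
false_positive_rate = fp / (fp + tn)
auc = roc_auc_score(y_test, risk)

# Impurity-based feature importances as a rough interpretability aid.
importances = dict(zip(FEATURES, model.feature_importances_))

print(f"accuracy={accuracy:.3f} fpr={false_positive_rate:.3f} auc={auc:.3f}")
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

In a sketch like this, the 0.5 threshold on `risk` is arbitrary; a hospital would more plausibly choose it from the trade-off between missed readmissions and false alarms. The feature importances give only a coarse view of which inputs drive the predictions.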
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.