Adaptive Correlation-Based Gene Expression Analysis Using Enhanced Ensemble Biclustering Framework

Authors

  • Manish Kumar Bhardwaj
  • Atul D. Newase

DOI:

https://doi.org/10.63682/jns.v13i1.9017

Keywords:

Gene expression, biclustering, correlation-based clustering, ensemble model, bioinformatics, modified Bimax, data mining

Abstract

The analysis of gene expression data plays a pivotal role in understanding complex biological functions, disease mechanisms, and gene regulation patterns. Biclustering methods such as Bimax have improved the identification of local gene-condition patterns, but they remain limited in computational complexity and biological relevance. This study proposes an Adaptive Ensemble Biclustering Framework (AEBF) that integrates multiple correlation-based biclustering algorithms, including an improved version of the modified Bimax algorithm, to enhance gene expression pattern discovery. The framework incorporates dynamic z-score normalization, adaptive outlier detection, and ensemble scoring based on bicluster size, coherence, and biological enrichment. Experimental validation on benchmark microarray datasets shows that AEBF improves bicluster consistency and biological relevance and is significantly more computationally efficient than standalone biclustering methods.
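
The abstract names the scoring ingredients (z-score normalization, and an ensemble score built from size, coherence, and biological enrichment) without giving formulas. The following is a minimal illustrative sketch, not the authors' implementation. It assumes per-gene z-scoring, uses the mean squared residue of Cheng and Church (2000) as a stand-in coherence measure, and combines the three terms with equal weights. The function names, the weights, and the externally supplied `enrichment` value in [0, 1] are all hypothetical.

```python
import numpy as np

def zscore_rows(X, eps=1e-12):
    """Standardize each gene (row) to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, keepdims=True)
    # Constant rows have sd == 0; the floor avoids division by zero.
    return (X - mu) / np.maximum(sd, eps)

def mean_squared_residue(B):
    """Mean squared residue of a submatrix (0 for a perfectly additive pattern)."""
    B = np.asarray(B, dtype=float)
    row_mean = B.mean(axis=1, keepdims=True)
    col_mean = B.mean(axis=0, keepdims=True)
    resid = B - row_mean - col_mean + B.mean()
    return float((resid ** 2).mean())

def ensemble_score(X, rows, cols, enrichment, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Hypothetical weighted score over size, coherence, and enrichment.

    `enrichment` is assumed to be precomputed elsewhere and scaled to [0, 1];
    the equal weights are an assumption, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    sub = X[np.ix_(rows, cols)]
    size = sub.size / X.size                      # fraction of the matrix covered
    coherence = 1.0 / (1.0 + mean_squared_residue(sub))  # maps MSR to (0, 1]
    w_size, w_coh, w_enr = weights
    return w_size * size + w_coh * coherence + w_enr * enrichment
```

In an ensemble setting, candidate biclusters produced by different algorithms could be ranked by such a score on the normalized matrix, for example `ensemble_score(zscore_rows(X), rows, cols, enrichment)`. How AEBF actually weights and normalizes the three terms is not stated in the abstract.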

Published

2025-08-27

How to Cite

Bhardwaj MK, Newase AD. Adaptive Correlation-Based Gene Expression Analysis Using Enhanced Ensemble Biclustering Framework. J Neonatal Surg [Internet]. 2025 Aug 27 [cited 2025 Oct 12];13(1):985-91. Available from: https://jneonatalsurg.com/index.php/jns/article/view/9017

Section

Original Article