Adaptive Correlation-Based Gene Expression Analysis Using Enhanced Ensemble Biclustering Framework
DOI:
https://doi.org/10.63682/jns.v13i1.9017
Keywords:
Gene expression, biclustering, correlation-based clustering, ensemble model, bioinformatics, modified Bimax, data mining
Abstract
The analysis of gene expression data is central to understanding complex biological functions, disease mechanisms, and gene regulation patterns. Biclustering methods such as Bimax have improved the identification of local gene-condition patterns, but they remain limited by computational complexity and by the biological relevance of the biclusters they report. This study proposes an Adaptive Ensemble Biclustering Framework (AEBF) that integrates multiple correlation-based biclustering algorithms, including an improved version of the modified Bimax algorithm, to enhance gene expression pattern discovery. The framework incorporates dynamic z-score normalization, adaptive outlier detection, and ensemble scoring based on bicluster size, coherence, and biological enrichment. Experimental validation on benchmark microarray datasets indicates that AEBF improves bicluster consistency and biological relevance and is also significantly more computationally efficient than standalone biclustering methods.
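The abstract names per-gene z-score normalization and an ensemble score built from size, coherence, and biological enrichment, but it does not give formulas. The following is a minimal sketch under stated assumptions, not the authors' implementation: coherence is approximated with the mean squared residue (a common biclustering coherence measure), enrichment is taken as a precomputed value in [0, 1], and the equal weights and all function names (`zscore_rows`, `mean_squared_residue`, `ensemble_score`) are hypothetical. Adaptive outlier detection is not covered here.

```python
from statistics import mean, pstdev


def zscore_rows(matrix):
    """Standardize each gene (row) to zero mean and unit variance."""
    normalized = []
    for row in matrix:
        mu = mean(row)
        sigma = pstdev(row)
        if sigma == 0:
            # A constant row carries no signal; map it to zeros
            # instead of dividing by zero.
            normalized.append([0.0] * len(row))
        else:
            normalized.append([(x - mu) / sigma for x in row])
    return normalized


def mean_squared_residue(sub):
    """Mean squared residue of a submatrix; 0 for a perfectly additive pattern."""
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [mean(r) for r in sub]
    col_means = [mean(r[j] for r in sub) for j in range(n_cols)]
    overall = mean(row_means)
    total = sum(
        (sub[i][j] - row_means[i] - col_means[j] + overall) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    )
    return total / (n_rows * n_cols)


def ensemble_score(matrix, rows, cols, enrichment, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of size, coherence, and enrichment (weights are assumed)."""
    sub = [[matrix[i][j] for j in cols] for i in rows]
    # Size: fraction of the full matrix covered by the bicluster.
    size = (len(rows) * len(cols)) / (len(matrix) * len(matrix[0]))
    # Coherence: maps a residue of 0 to 1.0 and decays as the residue grows.
    coherence = 1.0 / (1.0 + mean_squared_residue(sub))
    w_size, w_coh, w_enr = weights
    return w_size * size + w_coh * coherence + w_enr * enrichment


# Toy expression matrix: 3 genes (rows) by 3 conditions (columns).
expression = [[1, 2, 3], [2, 3, 4], [5, 5, 5]]
normalized = zscore_rows(expression)
score = ensemble_score(expression, rows=[0, 1], cols=[0, 1, 2], enrichment=0.5)
```

In this sketch, candidate biclusters produced by different algorithms could be ranked by `ensemble_score`; how AEBF actually weights or combines the three criteria is not stated in the abstract.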
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material for any purpose, even commercially.
Terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.