AI-Driven Multimodal Fusion of Neuroimaging and Speech Analysis for Early Detection of Alzheimer’s Disease Biomarkers

DOI: https://doi.org/10.52783/jns.v14.3905

Abstract
Alzheimer’s Disease (AD) is a progressive, debilitating neurodegenerative disorder that is frequently diagnosed only at advanced stages due to the lack of reliable, efficient early-stage biomarkers. Existing diagnostic techniques, including cerebrospinal fluid (CSF) analysis and positron emission tomography (PET), are invasive, costly, and inaccessible in low-resource settings. Although structural and functional neuroimaging (MRI/fMRI) and speech analysis have each individually shown promise in detecting AD, their synergistic potential remains largely unexplored. To our knowledge, this study is the first to present a hybrid artificial intelligence (AI) framework that combines convolutional neural networks (CNNs) for neuroimaging analysis with transformer-based natural language processing (NLP) for speech pattern evaluation, detecting early AD biomarkers with high sensitivity and specificity.
MRI/fMRI scans were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, and we collected a novel speech dataset comprising verbal fluency, picture description, and spontaneous speech tasks from AD patients, mild cognitive impairment (MCI) subjects, and healthy controls. Our multimodal fusion model uses a 3D CNN feature extractor for neuroimaging data and a fine-tuned BERT transformer for linguistic and paralinguistic features of speech (e.g., semantic coherence, syntactic complexity, and pause frequency). An attention-based fusion layer assigns dynamic weights to the contributions of the imaging and speech modalities, optimizing biomarker detection.
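The attention-based fusion step described above can be illustrated with a minimal NumPy sketch. This is an assumed, simplified formulation (scalar attention scores from learned projection vectors, softmax-normalized into per-modality weights); the paper's actual layer, its dimensions, and its parameterization are not specified here, and the variable names (`img_feat`, `speech_feat`, `w_img`, `w_speech`) are hypothetical.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fusion(img_feat, speech_feat, w_img, w_speech):
    """Score each modality embedding with a learned projection vector,
    softmax the scores into dynamic weights, and return the weighted sum."""
    scores = np.array([img_feat @ w_img, speech_feat @ w_speech])
    weights = softmax(scores)          # dynamic modality weights, sum to 1
    fused = weights[0] * img_feat + weights[1] * speech_feat
    return fused, weights

rng = np.random.default_rng(0)
d = 8
img = rng.normal(size=d)      # stand-in for a 3D-CNN imaging embedding
speech = rng.normal(size=d)   # stand-in for a BERT speech embedding
fused, w = attention_fusion(img, speech, rng.normal(size=d), rng.normal(size=d))
print(fused.shape, w)
```

In a trained model, `w_img` and `w_speech` would be learned parameters, so the relative weighting of imaging versus speech can shift per subject rather than being fixed a priori.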
The experimental results showed that our model accurately differentiated early AD from MCI with an accuracy of 92.3% (AUC: 0.96), a marked improvement in classification performance over unimodal approaches (MRI-only: 82.1% accuracy; speech-only: 76.5% accuracy). The model identified hippocampal atrophy and lexical repetition as the most discriminative features. Longitudinal validation in a 3-year follow-up cohort showed a strong correlation between AI-predicted risk scores and clinical progression, measured as decline in Mini-Mental State Examination (MMSE) scores (r=0.85, p<0.001).
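The longitudinal validation statistic above is a Pearson correlation between predicted risk and MMSE decline. As a quick reference for how that coefficient is computed, here is a self-contained sketch on toy data (the `risk` and `decline` values below are illustrative placeholders, not the study's data):

```python
import numpy as np

def pearson_r(x, y):
    # Pearson correlation: covariance of x and y over the product
    # of their standard deviations, computed from centered vectors
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return (xm @ ym) / np.sqrt((xm @ xm) * (ym @ ym))

# toy example: predicted risk scores vs. MMSE point decline at follow-up
risk = [0.2, 0.4, 0.5, 0.7, 0.9]
decline = [1.0, 2.1, 2.4, 3.9, 4.8]
print(round(pearson_r(risk, decline), 3))  # ≈ 0.995 on this toy data
```

An r of 0.85 on real follow-up data, as reported, indicates that subjects assigned higher risk scores tended to show steeper MMSE decline.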
This study contributes:
- A novel multimodal AI framework for early AD detection using non-invasive, cost-effective data.
- Empirical validation of speech and neuroimaging fusion, surpassing unimodal benchmarks.
- Clinical interpretability through saliency maps and attention weights, aligning with known AD pathology.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.