AI-Driven Multimodal Fusion of Neuroimaging and Speech Analysis for Early Detection of Alzheimer’s Disease Biomarkers

Authors

  • Sarita Sushil Gaikwad
  • Nilam Ajay Jadhav
  • Shital Gajbhiye
  • Bhavana Badhane
  • Avani Ray
  • Hemlata Suresh Gaikwad
  • Trupti Tukaram Tekale
  • Tejaswini Hanumant Gavhane

DOI:

https://doi.org/10.52783/jns.v14.3905

Keywords:

NA

Abstract

Alzheimer’s Disease (AD) is a progressively debilitating neurodegenerative disorder that is frequently diagnosed only at advanced stages owing to the lack of reliable, efficient early-stage biomarkers. Existing diagnostic techniques, including cerebrospinal fluid (CSF) analysis and positron emission tomography (PET), are invasive, costly, and inaccessible in low-resource environments. Although structural and functional neuroimaging (MRI/fMRI) and speech analysis have each individually shown promise in the detection of AD, their potential to work synergistically remains largely unexplored. To our knowledge, this study is the first to present a hybrid artificial intelligence (AI) framework that combines convolutional neural networks (CNNs) for neuroimaging analysis with transformer-based natural language processing (NLP) for speech pattern evaluation to detect early AD biomarkers with high sensitivity and specificity.

MRI/fMRI scans were extracted from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, and we collected a novel speech dataset comprising verbal fluency, picture description, and spontaneous speech from AD patients, mild cognitive impairment (MCI) subjects, and healthy controls. Our multimodal fusion model uses a 3D CNN feature extractor for neuroimaging data and a fine-tuned BERT transformer for linguistic and paralinguistic features of speech (e.g., semantic coherence, syntactic complexity, and pause frequency). An attention-based fusion layer assigns dynamic weights to the contributions of the imaging and speech modalities, optimizing biomarker detection.
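The attention-based fusion step described above can be sketched in miniature as follows. This is an illustrative toy, not the authors' implementation: the embedding size, the scoring vectors `w_img`/`w_speech`, and the function names are all hypothetical stand-ins for the learned parameters of the actual model, and small random vectors stand in for the 3D CNN and BERT outputs.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fusion(img_feat, speech_feat, w_img, w_speech):
    """Score each modality embedding, normalize the scores into
    attention weights, and return the weighted sum of the two
    embeddings plus the per-modality weights."""
    scores = np.array([img_feat @ w_img, speech_feat @ w_speech])
    weights = softmax(scores)          # dynamic modality weights, sum to 1
    fused = weights[0] * img_feat + weights[1] * speech_feat
    return fused, weights

# Toy 4-d embeddings standing in for the CNN / BERT feature vectors.
rng = np.random.default_rng(0)
img = rng.standard_normal(4)
speech = rng.standard_normal(4)
w_i = rng.standard_normal(4)           # hypothetical learned scoring vectors
w_s = rng.standard_normal(4)

fused, weights = attention_fusion(img, speech, w_i, w_s)
```

Because the weights are recomputed per sample, the model can lean on imaging when the scan is informative (e.g., clear hippocampal atrophy) and on speech otherwise, which is the intuition behind dynamic modality weighting.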

The experimental results showed that our model could accurately differentiate early AD from MCI with an accuracy of 92.3% (AUC: 0.96), a marked improvement in classification performance over the unimodal approaches (MRI-only: 82.1% accuracy; speech-only: 76.5% accuracy). The model identified hippocampal atrophy and lexical repetition as the most discriminative features. Longitudinal validation in a 3-year follow-up cohort showed a strong correlation between AI-predicted risk scores and clinical progression, as measured by decline in Mini-Mental State Examination (MMSE) scores (r=0.85, p<0.001).

This study contributes:

  1. A novel multimodal AI framework for early AD detection using non-invasive, cost-effective data.
  2. Empirical validation of speech and neuroimaging fusion, surpassing unimodal benchmarks.
  3. Clinical interpretability through saliency maps and attention weights, aligning with known AD pathology.



Published

2025-04-17

How to Cite

1. Sushil Gaikwad S, Ajay Jadhav N, Gajbhiye S, Badhane B, Ray A, Suresh Gaikwad H, Tukaram Tekale T, Hanumant Gavhane T. AI-Driven Multimodal Fusion of Neuroimaging and Speech Analysis for Early Detection of Alzheimer’s Disease Biomarkers. J Neonatal Surg [Internet]. 2025 Apr. 17 [cited 2025 May 15];14(15S):1535-42. Available from: https://jneonatalsurg.com/index.php/jns/article/view/3905