A Machine Learning Framework for Early Detection of Type 2 Diabetes through Multimodal Clinical and Lifestyle Indicators
Abstract
Early identification of Type 2 Diabetes Mellitus (T2DM) remains a major obstacle to both prevention and management of the disease. This study proposes a large-scale machine learning architecture that integrates multiple data sources, namely clinical indicators (e.g., glucose, BMI, blood pressure) and lifestyle or symptom-based features (e.g., polyuria, polydipsia, obesity), to enable accurate and interpretable early-stage diabetes prediction. The framework applies three complementary fusion strategies to two publicly accessible datasets: early fusion (feature-level integration), late fusion (probability-level ensemble learning), and intermediate fusion (latent representation via principal component analysis). Comparative evaluation with logistic regression and XGBoost models showed that multimodal fusion outperformed single-modality models, with ROC-AUC values of 0.991 and 1.000 on the lifestyle dataset and 0.813 and 0.826 on the clinical dataset, respectively. Calibration and decision-curve analyses confirmed the models’ robustness and clinical utility, while SHAP and permutation-based feature importance provided interpretability at both global and local levels. These results suggest that AI-driven multimodal integration is a cost-effective and scalable approach for early T2DM screening, especially in resource-limited settings where both clinical and behavioral data are available.
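As an illustration of the three fusion strategies named in the abstract, the following Python sketch shows how early, late, and intermediate fusion differ in structure. It is not the authors' implementation. The data are synthetic, a single cohort is assumed to carry both feature groups, a small NumPy logistic regression stands in for the paper's logistic regression and XGBoost models, and all names (X_clin, X_life, fit_logreg, fit_pca) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): one cohort with both modalities.
n = 400
y = rng.integers(0, 2, size=n)
X_clin = rng.normal(size=(n, 3)) + 0.8 * y[:, None]  # stand-in clinical features
X_life = rng.normal(size=(n, 3)) + 0.8 * y[:, None]  # stand-in lifestyle features

train, test = slice(0, 300), slice(300, None)


def standardize(X):
    """Scale features using statistics from the training rows only."""
    mu, sd = X[train].mean(axis=0), X[train].std(axis=0)
    return (X - mu) / sd


def fit_logreg(X, y, lr=0.1, steps=500):
    """Fit logistic regression by batch gradient descent; return (weights, bias)."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def predict_proba(model, X):
    """Return the predicted probability of the positive class."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def fit_pca(X, k):
    """Return the mean and top-k principal directions of X (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k].T


def accuracy(p, y_true):
    return float(np.mean((p >= 0.5) == y_true))


X_clin, X_life = standardize(X_clin), standardize(X_life)

# Early fusion: concatenate both feature groups, then train one model.
X_early = np.hstack([X_clin, X_life])
p_early = predict_proba(fit_logreg(X_early[train], y[train]), X_early[test])

# Late fusion: train one model per modality, then average their probabilities.
m_clin = fit_logreg(X_clin[train], y[train])
m_life = fit_logreg(X_life[train], y[train])
p_late = 0.5 * (predict_proba(m_clin, X_clin[test])
                + predict_proba(m_life, X_life[test]))

# Intermediate fusion: learn a shared latent space with PCA (fit on the
# training rows), then train the classifier on the latent representation.
mean, components = fit_pca(X_early[train], k=3)
Z = (X_early - mean) @ components
p_inter = predict_proba(fit_logreg(Z[train], y[train]), Z[test])

for name, p in [("early", p_early), ("late", p_late), ("intermediate", p_inter)]:
    print(f"{name} fusion accuracy: {accuracy(p, y[test]):.3f}")
```

The scaling statistics and PCA directions are estimated on the training rows only, so the held-out rows do not influence the learned representation. The printed accuracies describe the synthetic data and say nothing about the results reported in the abstract.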
How to Cite This Article
Ahmed Younus Ahmed, Hussein Ali Shaker (2026). A Machine Learning Framework for Early Detection of Type 2 Diabetes through Multimodal Clinical and Lifestyle Indicators. International Journal of Future Engineering Innovations (IJFEI), 3(2), 44-50. DOI: https://doi.org/10.54660/IJFEI.2026.3.2.44-50