Ensemble Learning for Multiple Sclerosis Disability Estimation Using Brain Structural Connectivity

Barile, Berardino; Marzullo, Aldo; Stamile, Claudio; Durand-Dubief, Françoise; Sappey-Marinier, Dominique

doi:10.1089/brain.2020.1003

Background: Multiple sclerosis (MS) is an autoimmune inflammatory disease of the central nervous system characterized by demyelination and neurodegeneration. It leads to different clinical courses and degrees of disability that the neurologist needs to anticipate in order to personalize therapy. Recently, machine learning (ML) techniques have reached a high level of performance in brain disease diagnosis and/or prognosis, but the decision process of a trained ML system is typically nontransparent. Using brain structural connectivity data, a fully automatic ensemble learning model, augmented with an interpretable model, is proposed for the estimation of MS patients' disability, measured by the Expanded Disability Status Scale (EDSS).

Materials and Methods: An ensemble of four boosting-based models (GBM, XGBoost, CatBoost, and LightBoost), organized following a stacked generalization scheme, was developed using diffusion tensor imaging (DTI)-based structural connectivity data. In addition, an interpretable model based on conditional logistic regression was developed to explain the best performances in terms of white matter (WM) links for three classes of EDSS (low, medium, and high).

Results: The ensemble model reached an excellent level of performance (root mean squared error of 0.92 ± 0.28) compared with single models, and EDSS estimation was better with DTI-based structural connectivity data than with conventional magnetic resonance imaging measures associated with patient data (age, gender, and disease duration). Used for interpretation of the estimation process, the counterfactual method showed the importance of certain brain networks, corresponding mainly to left hemisphere WM links, connecting the left superior temporal with the left posterior cingulate and the right precuneus gray matter regions, and the interhemispheric WM links constituting the corpus callosum.
Also, estimation accuracy was better for the high disability class.

Conclusion: The combination of advanced ML models and sensitive techniques such as DTI-based structural connectivity proved useful for estimating MS patients' disability and for identifying the most important brain WM networks involved in disability.

Impact statement: An ensemble of "boosting" machine learning (ML) models outperformed single models in estimating disability in multiple sclerosis. Diffusion tensor imaging (DTI)-based structural connectivity led to better performance than conventional magnetic resonance imaging. An interpretable model, based on counterfactual perturbation, highlighted the most relevant white matter fiber links for disability estimation. These findings demonstrated the clinical interest of combining DTI, graph modeling, and ML techniques.