Life expectancy and its determinants: A machine learning analysis with implications for policy interventions

Milena Lopreite; Michelangelo Misuraca; Michelangelo Puliga
2026-01-01

Abstract

Life expectancy at birth is a key indicator of national health and socioeconomic development. Identifying the main factors associated with its variation is essential for policymakers seeking to improve population well-being. This study employs a range of machine learning techniques to analyse the relationship between life expectancy and a set of macro-level indicators across 49 OECD countries over time. Multiple predictive models are implemented and compared, and the best-performing specification is used to assess the relative importance of the considered variables. In addition, controlled perturbations of key variables are introduced to examine the model-implied response of life expectancy predictions. The results consistently identify GDP per capita, healthcare expenditure, and PM2.5 concentrations as the variables most strongly associated with life expectancy. In particular, lower levels of air pollution are systematically linked to higher predicted longevity, consistent with the potential public health relevance of environmental improvements. Overall, the findings provide a comparative and data-driven assessment of the factors most closely related to life expectancy, suggesting that aligning environmental policies with economic and health investments can significantly improve population well-being.
2026
Longevity, Socioeconomic factors, Air quality, Machine learning model
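
The controlled-perturbation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function `perturbation_response`, the stand-in `toy_model`, its coefficients, and the sample values are all hypothetical and chosen only to show the idea of scaling one variable while holding the others fixed, then reading off the change in predicted life expectancy.

```python
from statistics import mean
from typing import Callable, Dict, List

Row = Dict[str, float]


def perturbation_response(
    predict: Callable[[Row], float],
    rows: List[Row],
    feature: str,
    factor: float,
) -> float:
    """Mean change in prediction when `feature` is scaled by `factor`,
    with every other variable held at its observed value."""
    deltas = []
    for row in rows:
        perturbed = dict(row)  # copy so the input data is left untouched
        perturbed[feature] = row[feature] * factor
        deltas.append(predict(perturbed) - predict(row))
    return mean(deltas)


# Hypothetical stand-in for a fitted model; the coefficients are made up.
def toy_model(row: Row) -> float:
    return 70.0 + 0.0001 * row["gdp_per_capita"] - 0.25 * row["pm25"]


# Made-up observations, not data from the study.
sample = [
    {"gdp_per_capita": 30000.0, "pm25": 20.0},
    {"gdp_per_capita": 45000.0, "pm25": 10.0},
]

# Model-implied effect of a 10% reduction in PM2.5 concentration.
response = perturbation_response(toy_model, sample, "pm25", 0.9)
```

With these invented coefficients the response is positive, meaning that lower PM2.5 raises the toy model's predicted life expectancy. Any fitted model exposing a prediction function could be passed in place of `toy_model`; the result describes how the model responds and is not a causal estimate.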

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/406837