Decision-making in healthcare relies on understanding patients' past and current health states to predict and, ultimately, change their future course. In this study, the authors modified the GPT (generative pretrained transformer) architecture to model the progression and competing nature of human diseases.
The resulting model, Delphi-2M, was trained on data from 0.4 million UK Biobank participants and validated on 1.9 million Danish individuals with no change in parameters. Delphi-2M predicts the rates of more than 1,000 diseases conditional on each individual's past disease history, with accuracy comparable to existing single-disease models.
Its generative nature enables sampling of synthetic future health trajectories, providing meaningful estimates of potential disease burden for up to 20 years. Explainable AI methods reveal clusters of co-morbidities and time-dependent consequences on future health, while also highlighting biases learnt from training data.
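To make the idea of sampling a synthetic trajectory concrete, the sketch below rolls a toy disease history forward using competing exponential waiting times: every possible next event gets its own rate, a waiting time is drawn for each, and the earliest one occurs. This is an illustration under stated assumptions, not the authors' code. `toy_rates` is a hypothetical stand-in for a trained model, and all rates are made-up numbers rather than estimates from the paper.

```python
import math
import random


def sample_next_event(rates, rng):
    """Draw the next event and its waiting time from competing exponentials.

    rates: dict mapping event name -> rate per year (each must be > 0).
    Every event gets its own exponential waiting time; the earliest one wins.
    """
    best_event, best_wait = None, math.inf
    for event, rate in rates.items():
        wait = rng.expovariate(rate)  # exponential waiting time with this rate
        if wait < best_wait:
            best_event, best_wait = event, wait
    return best_event, best_wait


def sample_trajectory(start_age, rates_fn, max_age, rng):
    """Roll a trajectory forward until 'death' occurs or max_age is reached.

    rates_fn(age, history) -> dict of rates. It is a hypothetical stand-in
    for a trained model that conditions on age and past events.
    """
    age, history = start_age, []
    while True:
        event, wait = sample_next_event(rates_fn(age, history), rng)
        age += wait
        if age >= max_age:
            break  # next event would fall outside the simulation window
        history.append((round(age, 2), event))
        if event == "death":
            break
    return history


def toy_rates(age, history):
    """Illustrative rates only; not estimates from the paper."""
    seen = {event for _, event in history}
    rates = {"death": 0.001 * math.exp(0.08 * (age - 40))}
    if "hypertension" not in seen:
        rates["hypertension"] = 0.02
    if "type 2 diabetes" not in seen:
        # Assumed toy interaction: prior hypertension doubles the rate.
        rates["type 2 diabetes"] = 0.01 * (2.0 if "hypertension" in seen else 1.0)
    return rates


if __name__ == "__main__":
    trajectory = sample_trajectory(60.0, toy_rates, 80.0, random.Random(0))
    print(trajectory)
```

Repeating the call with different seeds yields a population of synthetic trajectories, from which quantities such as the fraction of people diagnosed by a given age can be tallied.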
The following summaries describe the main figures in the Nature paper. View the full figures and extended data in the original article.
Figure 1: Schematic of health trajectories (ICD-10 diagnoses, lifestyle and padding tokens at distinct ages), data splits (UK Biobank and Danish registries), and the modified GPT-2 architecture with age encoding, causal attention, and an exponential waiting-time head. Also includes scaling laws and ablation results showing the contribution of each architectural change.
Figure 2: Predicted rates for nine exemplary diagnoses and death as a function of age; comparison with sex- and age-stratified incidence; average AUC by number of training occurrences and by ICD-10 chapter; ROC curves vs clinical and ML comparators; and comparison with the MILTON biomarker-based model.
Figure 3: Design for simulating trajectories from age 60; modelled vs observed disease rates at 70–75 years; fraction of correctly predicted diagnoses over time; simulated vs observed fold changes for smoking, alcohol and BMI; and AUC of models trained on synthetic vs real data.
Figure 4: UMAP projection of token embeddings (diseases cluster by ICD-10 chapter); SHAP contributions for individual trajectories (e.g. pancreatic cancer risk and mortality); SHAP effect matrix across diseases and chapters; and mortality rate over time after selected diagnoses.
Figure 5: AUC comparison between UK Biobank and Danish data; mortality estimates vs ONS national data (immortality bias); data source distribution and missingness; and SHAP matrix by dominant data source, showing learned biases (e.g. hospital-record exclusivity).
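The architecture summary above lists age encoding among the changes to GPT-2. The summary does not say how age is encoded, so the following is only a minimal sketch of one plausible approach: sine and cosine features with geometrically spaced frequencies, as in the standard transformer positional encoding, but evaluated at a continuous age in years instead of a token index. The function name `age_encoding` and the parameters `dim` and `base` are illustrative assumptions, not values from the paper.

```python
import math


def age_encoding(age_years, dim=8, base=10000.0):
    """Encode a continuous age as interleaved sine/cosine features.

    dim and base are illustrative assumptions. Frequencies follow the
    standard transformer form 1 / base**(2i / dim), applied to age in
    years rather than to an integer token position.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    features = []
    for i in range(half):
        freq = 1.0 / (base ** (i / half))  # i / half equals 2i / dim
        features.append(math.sin(age_years * freq))
        features.append(math.cos(age_years * freq))
    return features


if __name__ == "__main__":
    # Two events recorded at the same age receive the same encoding,
    # regardless of where they sit in the token sequence.
    print(age_encoding(60.0))
    print(age_encoding(60.5))
```

Because the encoding depends on age rather than sequence position, tokens that are far apart in time remain distinguishable even when they are adjacent in the sequence.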