How AI Is Building The Future of Longevity Biomarkers
Applications of AI to Uncover Longevity Biomarkers
Introduction
Biomarkers are a key part of medicine, both for clinical purposes and pure research. How can we make progress developing treatments for a disease if we can’t even tell when a treatment is working or not? For aging research to progress and ultimately succeed, accurate and useful biomarkers will be critical. In this article, we provide a brief overview of longevity biomarkers, and then delve [0] into recent progress in creating biomarkers of aging using AI [1].
A Brief Review of Longevity Biomarkers
What even is aging anyway? Developing a mediocre answer is easy, but a precise answer is difficult. For health purposes we don’t care about chronological age per se; we care about biological age. The Biomarkers of Aging Consortium recently published a review (Moqri 2023) outlining a conceptual framework for research on the topic, which defines “biological age” as “an individual’s age defined by the level of age-dependent biological changes, such as molecular and cellular damage accumulation”. This is notably different from chronological age. However, due to ease of measurement, a great deal of research focuses on chronological age.
Note that biological age includes both healthy and pathogenic processes. Stem cells differentiate into every cell type in the body; that maturation is a form of aging. We are primarily interested in damage caused by aging, but an accurate measurement of biological age includes both healthy maturation and unhealthy damage.
Since chronological age is a risk factor for many diseases, and methods intended to measure biological age are often trained on chronological age, we would expect “biological age” to be a risk factor purely through its correlation with chronological age. For a biological age measurement to be useful, it must be a better predictor than chronological age alone. Often studies will use the difference between predicted age and chronological age. There are a variety of terms for this in the literature, including “residual”, “age acceleration”, and “age deviation” or “AgeDev”.
Rutledge 2022 Figure 2. The “AgeDev” is labeled “Δage”
This means that if one is creating a biological age clock, and training it to predict chronological age, it must not be too “accurate”, or else the AgeDev shrinks to zero (Zhang 2019) and we gain no useful information. An ideal clock would accurately predict the risk of mortality, not chronological age.
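As a concrete illustration, the AgeDev is just the element-wise difference between predicted and chronological age. A minimal sketch with made-up numbers (every value here is hypothetical):

```python
import numpy as np

# Hypothetical data: model-predicted biological ages vs. chronological ages.
chronological = np.array([50.0, 62.0, 45.0, 70.0])
predicted = np.array([54.0, 60.0, 49.0, 75.0])

# AgeDev (a.k.a. "age acceleration" or "residual"): predicted minus chronological.
age_dev = predicted - chronological
print(age_dev)  # positive values suggest "faster" biological aging
```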
One of the earliest biomarkers of aging examined the methylation of CpG sites in the DNA (DNAm). Bocklandt 2011 found that just 3 sites could create a predictor with an average error of 5.2 years. Horvath 2013 worked with a larger dataset (in both features and samples), and examined 353 CpG sites across the genome, across different tissues, and found that chronological age could be predicted with an average error of 3.6 years. Interestingly, the sites were evenly divided between positively and negatively correlated. That is, some sites increased methylation with age, and others decreased. This is likely related to the suppression of genes needed only during development (and activation of genes needed only at maturity) rather than cellular senescence; the biomarker still correlates with chronological age even in immortal, non-senescent cells. Since then, much research on aging biomarkers has focused on DNA methylation. The Biomarkers of Aging Challenge (active now!) evaluation dataset uses the Illumina 450k chip, which measures 450,000 methylation sites. One recent review article (Fransquet 2019, Table 2) on epigenetic clocks found that “a higher [age acceleration] (per 5-year increase in age) was associated with an 8% and 15% increased risk of all-cause mortality” [2].
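At their core, clocks of this kind are penalized linear regressions over CpG beta values. Here is a minimal sketch of the idea on synthetic data; ridge regression stands in for the elastic net Horvath actually used, and all dimensions and values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data: 100 samples x 20 CpG sites (beta values in [0, 1]).
n, p = 100, 20
betas = rng.uniform(0, 1, size=(n, p))
true_w = rng.normal(0, 10, size=p)
age = betas @ true_w + rng.normal(0, 2, size=n)  # invented "chronological ages"

# Penalized linear regression; Horvath 2013 used the elastic net,
# ridge is used here to keep the sketch dependency-free.
lam = 1.0
X = np.hstack([np.ones((n, 1)), betas])  # prepend an intercept column
I = np.eye(p + 1)
I[0, 0] = 0.0                            # do not penalize the intercept
w = np.linalg.solve(X.T @ X + lam * I, X.T @ age)

pred = X @ w
mae = np.mean(np.abs(pred - age))
print(f"in-sample mean absolute error: {mae:.2f} years")
```

On real data one would of course hold out a test set; the in-sample error here is just to show the pipeline end to end.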
Epigenetics is not the only game in town though. Erraji-Benchekroun et al identified correlations between gene expression and aging in the brain all the way back in 2005. BiT age (Meyer 2021) used gene expression to create a chronological age clock in C. elegans (roundworms), and also in humans. The same study included 10 patients with Hutchinson-Gilford progeria syndrome (HGPS), a disease primarily characterized by accelerated cellular aging. While we don’t have precise “ground truth”, the method predicted the HGPS patients to be substantially older, as one would hope for a biological age predictor. Lehallier 2019 built a linear model based on protein levels. Interestingly, they found three distinct “waves” of aging, peaking around ages 34, 60, and 78, demonstrating how aging is an uneven process even within an individual.
The Klemera-Doubal method combined proteomic, metabolomic, microbiomic, and even genomic data to measure biological age. I find the use of genomic data particularly interesting, since the DNA sequence that one is born with doesn’t change over time [3]. In this case, the authors developed polygenic risk scores based on genetic factors. Some people are just born to age faster or slower than others.
Clinical biomarkers are common as well. The Healthy Aging Index combines 10 components designed to measure organ function, ranging from the “forced vital capacity” test (eg exhale really hard) to simple blood pressure. Clinical biomarkers require less sophisticated technology to collect than molecular biomarkers, and often measure what we care about most directly. However, we would also like biomarkers which work on animals, for research purposes. Some of these translate directly (eg fasting glucose); others not so much (eg the digit symbol substitution test). Any particular molecular biomarker may raise generalizability concerns of its own, but molecular biomarkers can at least always be collected and evaluated in animals.
A Brief Sample of AI-based Biomarkers
A recent review (Wu 2023) identified over 500 FDA approved AI/ML tools. Prominent examples include HeartFlow (which builds a digital model of a heart based on a coronary CT scan), and the IDx-DR device which uses a trained convolutional neural network (CNN) to recognize diabetic retinopathy from retinal images.
As is often the case with new technology, adoption of these tests is highly uneven:
We observed that the presence of academic medical centers is a significant factor in the adoption of medical AI, as reflected in the fact that over 70% of zip codes with academic centers have at least one medical AI billing, compared with 9% in zip codes without such centers.
In a regression analysis, the presence of academic centers was more important than income or being in a metropolitan area. Medicine is a conservative field, and new technologies are slow to be adopted.
Histology involves staining and visually inspecting biopsies for signs of pathology. This is standard practice for most cancer patients, and the images are extremely high resolution. So there is a large volume of existing data available for training. Major tasks include overall classification (cancer vs no-cancer) and segmentation (highlighting the cancerous cells). Deep learning can even identify genomic mutations from these images (through morphological changes), something which human pathologists cannot reliably detect. Echle 2021 has more.
Writing software to do something humans can do is nice, but real advancements come when we can perform novel analyses. Braman 2021 combined embeddings from radiology, pathology, and genomics to predict the progression of glioma patients with better accuracy than any modality alone. The methodology trained separate neural networks on each modality, so the architecture could be customized properly. Afterwards the embeddings were “fused” into one big predictor. They also introduced a multimodal orthogonalization (MMO) loss function, so that representations from each modality could be encouraged to be independent, rather than repeating the same information acquired in different ways.
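A rough sketch of that late-fusion idea: embed each modality separately, concatenate the per-patient vectors into one predictor input, and penalize redundancy between modalities. The penalty below is only in the spirit of the MMO loss (Braman 2021’s exact formulation differs), and all data here is synthetic:

```python
import numpy as np

def orthogonality_penalty(a, b):
    """Mean squared per-sample cosine similarity between two embedding
    matrices. Near 0 when the modalities carry independent information,
    near 1 when they are redundant. (Illustrative, not the exact MMO loss.)"""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    cos = np.sum(a_n * b_n, axis=1)
    return float(np.mean(cos ** 2))

rng = np.random.default_rng(1)
radiology = rng.normal(size=(8, 16))  # hypothetical per-patient embeddings
pathology = rng.normal(size=(8, 16))  # from two separately trained networks

# Late fusion: concatenate per-modality embeddings into one vector per patient,
# which then feeds a single downstream predictor.
fused = np.concatenate([radiology, pathology], axis=1)
print(fused.shape, orthogonality_penalty(radiology, pathology))
```

In training, a term like this penalty would be added to the task loss so each modality is pushed to contribute information the others lack.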
So that’s all cool stuff. How is AI being used in longevity research?
AI Longevity Biomarkers
As with non-AI methods, some of the earliest AI methods for aging clocks used DNA methylation as input features. DeepMAge used a straightforward dense neural network architecture using DNAm beta vectors as input and (predicted) chronological age as an output. MethylNet [4] is also based on DNA methylation, although the authors first trained it as an autoencoder in a self-supervised manner. The major advantage of that strategy is that it can be pre-trained on unlabeled data, which is typically available in much larger volume than labeled data. (This matters less for merely predicting chronological age, since that label is usually available, but mortality risk is much harder to come by.) Another advantage is the autoencoder strategy can be used to create latent embedding vectors, essentially having the network perform dimensionality reduction. The latent vectors can then be combined with other features for predicting age, or other outcomes of interest.
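To make the autoencoder strategy concrete: pre-train an encoder on unlabeled data, then fit a supervised predictor on the latent vectors. The sketch below uses a truncated SVD as a stand-in for the pretraining step (a linear autoencoder with squared loss learns the same subspace); MethylNet itself uses a deep autoencoder, and all data here is synthetic:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for a large unlabeled methylation dataset: 200 samples x 50 CpG sites.
unlabeled = rng.normal(size=(200, 50))

# "Pretraining": truncated SVD gives the subspace a linear autoencoder would learn.
mean = unlabeled.mean(axis=0)
_, _, vt = np.linalg.svd(unlabeled - mean, full_matrices=False)
encoder = vt[:10].T  # maps 50 input features to a 10-dim latent embedding

def encode(x):
    return (x - mean) @ encoder

# "Fine-tuning": the latent vectors feed a supervised predictor trained on
# a much smaller labeled subset (ordinary least squares here).
labeled = rng.normal(size=(30, 50))
ages = rng.uniform(20, 80, size=30)  # invented labels
Z = np.hstack([np.ones((30, 1)), encode(labeled)])
w, *_ = np.linalg.lstsq(Z, ages, rcond=None)
print(Z @ w)  # predicted ages from latent features
```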
Of course, research has been published using other modalities as well. Putin 2016 created an ensemble of deep neural networks, trained on 46 standardized blood markers. Holzscheck 2021 used a deep neural network with gene expression as input. The architecture was feed-forward, but with different pathways through the network, each capturing one of 50 selected gene sets. The sparsity of connections serves as a form of regularization, and since each network pathway is a biological process, the output is readily interpretable.
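The pathway-structured sparsity can be sketched as a dense layer whose weight matrix is multiplied by a binary gene-set membership mask, so each hidden unit only sees the genes in its pathway. The gene sets and sizes below are invented for illustration (Holzscheck 2021 used 50 curated gene sets):

```python
import numpy as np

rng = np.random.default_rng(3)

n_genes, n_pathways = 12, 3

# Hypothetical pathway membership: each column marks the genes in one gene set.
mask = np.zeros((n_genes, n_pathways))
mask[0:4, 0] = 1    # pathway 0 reads genes 0-3
mask[4:8, 1] = 1    # pathway 1 reads genes 4-7
mask[8:12, 2] = 1   # pathway 2 reads genes 8-11

# Masking the weights enforces the sparse, pathway-structured connectivity.
W = rng.normal(size=(n_genes, n_pathways)) * mask

expression = rng.normal(size=(5, n_genes))          # 5 samples of gene expression
pathway_activity = np.maximum(expression @ W, 0)    # ReLU; one unit per pathway
print(pathway_activity.shape)
```

Because each hidden unit corresponds to a named biological process, its activation can be read off directly, which is where the interpretability comes from.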
On the macro side, the FaceCnnAge model (Chen 2020) uses a convolutional neural network architecture to examine a 3D image of a face and estimate an age. Cole 2017 also used CNNs to examine images, but on MRI scans of the brain. Yin 2023 did the same, and demonstrated that the predicted biological age correlated much more strongly with cognitive impairment (from mild to Alzheimer's disease) than chronological age.
AI methods typically require large amounts of training data, and their comparative advantage (compared to classical ML methods) strongly correlates with the complexity of the underlying data. In my humble opinion, using AI on a single, straightforward data type is overkill. The real advantage comes when analyzing complex multidimensional data. This means 3D images, or the combination of multiple modalities.
Precious1GPT (Urban 2023, code on GitHub) used a transformer architecture capable of using gene expression and methylation data, and trained it to predict chronological age. This model was then fine-tuned to classify case vs control for 4 age-related diseases (idiopathic pulmonary fibrosis, chronic obstructive pulmonary disease, Parkinson’s disease, and heart failure). This model measured induced pluripotent stem cells as “younger”, and predicted age decreased as time since induction increased.
One of the more interesting papers I’ve read this year was “Using sequences of life-events to predict human lives” (pre-print). Vector embeddings are a common technique these days, used for natural language, images, and basically any complex data which we would like to simplify. Savcisens et al used a transformer architecture and generated a vector space for life-events. Jobs, job loss, income level, birth year, health events, you name it. They named the method “life2vec”.
Savcisens et al Figure 2: Two-dimensional projection of the concept space (using PaCMAP)
The authors used this method to predict mortality, and found that it outperformed the baseline methods. It is worth mentioning that of the 5 methods they tested, the 3 neural network methods substantially outperformed the simpler ones.
Combining multiple data types is always tricky; that’s one benefit of vector embeddings. A simple approach is to just concatenate vectors; we could also transform and/or take weighted averages. life2vec doesn’t incorporate any biomarkers, but combining the embeddings with biomarker embedding methods (eg MethylNet or Precious1GPT) would be straightforward. Mathematically, at least. Collecting the training dataset becomes significantly more complicated.
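Both combination strategies are one-liners once the embeddings exist. A minimal sketch with synthetic vectors (the variable names only gesture at the methods mentioned above; nothing here comes from the actual models):

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical per-person embeddings from two sources.
life_events = rng.normal(size=(4, 8))   # e.g. a life2vec-style embedding
methylation = rng.normal(size=(4, 8))   # e.g. a MethylNet-style latent vector

# Option 1: concatenation keeps all information but doubles the dimension.
concat = np.concatenate([life_events, methylation], axis=1)

# Option 2: a weighted average keeps the dimension fixed, but is only
# meaningful if the two spaces are aligned and comparably scaled.
alpha = 0.7
blended = alpha * life_events + (1 - alpha) * methylation

print(concat.shape, blended.shape)
```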
Conclusion
So what did we learn? We learned that DNA methylation measurements dominate the aging biomarker landscape. They are not likely to be replaced by a single different modality, but rather by methods which combine multiple modalities. Combining multiple data-types is where AI methods excel. By weaving together information from genomics, proteomics, metabolomics, and beyond, AI-driven methodologies offer a holistic approach to biomarker development. With each advancement, we inch closer to unraveling the complexities of aging and unlocking the secrets to a longer, healthier life.
Footnotes
[0] Less than 1% of this article was written by ChatGPT, although it did suggest this particular word. Maybe PaulG was right.
[1] As I was writing this piece, a nice review article was published on the same subject.
[2] Though the same article found substantial publication bias.
[3] Technically DNA can accumulate mutations, but the change is much smaller than other markers.
[4] Note for the software-inclined: The authors published the code and data for MethylNet on Code Ocean and GitHub.