Machine learning methods can predict chemical compounds capable of extending lifespan in model organisms.
Many diseases of aging represent an unmet need, and it is vital that we develop interventions targeting the biological processes of aging in order to prevent and cure such diseases, extend healthspan and reduce the burden on overladen healthcare systems.
Non-pharmacological interventions such as genetic interventions and dietary restriction are effective in extending the lifespan of model organisms. However, they have limitations: very few people willingly undergo dietary restriction for a long duration, and genetic interventions are difficult to apply in humans. Perhaps it is time for the field to move on – after all, the strongest longevity result for dietary restriction in mice was published in 1986 [1]. This has led some to see pharmacological interventions as the most effective type of antiaging intervention.
Longevity.Technology: A wide range of compounds that extend the lifespan of model organisms have been identified through in vivo experiments; the DrugAge database alone comprises information about 1096 compounds. However, it is not feasible to analyze manually the large volumes of data in DrugAge or other databases. Enter machine learning: ML algorithms can help analyze the data held in such databases.
A new study published in Aging developed datasets using four different types of features that described properties of chemical compounds (including drugs) and of the proteins that interact with those compounds. The researchers then used supervised machine learning (ML) methods to predict whether a compound could extend the lifespan of C. elegans.
The four types of predictive features were based on drug (compound)-protein interactions, Gene Ontology (GO) terms, physiology terms from a Phenotype Ontology for C. elegans, and proteins encoded by aging-related genes together with their interactions with compounds. Of these, only GO terms have previously been used widely in the prediction of lifespan-extending compounds. The researchers also used an approach that selected the best of five filter methods for feature selection, and carried out a biological analysis of the most important predictive features.
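To make the idea of a filter method concrete, here is a minimal Python sketch. It turns compound-protein interactions into binary features and ranks each feature by a chi-squared score against the class label, independently of any classifier. The article does not name the five filter methods the study compared, so chi-squared is an assumption chosen only for illustration, and the compound names, protein IDs and labels are toy placeholders rather than data from the study.

```python
def build_binary_features(interactions):
    """Map {compound: set of protein IDs} to (feature_names, {compound: 0/1 row})."""
    proteins = sorted({p for ps in interactions.values() for p in ps})
    rows = {c: [1 if p in ps else 0 for p in proteins]
            for c, ps in interactions.items()}
    return proteins, rows


def chi2_score(column, labels):
    """Chi-squared statistic of the 2x2 table for one binary feature vs a 0/1 label."""
    n = len(labels)
    a = sum(1 for f, y in zip(column, labels) if f == 1 and y == 1)
    b = sum(1 for f, y in zip(column, labels) if f == 1 and y == 0)
    c = sum(1 for f, y in zip(column, labels) if f == 0 and y == 1)
    d = n - a - b - c
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # feature or label is constant, so it carries no signal
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def select_top_k(feature_names, rows, labels_by_compound, k):
    """Score every feature on its own (a filter method) and keep the k best."""
    compounds = sorted(rows)
    labels = [labels_by_compound[c] for c in compounds]
    scores = {}
    for j, name in enumerate(feature_names):
        column = [rows[c][j] for c in compounds]
        scores[name] = chi2_score(column, labels)
    ranked = sorted(feature_names, key=lambda nm: (-scores[nm], nm))
    return ranked[:k], scores


# Toy placeholders: 1 = lifespan-extending (positive class), 0 = not.
toy_interactions = {
    "cmpA": {"P1", "P2"},
    "cmpB": {"P1"},
    "cmpC": {"P2", "P3"},
    "cmpD": {"P3"},
}
toy_labels = {"cmpA": 1, "cmpB": 1, "cmpC": 0, "cmpD": 0}

names, rows = build_binary_features(toy_interactions)
selected, scores = select_top_k(names, rows, toy_labels, k=2)
print(selected)
```

In this toy set, P1 occurs only with positive compounds and P3 only with negative ones, so both score highly, while P2 is split evenly and scores zero. Choosing between several filters, as the study did, would mean repeating this with different scoring functions and comparing the resulting models on held-out data.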
Two versions of each dataset were created. In ‘version 1’, every compound-protein interaction stored in the STITCH database was used to create predictive features, while in ‘version 2’, only those compound-protein interactions with a confidence score of at least 45% in STITCH were used. The version 1 datasets were reported to contain more noise, since their interactions are less reliable than those in the version 2 datasets. The best models from the version 1 datasets (the GOTerms_1 and Interactors_1 models) and from the version 2 datasets (the GOTerms_2 and Interactors_2 models) were therefore also evaluated on an external D. melanogaster dataset. The results indicated that the two best models from the version 2 datasets generalized better, even on this external dataset [2].
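The difference between the two dataset versions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the record layout (compound, protein, score) is hypothetical, the scores are assumed to be fractions between 0 and 1 so that the 45% cutoff becomes 0.45, and the records themselves are toy placeholders.

```python
from collections import defaultdict


def build_versions(records, min_confidence=0.45):
    """Return (version_1, version_2) as {compound: set of interacting proteins}.

    version_1 keeps every interaction; version_2 keeps only interactions
    whose confidence score is at least min_confidence.
    """
    version_1 = defaultdict(set)
    version_2 = defaultdict(set)
    for compound, protein, score in records:
        version_1[compound].add(protein)
        if score >= min_confidence:
            version_2[compound].add(protein)
    return dict(version_1), dict(version_2)


# Toy placeholder records: (compound, protein, confidence score in [0, 1]).
toy_records = [
    ("cmpA", "P1", 0.90),
    ("cmpA", "P2", 0.20),
    ("cmpB", "P3", 0.45),
    ("cmpC", "P4", 0.10),
]

v1, v2 = build_versions(toy_records)
print(v1)
print(v2)
```

Note that a compound whose interactions all fall below the cutoff, such as cmpC here, has no features at all in version 2, which is one way a stricter threshold trades coverage for reliability.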
The results also indicated that, for the GOTerms_1 model, most compounds associated with the GO term ‘Respiratory chain complex II assembly’ were life-extending, or positive-class, compounds. That the top GO term concerned the mitochondrial respiratory chain points to its role in aging and in longevity-targeted pharmacology. For the GOTerms_2 model, most compounds associated with the GO term ‘Glutathione metabolic process’ were positive-class compounds.
“One noteworthy feature was the GO term ‘Glutathione metabolic process’, which plays an important role in cellular redox homeostasis and detoxification,” the authors explain [2].
The models also predicted the most promising novel lifespan-extending compounds from a list of previously unlabelled compounds. One compound that made the list was nitroprusside, which is used as an antihypertensive medication.
Other compounds that were reported to have the highest probability of belonging to the life-extension class from both models included Streptomycin, Ferric cation, Potassium hydrogen DL-aspartate, Flavin adenine dinucleotide, and NADH [2].
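Ranking unlabelled compounds by their predicted probability of belonging to the life-extension class can be sketched as follows. The article does not describe the classifier at this point, so a small Bernoulli naive Bayes model is swapped in purely as a stand-in; the training rows, candidate names and feature vectors are toy placeholders, not results from the study.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary rows X and 0/1 labels y."""
    n_features = len(X[0])
    model = {}
    for cls in (0, 1):
        rows = [x for x, label in zip(X, y) if label == cls]
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        # Laplace-smoothed probability that each feature is present in this class
        p_present = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[cls] = (prior, p_present)
    return model


def prob_positive(model, x):
    """Posterior probability that feature vector x belongs to class 1."""
    log_scores = {}
    for cls, (prior, p_present) in model.items():
        s = math.log(prior)
        for xj, p in zip(x, p_present):
            s += math.log(p if xj else 1.0 - p)
        log_scores[cls] = s
    m = max(log_scores.values())  # subtract the max for numerical stability
    exp = {c: math.exp(s - m) for c, s in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


def rank_unlabelled(model, unlabelled, top_n):
    """Return the top_n (name, probability) pairs, highest probability first."""
    scored = [(name, prob_positive(model, x)) for name, x in unlabelled.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Toy placeholders: two binary features, label 1 = lifespan-extending.
X_train = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_train = [1, 1, 0, 0]
candidates = {"cand_1": [1, 0], "cand_2": [0, 1], "cand_3": [1, 1]}

nb_model = train_bernoulli_nb(X_train, y_train)
top = rank_unlabelled(nb_model, candidates, top_n=2)
print(top)
```

The output of such a ranking is a list of hypotheses ordered by model confidence, which is why predictions of this kind still need to be confirmed experimentally.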
The approach can therefore pick out, from previously unlabelled compounds, promising candidates for lifespan extension.
“Overall, our work opens avenues for future work in employing machine learning to predict novel life-extending compounds,” conclude the authors [2].
Future research must include lab experiments with C. elegans to confirm such computational predictions.
It is worth repeating that DrugAge contains only 1097 drugs. As Matt Kaeberlein tweeted: “Imagine what we might find if we quantitatively, rigorously tested 1,000,000 interventions for lifespan and health effects.”