Illustrative image of a genetics lab workstation. Credit: Mount Sinai Health System
A research team at the Icahn School of Medicine at Mount Sinai has built a machine-learning variant scoring system that combines a general “pathogenic vs. benign” estimate with broad phenotype-linked probabilities, with the goal of narrowing the list of candidate disease variants returned by sequencing.
The tool, Variant-to-Phenotype (V2P), is described in “Expanding the utility of variant effect predictions with phenotype-specific models,” published Nov. 28, 2025, in Nature Communications. Mount Sinai also announced the work in a news release.
Why phenotype context matters
Clinical sequencing can produce thousands of variants to triage. Many computational predictors focus on a single output: how likely a variant is to be damaging in general. V2P instead conditions its predictions on phenotype groupings from the Human Phenotype Ontology (HPO), so the output can be used to prioritize variants that match a patient’s presentation. Making that match is a common bottleneck in rare-disease workups.
How V2P works
V2P is an ensemble, multi-task model that ingests variant annotations spanning gene-level features (disease and pathway associations), protein-sequence and structure features, protein interaction network features, and variant-level measures such as conservation.
Its output comprises 24 values from 0 to 1: one general pathogenicity probability plus 23 probabilities tied to first-level “phenotypic abnormality” classes in HPO (examples in the paper include abnormalities of the nervous system and neoplasms).
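To make the shape of that output concrete, here is a minimal Python sketch of how one variant's scores might be held in memory. The class name, field names, variant identifier and numeric values are all made up for illustration and are not taken from the V2P code or paper. Only the two category names mentioned in the article are used; a real score vector would carry all 23.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantScores:
    """Hypothetical container for one variant's V2P-style output."""
    variant_id: str
    pathogenicity: float          # general pathogenic-vs-benign probability
    phenotype: dict[str, float]   # HPO category name -> probability

    def __post_init__(self):
        # Every score is described as a value from 0 to 1, so reject anything else.
        checks = [("pathogenicity", self.pathogenicity), *self.phenotype.items()]
        for name, p in checks:
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"{name} score {p} is outside [0, 1]")

    def top_categories(self, n=3):
        """Return the n highest-scoring phenotype categories, best first."""
        ranked = sorted(self.phenotype.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


scores = VariantScores(
    variant_id="example-variant-1",   # made-up identifier
    pathogenicity=0.91,               # made-up values throughout
    phenotype={"Abnormality of the nervous system": 0.84, "Neoplasm": 0.12},
)
print(scores.top_categories(1))
```

Keeping the general score separate from the per-category scores mirrors the "one plus 23" description above and makes it easy to sort on either.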
For training, the authors report using 252,125 pathogenic variants from HGMD and 244,231 putatively benign variants from gnomAD v2.1.1 exomes, spanning 6,620 genes.
Benchmarking highlights
To test whether the model aligns with experimental evidence, the authors compare predictions with results from deep mutational scanning assays across dozens of proteins, and with massively parallel reporter assay (MPRA) datasets for regulatory elements. They report V2P achieves the highest median correlation across the MPRA sets they tested, with performance in the same range as CADD and above FATHMM on the reported comparisons.
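A "median correlation across datasets" summary can be sketched in a few lines of Python. The article does not say which correlation coefficient the authors used, so the Spearman rank correlation below is an assumption, and the dataset names and numbers are toy values rather than results from the paper.

```python
from math import sqrt
from statistics import median


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def median_correlation(datasets):
    """Median of per-dataset correlations between predictions and measurements."""
    return median(spearman(pred, measured) for pred, measured in datasets.values())


# Toy data: dataset name -> (predicted scores, measured assay effects).
datasets = {
    "assay_a": ([0.1, 0.4, 0.8, 0.9], [0.2, 0.3, 0.7, 1.5]),
    "assay_b": ([0.1, 0.4, 0.8, 0.9], [1.5, 0.7, 0.3, 0.2]),
    "assay_c": ([0.2, 0.5, 0.6, 0.9], [1.0, 3.0, 2.0, 4.0]),
}
print(median_correlation(datasets))
```

Taking the median rather than the mean keeps one unusually easy or hard assay from dominating the summary.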
To approximate a diagnostic workflow, the team ranked variants in patient exomes using the patient’s pathology alongside V2P phenotype scores, including an evaluation in 116 exomes from patients with rare immune disorders.
In the Mount Sinai Health System release, the team also says V2P often placed the true disease-causing variant among the top 10 candidates in tests on de-identified patient data.
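The ranking idea can be illustrated with a short Python sketch. The article does not describe exactly how the phenotype information and scores were combined, so the rule here (sort by the score for the patient's phenotype category, then break ties with the general pathogenicity score) is an assumption, and the variant records, category label and values are invented.

```python
def rank_variants(variants, patient_category):
    """Sort variants by the patient's phenotype-category score, best first.

    Ties fall back to the general pathogenicity score. This ordering rule is
    an assumption made for illustration, not the published procedure.
    """
    return sorted(
        variants,
        key=lambda v: (v["phenotype"].get(patient_category, 0.0), v["pathogenicity"]),
        reverse=True,
    )


def causal_rank(ranked, causal_id):
    """1-based position of the known causal variant, or None if absent."""
    for position, v in enumerate(ranked, start=1):
        if v["id"] == causal_id:
            return position
    return None


# Toy exome with made-up identifiers and scores.
category = "Abnormality of the immune system"
exome = [
    {"id": "var-1", "pathogenicity": 0.95, "phenotype": {category: 0.10}},
    {"id": "var-2", "pathogenicity": 0.80, "phenotype": {category: 0.88}},
    {"id": "var-3", "pathogenicity": 0.30, "phenotype": {category: 0.05}},
]

ranked = rank_variants(exome, category)
position = causal_rank(ranked, "var-2")
in_top_10 = position is not None and position <= 10
print(position, in_top_10)
```

In this toy case the variant with the highest general pathogenicity score (var-1) drops below var-2 once the patient's phenotype category is taken into account, which is the behavior phenotype-conditioned scoring is meant to provide.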
R&D angle: from diagnosis to targets
Beyond case review, the researchers argue phenotype-conditioned scoring could help connect genes and pathways to disease classes, information that can be useful when selecting targets or prioritizing mechanisms for follow-up.
The phenotype outputs are intentionally broad (23 top-level HPO categories), and the authors note that limited labeled data makes finer-grained, supervised phenotype prediction difficult today.
They also flag evaluation pitfalls, such as circularity between training and test data, as a general issue in this area.
According to the paper, the team precomputed V2P scores for all possible SNVs in hg38 and for gnomAD indels, and provides a web framework for scoring user-specified variants.