
AlphaFold works because it treats structure as an engineering problem, not a simulation. Credit: Getty Images.
Artificial intelligence isn’t just changing how we search the web or generate images. It is reshaping how scientists approach one of biology’s oldest problems: the folding of proteins.
Google DeepMind introduced AlphaFold in 2018 as a system for predicting protein structures. But it was AlphaFold 2, released in 2020, that fundamentally altered the field. The system achieved near-experimental accuracy in predicting protein structures, solving a problem that had resisted decades of computational and experimental effort.
Since then, AlphaFold has moved from a research milestone to core scientific infrastructure. The tool is now used by more than three million researchers across nearly 190 countries, and its predictions underpin work in structural biology, drug discovery, and enzyme engineering.
At its core, AlphaFold predicts a protein’s three-dimensional structure directly from its amino acid sequence using deep learning. This matters because a protein’s structure determines how it functions—what it binds to, how it catalyzes reactions, and how it fails. Structural insight is essential for understanding disease mechanisms and designing drugs.
What once required months or years of experimental work using X-ray crystallography, cryo-electron microscopy, or NMR spectroscopy can now often be achieved computationally in hours, provided suitable sequence data exists.

How amino acid chains fold into 3D protein structures. Credit: Kep17/Wikimedia Commons.
But speed is only part of the tale. The real engineering feat lies in AlphaFold’s neural network architecture. Combining advanced attention mechanisms with iterative refinement allows AlphaFold to overcome technical bottlenecks that have existed for decades.
Biology depends on proteins to execute essential tasks, from cellular signaling to metabolism. Proteins start as one-dimensional amino acid chains before spontaneously morphing into complex 3D structures.
How a protein folds determines what it does, what it binds to, and how it performs biologically. Misfolding, in turn, is linked to serious diseases. Understanding how proteins fold is therefore important for drug discovery and enzyme engineering.
For decades, determining protein structures relied on experimental methods: X-ray crystallography, cryo-electron microscopy (cryo-EM), and nuclear magnetic resonance (NMR) spectroscopy. These methods demand specialized equipment and expertise, and they are slow, often taking months or years per protein.
Computational approaches promised a shortcut: predict the structure from the sequence and bypass expensive experiments.
In practice, early methods fell short. Physics-based simulations were prohibitively slow, with supercomputers needing weeks to simulate what amounted to microseconds of folding. Other techniques, like homology modeling, predicted structures by comparison with related proteins of known structure but failed when no close homologs were available.
Template-free methods existed, but they proved less reliable. Their predictions were only about 40 to 60 percent accurate, which is not sufficient for drug design or detailed studies.
AlphaFold 2’s breakthrough wasn’t just about more computing power or bigger datasets. The model introduced a new way of predicting protein structures, grounded in three core innovations: the Evoformer, invariant point attention, and recycling.
The Evoformer serves as AlphaFold’s main neural network architecture. It is a pattern recognition engine that processes two key inputs simultaneously. First, it analyzes multiple sequence alignments (MSAs) showing how a protein has evolved across species.
Second, it handles pairwise relationships between amino acids. These MSAs matter because amino acids that are close in the 3D structure tend to co-evolve—mutations in one position are often matched by compensatory changes nearby to preserve folding. The Evoformer’s attention mechanisms learn to spot these patterns.
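The co-evolution signal can be made concrete with a classic statistic, mutual information between alignment columns. This is only an illustration of the pattern, not what AlphaFold computes: the Evoformer learns such couplings through attention rather than through a fixed formula. The toy alignment and the function names here are hypothetical.

```python
from collections import Counter
from math import log2


def column(msa, j):
    """Return column j of the alignment as a list of residues."""
    return [seq[j] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 1 and 3 change together (K with E,
# R with D), while column 2 varies independently of column 1.
toy_msa = [
    "AKGEL",
    "AKSEL",
    "ARGDL",
    "ARSDL",
]

mi_coupled = mutual_information(toy_msa, 1, 3)
mi_uncoupled = mutual_information(toy_msa, 1, 2)
print(f"columns 1 and 3: {mi_coupled:.2f} bits")
print(f"columns 1 and 2: {mi_uncoupled:.2f} bits")
```

A high score between two columns hints that the corresponding residues may sit close together in the folded protein, which is the kind of evidence the attention layers are trained to pick up.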
This module stacks 48 blocks that update both MSA and pairwise representations repeatedly. Within each block, specialized attention layers have different jobs. MSA row attention focuses on individual sequences, while MSA column attention spots evolutionary patterns across them.
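The difference between row and column attention is easiest to see in terms of which axis of the MSA representation gets mixed. The following NumPy sketch is a simplified assumption: it uses single-head attention with no learned projections, gating, or pair bias, all of which the real model adds. Function names and array sizes are illustrative.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x):
    """Plain dot-product self-attention over the second-to-last axis.

    x has shape (..., length, channels); no learned weights are used.
    """
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x


def msa_row_attention(msa):
    """Mix information along residues, separately within each sequence."""
    # msa: (num_sequences, num_residues, channels)
    return self_attention(msa)


def msa_column_attention(msa):
    """Mix information across sequences, separately at each residue position."""
    transposed = np.swapaxes(msa, 0, 1)  # (num_residues, num_sequences, channels)
    return np.swapaxes(self_attention(transposed), 0, 1)


rng = np.random.default_rng(0)
msa_repr = rng.normal(size=(4, 6, 8))  # 4 sequences, 6 residues, 8 channels

row_out = msa_row_attention(msa_repr)
col_out = msa_column_attention(msa_repr)
print(row_out.shape, col_out.shape)
```

In this sketch, changing one sequence leaves the row-attention output for every other sequence untouched, while the column-attention output shifts for all of them, since that is the step that compares sequences with one another.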

The architecture of AlphaFold 2. Credit: John Jumper et al/Wikimedia Commons.
Triangle attention then brings in a critical geometric constraint: the triangle inequality. By updating each pair of amino acids using information from a third, it encourages predicted distances to stay mutually consistent and physically realistic.
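To see what the triangle inequality rules out, consider a small check over a distance matrix. This is a sketch of the constraint itself, not of the triangle attention layers, which operate on learned pair features. The distances (in ångströms) and the function name are made up for illustration.

```python
from itertools import permutations


def triangle_violations(dist, tol=1e-9):
    """Return (i, j, k) triples where d(i, k) > d(i, j) + d(j, k)."""
    n = len(dist)
    return [
        (i, j, k)
        for i, j, k in permutations(range(n), 3)
        if dist[i][k] > dist[i][j] + dist[j][k] + tol
    ]


# Three residues whose pairwise distances can coexist in 3D space.
consistent = [
    [0.0, 3.8, 5.5],
    [3.8, 0.0, 3.8],
    [5.5, 3.8, 0.0],
]

# Residues 0 and 2 are each 3.8 from residue 1, so they cannot be 12.0 apart.
inconsistent = [
    [0.0, 3.8, 12.0],
    [3.8, 0.0, 3.8],
    [12.0, 3.8, 0.0],
]

print(triangle_violations(consistent))
print(triangle_violations(inconsistent))
```

The second matrix describes a set of distances that no arrangement of three points could satisfy, which is exactly the kind of inconsistency a pairwise representation should be steered away from.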
The Structure Module transforms Evoformer outputs into 3D coordinates using invariant point attention (IPA). Unlike traditional attention that processes abstract vectors, IPA works directly in 3D space. It treats each amino acid as a rigid triangle formed by key backbone atoms and ensures predictions remain the same regardless of rotation or translation, because a protein’s shape is independent of its orientation.
IPA updates these triangles over eight iterative layers, considering the Evoformer’s abstract features, current 3D positions, and geometric relationships. This 3D-aware attention enables AlphaFold to respect physical constraints like bond lengths and angles without every rule being explicitly hand-coded.
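The invariance property can be demonstrated without any attention at all. The sketch below, which assumes a Gram-Schmidt construction of a local frame from three backbone atoms (N, CA, C), expresses a neighboring point in that frame and shows the result is unchanged when the whole protein is rotated and shifted. The coordinates are rough, hypothetical values and the helper names are illustrative.

```python
import numpy as np


def backbone_frame(n_atom, ca_atom, c_atom):
    """Build a local frame (rotation, origin) from three backbone atoms."""
    e1 = c_atom - ca_atom
    e1 = e1 / np.linalg.norm(e1)
    u = n_atom - ca_atom
    e2 = u - np.dot(u, e1) * e1  # remove the component along e1
    e2 = e2 / np.linalg.norm(e2)
    e3 = np.cross(e1, e2)
    rotation = np.stack([e1, e2, e3], axis=1)  # columns are the local axes
    return rotation, ca_atom


def to_local(rotation, origin, point):
    """Express a global point in the coordinates of a local frame."""
    return rotation.T @ (point - origin)


def rot_z(angle):
    """Rotation matrix about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Hypothetical coordinates in angstroms for one residue and a nearby atom.
n_atom = np.array([-0.5, 1.4, 0.0])
ca_atom = np.array([0.0, 0.0, 0.0])
c_atom = np.array([1.5, 0.0, 0.0])
neighbor = np.array([2.0, 1.0, 3.0])

global_rot = rot_z(0.9)
shift = np.array([10.0, -4.0, 7.0])


def move(point):
    """Apply the same rigid motion to any point of the protein."""
    return global_rot @ point + shift


rotation, origin = backbone_frame(n_atom, ca_atom, c_atom)
local_before = to_local(rotation, origin, neighbor)

rotation2, origin2 = backbone_frame(move(n_atom), move(ca_atom), move(c_atom))
local_after = to_local(rotation2, origin2, move(neighbor))

print(local_before, local_after)
```

Because every quantity is measured relative to a residue's own frame, the network never has to learn that a rotated protein is the same protein.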
AlphaFold doesn’t stop at a single guess. Its recycling mechanism feeds its own predictions back into the network multiple times—typically three rounds during inference—refining the structure step by step. The process produces highly accurate predictions by correcting rough folds, aligning secondary structures, and resolving clashes each time.
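In code, recycling amounts to a loop that hands the previous output back in as an extra input. The sketch below assumes one initial pass followed by three recycling rounds. The `toy_model` is a hypothetical stand-in that refines a numerical estimate of a square root, chosen only because each pass visibly improves on the last. It has nothing to do with protein structure.

```python
from math import sqrt


def predict_with_recycling(model, features, num_recycles=3):
    """Run the model once, then feed its output back in num_recycles times."""
    previous = None
    history = []
    for _ in range(num_recycles + 1):
        previous = model(features, previous)
        history.append(previous)
    return previous, history


def toy_model(features, previous):
    """Hypothetical stand-in: one refinement step toward sqrt(features)."""
    if previous is None:
        return features / 2.0  # rough first guess
    return 0.5 * (previous + features / previous)


final, history = predict_with_recycling(toy_model, 2.0)
errors = [abs(estimate - sqrt(2.0)) for estimate in history]
print(history)
```

Each pass starts from a better guess than the one before, so the error shrinks from round to round, which mirrors how a rough fold is tightened over successive cycles.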
Trained end-to-end on experimentally determined structures from the Protein Data Bank (PDB), AlphaFold uses multiple loss functions to penalize errors in distances, angles, clashes, and geometry. Notably, the Frame Aligned Point Error (FAPE) loss helps the model nail local atomic positioning, encouraging precise geometry over just global similarity.
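A simplified version of FAPE shows why it rewards local accuracy. Every atom is viewed from every residue's local frame, in both the prediction and the true structure, and the loss averages the distances between the two views. The clamp and scale of 10 ångströms and the small epsilon follow the published description as best recalled and should be treated as assumptions. The frames and coordinates are invented.

```python
import numpy as np


def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def to_local(rotations, translations, points):
    """View every point from every frame: R_i^T (x_j - t_i)."""
    diff = points[None, :, :] - translations[:, None, :]  # (frames, points, 3)
    return np.einsum("fba,fpb->fpa", rotations, diff)


def fape(pred_frames, pred_points, true_frames, true_points,
         clamp=10.0, scale=10.0, eps=1e-4):
    """Simplified Frame Aligned Point Error (assumed constants, see lead-in)."""
    pred_local = to_local(*pred_frames, pred_points)
    true_local = to_local(*true_frames, true_points)
    dist = np.sqrt(np.sum((pred_local - true_local) ** 2, axis=-1) + eps)
    return float(np.mean(np.minimum(dist, clamp)) / scale)


# Hypothetical "true" structure: two residue frames and three atoms.
true_rot = np.stack([np.eye(3), rot_z(0.5)])
true_trans = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0]])
true_points = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.0, 3.5, 1.0]])

# Prediction 1: the true structure, rigidly rotated and shifted as a whole.
g = rot_x(0.7) @ rot_z(1.1)
s = np.array([20.0, -5.0, 3.0])
moved_rot = g @ true_rot
moved_trans = true_trans @ g.T + s
moved_points = true_points @ g.T + s
loss_moved = fape((moved_rot, moved_trans), moved_points,
                  (true_rot, true_trans), true_points)

# Prediction 2: same, but one atom is misplaced by 2 angstroms.
bad_points = moved_points.copy()
bad_points[2] += np.array([2.0, 0.0, 0.0])
loss_perturbed = fape((moved_rot, moved_trans), bad_points,
                      (true_rot, true_trans), true_points)

print(loss_moved, loss_perturbed)
```

The rigidly moved copy scores at the floor set by epsilon, since nothing has changed from any residue's point of view, while the misplaced atom is penalized from every frame that can see it.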
Additionally, AlphaFold employs masked MSA training—like BERT models in language processing—randomly hiding parts of sequence alignments during training. By doing this, the model picks up robust evolutionary patterns, allowing it to predict structures even when data is limited.
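The masking step itself is simple to sketch. The version below hides a random subset of alignment positions and records the original residues as training targets. The mask symbol, the masking rate, and the function name are assumptions for illustration, and real BERT-style schemes also replace some positions with random residues instead of a mask token.

```python
import random

MASK = "*"  # hypothetical mask symbol, assumed absent from the alignment


def mask_msa(msa, mask_rate=0.15, seed=0):
    """Hide random positions; return the masked MSA and the hidden residues."""
    rng = random.Random(seed)
    masked, targets = [], {}
    for row, seq in enumerate(msa):
        chars = list(seq)
        for col, residue in enumerate(chars):
            if rng.random() < mask_rate:
                targets[(row, col)] = residue  # what the model must recover
                chars[col] = MASK
        masked.append("".join(chars))
    return masked, targets


# Hypothetical toy alignment ("-" marks a gap).
toy_msa = [
    "MKTAYIAKQR",
    "MKSAYLAKQR",
    "MRTA-IAKHR",
]

masked_msa, targets = mask_msa(toy_msa, mask_rate=0.3, seed=1)

# Putting the recorded residues back should reproduce the original alignment.
restored = [list(seq) for seq in masked_msa]
for (row, col), residue in targets.items():
    restored[row][col] = residue
restored = ["".join(chars) for chars in restored]

for seq in masked_msa:
    print(seq)
```

During training, the model would see only `masked_msa` and be scored on how well it recovers the residues stored in `targets`, which pushes it to infer each position from the rest of the alignment.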
AlphaFold has had an undeniable impact on science.
In recognition of this work, Demis Hassabis and John Jumper of Google DeepMind were awarded the 2024 Nobel Prize in Chemistry, shared with David Baker for advances in protein design.
But AlphaFold’s significance goes beyond awards and accolades. The system has sped up drug discovery by producing highly accurate protein models that inform molecular design and target validation. For instance, it has been used to identify potential drug targets for diseases such as COVID-19 and neglected tropical diseases.
AlphaFold’s structural insights have aided enzyme engineering efforts, helping researchers redesign enzymes to improve catalytic efficiency for biomanufacturing and environmental remediation.
Outside of medicine, AlphaFold has assisted plant biology by helping to decipher the photosynthetic machinery that is critical to plant cell function.
Building on the success of AlphaFold 2, the team released AlphaFold 3 in 2024. The system can now predict not only single-protein structures but also how proteins interact with small molecules, RNA, DNA, and metal ions.

AlphaFold 2’s performance on the CASP14 dataset. Credit: John Jumper et al/Wikimedia Commons.
With the ability to model how proteins interact with other molecules, AlphaFold 3 can tackle more complex biological processes, such as the modulation of enzymatic activity and gene regulation.
Even with its successes, AlphaFold faces several key challenges.
According to current research, AlphaFold has difficulty modeling dynamic processes like protein folding pathways and the conformational changes essential for biological function. Predicting structures of multi-protein complexes and transient interactions also remains difficult due to their dynamic nature (PMC, 2024).
Furthermore, AlphaFold’s dependence on evolutionary sequence data constrains its accuracy for proteins with sparse homologs or novel folds. It may also predict dominant conformations from training data rather than alternate functional states.
To tackle these limitations, researchers are combining AlphaFold with experimental methods like cryo-electron microscopy and computational tools such as molecular dynamics simulations. These combined methods aim to give a fuller picture of protein behavior and interactions in physiological contexts.