Several multiple sequence alignment perturbation methods enhance AlphaFold3 sampling of alternative protein states

This study demonstrates that multiple sequence alignment perturbation strategies significantly enhance AlphaFold3's ability to sample alternative protein conformational states, often outperforming AlphaFold2 and matching the BioEmu model.

Eriksson Lidbrink, S., Nissen, I., Ahrlind, J. K., Howard, R. J., Lindahl, E.

Published 2026-04-03

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are trying to predict the shape of a complex, squishy toy (a protein) that can twist and turn into different poses to do its job. For a long time, the best AI tool we had, AlphaFold2, was like a very smart photographer who could take a perfect picture of the toy in its "resting pose," but it always forgot to take pictures of the toy when it was stretching, dancing, or working. It only gave you one photo.

Then, AlphaFold3 arrived. It's like a newer, more advanced camera that theoretically knows how to take a whole series of photos showing the toy in all its different poses. But, in practice, it still tended to just take the same "resting pose" photo over and over again, missing the action shots.

This paper is about a team of scientists who asked: "How can we trick AlphaFold3 into taking those missing action photos?"

The Problem: The "Echo Chamber"

Think of the data AlphaFold uses, a multiple sequence alignment (MSA), as a massive library of instructions written by thousands of different species over millions of years. Usually, the library is so loud and crowded with instructions for the "resting pose" that the AI can't hear the quiet whispers about the other poses. It gets stuck in an echo chamber, only seeing what it already expects.

The Solution: The "Noise" Tactics

The researchers tried three different ways to create "noise" in the library to force the AI to look elsewhere. They used creative metaphors for these methods:

  1. The "Crowd Control" (Stochastic Subsampling): Imagine the library has 1,000 people shouting instructions. The AI listens to all of them and gets overwhelmed by the loudest voice (the resting pose). The scientists instead picked a small random handful, say 10 people, and let only them speak. With fewer voices, the dominant "resting pose" instructions get quieter, allowing the AI to hear the quieter instructions for the "dancing pose."
  2. The "Grouping Game" (Clustering): Instead of listening to everyone at once, they sorted the 1,000 people into different groups based on how similar they sounded. They then asked the AI to listen to just one group at a time. Maybe Group A only knows the resting pose, but Group B knows the dancing pose. By separating them, the AI gets a fresh perspective.
  3. The "Blindfold" (Column Masking): This was the most interesting trick. Imagine the instructions are written in columns of letters. The scientists took a marker and covered up (masked) random letters in the instructions with a generic "X".
    • The Magic: When they used a standard "X" (unknown), it helped a bit. But they discovered that if they used a specific letter, like "F" (Phenylalanine), to cover the instructions, it sometimes acted like a secret key. It forced the AI to reconstruct the protein in a completely different shape, revealing a pose it had never seen before.
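
In plain code terms, all three tricks are simple transformations of the MSA before it is fed to the model. Below is a minimal Python sketch of the three ideas, assuming an MSA is just a list of equal-length aligned sequence strings; the function names, parameters, and the toy clustering scheme are illustrative, not the paper's actual implementation.

```python
import random

# Hypothetical sketch of the three MSA perturbation ideas described above.
# An MSA is represented as a list of equal-length aligned sequence strings
# with the query sequence first. All names here are illustrative.

def subsample_msa(msa, n_keep, seed=None):
    """Crowd control: keep only a small random subset of sequences
    (always retaining the query at position 0)."""
    rng = random.Random(seed)
    query, rest = msa[0], msa[1:]
    return [query] + rng.sample(rest, min(n_keep, len(rest)))

def cluster_msa(msa, n_clusters, seed=None):
    """Grouping game: a crude stand-in for sequence clustering.
    Each sequence joins the cluster whose randomly chosen seed
    sequence it matches best by per-column identity."""
    rng = random.Random(seed)
    seeds = rng.sample(msa, n_clusters)
    clusters = [[] for _ in seeds]
    def identity(a, b):
        return sum(x == y for x, y in zip(a, b)) / len(a)
    for seq in msa:
        best = max(range(n_clusters), key=lambda i: identity(seq, seeds[i]))
        clusters[best].append(seq)
    return clusters

def mask_columns(msa, frac, mask_char="X", seed=None):
    """Blindfold: overwrite a random fraction of alignment columns with a
    fixed character: 'X' for unknown, or e.g. 'F' for phenylalanine."""
    rng = random.Random(seed)
    n_cols = len(msa[0])
    masked = set(rng.sample(range(n_cols), int(frac * n_cols)))
    return ["".join(mask_char if i in masked else c
                    for i, c in enumerate(seq)) for seq in msa]

# Toy usage on a tiny fake alignment.
msa = ["MKTAYF", "MKSAYF", "MRTAYW", "MKTGYF"]
small = subsample_msa(msa, n_keep=2, seed=0)
clusters = cluster_msa(msa, n_clusters=2, seed=1)
masked = mask_columns(msa, frac=0.5, mask_char="F", seed=0)
```

Each perturbed MSA would then be handed to AlphaFold3 as input, and the structure predictions across many perturbed copies are pooled to look for alternative conformations.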

The Results: A New World of Shapes

The team tested these tricks on over 100 different proteins. Here is what they found:

  • AlphaFold3 is already great: Even without any tricks, the new AI was much better at seeing different shapes than the old AlphaFold2. It was like upgrading from a black-and-white camera to a 4K color camera.
  • The tricks make it even better: Using the "Crowd Control" and "Blindfold" methods helped the AI find even more of the missing poses. In about 20% of cases, these tricks were the difference between finding a pose and missing it entirely.
  • The "F" Mask Surprise: In one specific case (an RNA helicase, which is like a molecular zipper), the standard "X" blindfold failed completely. But when they used the "F" blindfold, the AI suddenly found the "apo" state (the empty state), which it had completely ignored before. It's like trying to find a hidden door in a house; sometimes you need to knock on the wall with a specific rhythm (the "F" mask) to hear the click.
  • Beating the Competition: They compared their method to another AI called BioEmu, which was specifically trained to guess all possible shapes. Surprisingly, the simple "noise" tricks applied to AlphaFold3 worked just as well, and sometimes better, than this specialized competitor.

Why Does This Matter?

Proteins are like machines that need to move to work. If you only know what a machine looks like when it's turned off, you can't fix it or build a better version of it.

By using these simple "noise" tricks, scientists can now use AlphaFold3 to generate a movie of a protein's life rather than just a single snapshot. This helps drug designers understand how proteins move, potentially leading to better medicines that can target specific moments in a protein's dance.

In short: The researchers found that by intentionally "messing up" the data AlphaFold3 reads, they can actually help it see the full picture, revealing the hidden, dynamic shapes of life's building blocks.
