This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.
Imagine you are a chef trying to predict exactly how a cake will taste if you swap out one ingredient (like changing sugar to honey) or add a new spice. In biology, scientists face a similar challenge: they want to predict how a living cell will react when you "tweak" it—either by turning a gene off (like removing an ingredient) or adding a chemical drug (like adding a spice).
This paper is a massive taste-test competition to see which "recipe books" (AI models) are best at predicting these changes.
Here is the story of what they found, explained simply:
1. The Great "AI Recipe" Debate
For a few years, scientists have been arguing about "Foundation Models." These are huge, super-smart AI systems trained on mountains of biological data (like reading millions of biology textbooks).
- The Skeptics said: "These giant AI models are overkill. A simple calculator or a basic chart works just as well."
- The Believers said: "No way! The AI has learned the secret language of life and can predict things simple tools can't."
The authors of this paper decided to settle the argument by testing over 600 different models to see who actually wins.
2. The Results: Not All AI is Created Equal
They ran the tests on two main types of "ingredients":
- Genetic Tweaks: Changing the cell's DNA (like swapping a gene).
- Chemical Tweaks: Adding drugs or chemicals (like adding a spice).
The Big Discovery:
It turns out, both sides were partially right.
- Some of the fancy AI models were indeed no better than a simple baseline. They were like a robot chef who forgot how to read the recipe.
- However, other models were amazing. They didn't just guess; they understood the deep connections between ingredients.
The Secret Sauce:
The best-performing models weren't the ones that just read DNA sequences or protein shapes. The winners were models trained on "Interaction Maps" (called Interactomes).
- Analogy: Imagine trying to predict what happens if you remove a player from a soccer team.
- A model that just looks at the player's stats (DNA) might guess wrong.
- A model that knows who passes the ball to whom (the interaction map) will know exactly how the team's strategy will collapse.
- The paper found that models trained on these "social networks" of proteins were the champions.
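To make the "interaction map" idea concrete, here is a toy sketch in Python. This is purely illustrative and not the paper's actual method; the protein names and the graph are made up. It just shows the core intuition: if you store who interacts with whom, you can immediately ask who is disrupted when one player is removed.

```python
# Toy "interactome": each made-up protein maps to the set of proteins
# it directly interacts with (an undirected graph stored as a dict).
interactome = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

def affected_by_knockout(graph, knocked_out):
    """Return the direct interaction partners disrupted when one
    protein is knocked out (sorted for readability)."""
    return sorted(graph.get(knocked_out, set()))

# Knocking out the "hub" protein D disrupts three partners at once,
# while a peripheral protein like E disrupts only one.
print(affected_by_knockout(interactome, "D"))
print(affected_by_knockout(interactome, "E"))
```

A model trained only on each protein's individual "stats" has no way to see this ripple effect; a model trained on the network does.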
3. The "Fine-Tuning" Trap
Usually, when you get a powerful AI, you "fine-tune" it (teach it specifically for your task). The researchers tried this, but they found a problem.
- Analogy: Imagine you have a genius chef who knows how to cook 10,000 dishes. You try to teach them to make one specific cake using only 50 samples of that cake.
- The Result: The chef gets confused and starts memorizing the 50 samples instead of learning the general rules. This is called overfitting.
- The Lesson: For biology, sometimes it's better to use the AI's "frozen" knowledge (the general rules it already learned) rather than trying to retrain it on small datasets.
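The overfitting trap above can be seen in miniature with a classic curve-fitting demo. This is not the paper's experiment, just a standard illustration: with only a handful of noisy samples, a flexible model (a degree-7 polynomial) memorizes the training points almost perfectly, yet its error on fresh data is far larger than its near-zero training error suggests.

```python
import numpy as np

# A small, noisy training set: the true rule is sin(2*pi*x),
# plus fixed alternating "measurement noise" of +/-0.1.
x_train = np.linspace(0, 1, 8)
noise = 0.1 * np.array([1, -1, 1, -1, 1, -1, 1, -1])
y_train = np.sin(2 * np.pi * x_train) + noise

# A large clean test set drawn from the true rule.
x_test = np.linspace(0, 1, 200)
y_test = np.sin(2 * np.pi * x_test)

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# "Simple" model: degree-3 polynomial. "Flexible" model: degree-7,
# which has enough capacity to pass through all 8 noisy points.
simple = np.polyfit(x_train, y_train, 3)
flexible = np.polyfit(x_train, y_train, 7)

train_simple = mse(np.polyval(simple, x_train), y_train)
train_flexible = mse(np.polyval(flexible, x_train), y_train)
test_simple = mse(np.polyval(simple, x_test), y_test)
test_flexible = mse(np.polyval(flexible, x_test), y_test)

# The flexible model drives training error toward zero (memorization),
# but that does not carry over to unseen data.
print(f"train: simple={train_simple:.4f}  flexible={train_flexible:.2e}")
print(f"test:  simple={test_simple:.4f}  flexible={test_flexible:.4f}")
```

The same logic explains the paper's finding: with only ~50 examples, fine-tuning a huge model tends toward memorization, while frozen features keep the general rules intact.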
4. The Power of Teamwork (Fusion)
The researchers realized that no single "recipe book" has all the answers. Some know about DNA, some know about protein shapes, and some know about how proteins talk to each other.
- They built a "Team Captain" model (an attention-based fusion model). This model acts like a conductor, listening to all the different AI experts and combining their opinions.
- The Outcome: When they combined the best AI models, the predictions became incredibly accurate. For some cell types, the AI predictions were so good they hit the "theoretical limit"—meaning they were as accurate as the experiments themselves (which always have some noise).
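The "Team Captain" idea can be sketched in a few lines. This is a hypothetical simplification, not the paper's architecture: each expert model produces a prediction vector, a score for each expert is turned into mixing weights with a softmax, and the fused prediction is the weighted combination. In a real system the scores would be learned; here they are hard-coded for illustration.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    shifted = scores - scores.max()  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def fuse(expert_outputs, scores):
    """Attention-style fusion: weight each expert's prediction
    by its softmax score and sum them."""
    weights = softmax(scores)
    return weights @ expert_outputs, weights

# Made-up predictions from three "experts" for three genes, e.g. a
# sequence model, an interactome model, and a structure model.
experts = np.array([
    [0.9, 0.1, 0.0],
    [1.1, -0.1, 0.2],
    [0.5, 0.0, 0.1],
])

# Higher score = the captain trusts that expert more (hard-coded here,
# learned from data in a real fusion model).
fused, weights = fuse(experts, scores=np.array([2.0, 3.0, 0.5]))
print("weights:", np.round(weights, 3))
print("fused prediction:", np.round(fused, 3))
```

The useful property is that the weights are data-dependent in a trained model: the captain can lean on the interactome expert for genetic tweaks and a different expert elsewhere.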
5. The Chemical Challenge
Predicting what happens when you add a drug (chemical) is much harder than predicting what happens when you change a gene.
- Analogy: Changing a gene is like swapping a specific Lego brick. Adding a drug is like throwing a handful of mystery dust into the mix; it might stick to one thing, or ten things, or nothing at all.
- The AI models struggled more here. While they could predict genetic changes very well, the "chemical" predictions were still a bit fuzzy. The paper suggests we need better "dictionaries" for drugs that explain how they interact with biology, not just their chemical structure.
The Bottom Line
This paper is a victory for the "Believers," but with a caveat.
- Yes, Foundation Models can revolutionize biology and help us design new drugs faster.
- But, you have to pick the right kind of AI (one that understands biological networks) and know when to use it without over-training it.
- Best Strategy: Don't rely on just one AI. Combine the best ones together, and you get a prediction engine that is nearly perfect.
In short: AI is ready to help us simulate life, but we need to give it the right maps and let the experts work together.