
Published 2026-06-09


CC BY 4.0

Original authors: Changze Lv, Jiang Zhou, Siyu Long, Lihao Wang, Jiangtao Feng, Dongyu Xue, Yu Pei, Hao Wang, Zherui Zhang, Yuchen Cai, Zhiqiang Gao, Ziyuan Ma, Jiakai Hu, Chaochen Gao, Jingjing Gong, Yuxuan Song, Shuyi Zhang, Xiaoqing Zheng, Deyi Xiong, Lei Bai, Wanli Ouyang, Ya-Qin Zhang, Wei-Ying Ma, Bowen Zhou, Hao Zhou

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). ⚕️ This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

Imagine you are trying to teach a computer how to design new proteins. Proteins are the tiny, complex machines inside our bodies that do everything from digesting food to fighting viruses. Designing a new one is like trying to invent a new key that fits a lock you've never seen before, but you have to get the shape of the key perfect, or it won't work.

The paper introduces AMix-1, a new AI model designed specifically for this task. The authors didn't just build a bigger model; they built a more systematic way of training and using it. Here is how they did it, explained through simple analogies:

1. The Engine: A "Denoising" Radio

Most AI models try to learn by reading clear text. AMix-1 is built on something called a Bayesian Flow Network. Think of this like a radio that starts out with only static noise.

How it works: The AI starts with a completely scrambled, noisy version of a protein sequence. It then tries to "clean up" the noise, step-by-step, to reveal the clear protein underneath.
The Analogy: Imagine trying to guess a song by listening to a very fuzzy radio station. At first, you only hear static. As the signal gets clearer (less noise), you start to hear the melody, then the lyrics. AMix-1 learns by practicing this "cleaning" process over and over until it can reliably recover a plausible protein sequence starting from pure noise.

2. The Roadmap: Scaling Laws (The "Size Matters" Rule)

The researchers wanted to know: "If we make the AI bigger and give it more data, will it get better?"

The Finding: They discovered a predictable scaling law for this AI. As model size, training data, and computing power increase, AMix-1's training loss falls along a smooth curve, provided the noise level used during training is moderate.
The Metaphor: It's like climbing a mountain with a good trail map. The researchers charted the trail well enough that they could forecast how high the AI would climb (how good it would get) from how much computing power they put in. They used this map to build their biggest version, AMix-1-1.7B, which has 1.7 billion "brain cells" (parameters).

3. The "Aha!" Moment: Emergent Abilities

Here is the most exciting part. The researchers found that the AI doesn't just get slightly better as it grows; it suddenly starts understanding things it didn't know before.

The Analogy: Imagine a student learning math. At first, they just memorize numbers. Then, suddenly, after enough practice, they "get it"—they understand the logic behind the numbers and can solve problems they've never seen.
The Result: As AMix-1 trained, it suddenly started understanding protein shapes (structure), even though it was only trained on the letters (sequence) of the protein. It didn't need to be explicitly taught geometry; it figured out the 3D shapes just by reading the sequence enough times.

4. The Shortcut: In-Context Learning (The "Show, Don't Tell" Trick)

Usually, to teach an AI a new task, you have to retrain or fine-tune it on data for that task. AMix-1 is different. It can take on a new design job just by looking at a few examples, without changing its brain.

The Analogy: Think of a master chef. If you want them to cook a specific regional dish, you don't need to send them to culinary school again. You just show them a few photos of the dish and say, "Make something like this." The chef uses their existing knowledge to figure it out.
How AMix-1 does it: The researchers give the AI a "family album" of similar proteins (called a Multiple Sequence Alignment). The AI looks at the patterns in that album and generates a new protein that fits right in, keeping the right shape and function.

5. The Real-World Test: The "Super-Switch"

The team didn't just run simulations; they tested this in a real lab.

The Challenge: They wanted to improve a protein called AmeR, which acts as a switch in genetic circuits. The wild-type (natural) version was okay, but they wanted it to be much stronger.
The Result: Using AMix-1's "show, don't tell" method, they designed new versions of AmeR. When tested in the lab, the best one was 50 times more active (measured as fold repression) than the original. That is a large jump, and it shows the AI can design real, working biological tools.

6. The Evolutionary Loop: Test-Time Scaling (The "Trial and Error" Engine)

Finally, they created a system called EvoAMix-1 to make the AI even better while it's working, without retraining it.

The Analogy: Imagine a sculptor who makes 100 clay statues. A judge picks the best 5. The sculptor then uses those 5 winners as a new "mold" to make the next batch of 100. They repeat this process, getting better and better with every round, without ever changing the sculptor's hands.
How it works: The AI generates many protein candidates. A "verifier" (a computer program or a lab test) picks the best ones. The AI then uses those winners as new examples to generate even better ones in the next round.
The Benefit: The more "checks" (budget) you give this system, the better the results get. It keeps improving as long as you let it keep trying.

Summary

AMix-1 is a new kind of protein designer that:

Learns by cleaning up noise (like tuning a radio).
Follows a predictable map where bigger is better.
Suddenly "understands" 3D shapes just by reading sequences.
Can adapt to new design tasks from a few examples, without retraining.
Produced a protein variant that, in a real lab, was 50x more active than the natural (wild-type) version.
Can keep getting better through an evolutionary loop of trial and error.

The paper claims this is a major step toward a "lab-in-the-loop" future, where AI and real-world experiments work together to design life-saving proteins faster than ever before.

Technical Summary: AMix-1

Problem Statement

Current protein foundation models (e.g., AlphaFold, ESM) have advanced specific tasks like structure prediction and inverse folding but lack a unified, scalable methodology comparable to Large Language Models (LLMs). Unlike LLMs, which demonstrate consistent performance improvements through scaling laws, emergent capabilities, in-context learning, and test-time scaling, protein models have not yet established a systematic framework that leverages these principles to achieve scalable, high-fidelity protein design. The central challenge is to craft a protein foundation model that can systematically trade computational resources for performance, exhibit emergent structural understanding, and adapt to diverse design tasks without task-specific fine-tuning.

Methodology

The authors propose AMix-1, a protein foundation model built upon Bayesian Flow Networks (BFNs) and trained using a systematic pathway comprising four pillars:

1. Model Architecture and Training

Generative Framework: AMix-1 utilizes BFNs, which model the continuous parameters of protein sequence distributions via iterative Bayesian updates rather than directly modeling discrete amino acids. The process involves a sender distribution (the data perturbed by noise) and a receiver distribution (produced by a neural network $\phi$); the network is trained to minimize the Kullback-Leibler (KL) divergence between the two.
Architecture: The model series is based on an encoder-only Transformer with Rotary Position Embeddings (RoPE). The authors scaled the model from 8 million to 1.7 billion parameters.
Pretraining: Training was conducted on the UniRef50 dataset (41.5 million sequences) using a mixed-precision (BF16) setup. The noise level $\alpha$ is controlled by a schedule $\beta(t) = \beta_1 t^2$.
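To make the sender side concrete, here is a minimal NumPy sketch of how a BFN for discrete data can turn a clean sequence into noisy categorical parameters at time $t$. It follows the general discrete-data BFN recipe (a Gaussian message over $K$ classes followed by a softmax) and is not AMix-1's actual code. The 20-letter alphabet, the value `beta_1 = 4.0`, and the function name `noisy_parameters` are assumptions made for illustration only.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet
K = len(AMINO_ACIDS)


def noisy_parameters(seq, t, beta_1=4.0, rng=None):
    """Sample categorical parameters theta for `seq` at time t in [0, 1].

    Uses the schedule beta(t) = beta_1 * t**2 mentioned in the text.
    beta_1 = 4.0 is an illustrative value, not the paper's setting.
    """
    rng = np.random.default_rng() if rng is None else rng
    beta = beta_1 * t ** 2
    idx = np.array([AMINO_ACIDS.index(a) for a in seq])
    one_hot = np.eye(K)[idx]                    # shape (L, K)
    mean = beta * (K * one_hot - 1.0)           # pushes mass toward the true residue
    std = np.sqrt(beta * K)
    y = rng.normal(loc=mean, scale=std)         # noisy Gaussian "message"
    y -= y.max(axis=1, keepdims=True)           # stabilise the softmax
    theta = np.exp(y)
    return theta / theta.sum(axis=1, keepdims=True)


rng = np.random.default_rng(0)
seq = "MKTAYIAKQR"                              # hypothetical toy sequence
theta_start = noisy_parameters(seq, t=0.0, rng=rng)  # no information yet
theta_end = noisy_parameters(seq, t=1.0, rng=rng)    # concentrated on the true residues
decoded = "".join(AMINO_ACIDS[i] for i in theta_end.argmax(axis=1))
```

At `t = 0` every position is a uniform distribution over amino acids (pure "static"); as `t` grows, the parameters sharpen around the true residues. A network trained on such pairs learns to predict the clean sequence from the noisy parameters.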

2. Scaling Laws and Emergent Ability

Predictive Scaling: The authors established a predictive scaling law linking cross-entropy loss ($L$) to computational cost (FLOPs), model size ($N$), and data tokens ($D$). They found that under moderate noise levels ($\alpha = 0.16, 0.32$), the loss follows a power-law relationship $L(F) = E_0 + e^{a \cdot F^b}$. Extreme noise levels ($\alpha = 0.08, 0.64$) resulted in unscalable behavior.
Emergent Structural Understanding: The study reveals that structural understanding (measured by pLDDT and TM-score) emerges progressively as the training loss decreases below a critical threshold. This capability arises without explicit structural supervision, driven purely by sequence-level objectives and sufficient compute.
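The idea of a predictive scaling law can be illustrated with a short fitting sketch: measure loss on a few small runs, fit a curve, and extrapolate to a larger compute budget. The example below assumes a generic saturating power law, $L(F) = E_0 + A F^{-b}$, chosen for simplicity; it is not necessarily the parameterization used in the paper. The data are synthetic, and the names `fit_power_law` and `predict_loss` are hypothetical.

```python
import numpy as np


def fit_power_law(flops, loss, e0):
    """Fit loss = e0 + A * flops**(-b) by linear regression in log space.

    e0 (the irreducible loss) is assumed known here to keep the sketch
    short; in practice it would be fitted as well. Returns (A, b).
    """
    flops = np.asarray(flops, dtype=float)
    loss = np.asarray(loss, dtype=float)
    slope, intercept = np.polyfit(np.log(flops), np.log(loss - e0), deg=1)
    return float(np.exp(intercept)), float(-slope)


def predict_loss(flops, e0, A, b):
    """Extrapolate the fitted curve to a new compute budget."""
    return e0 + A * np.asarray(flops, dtype=float) ** (-b)


# Synthetic measurements, made up for illustration (not from the paper).
true_e0, true_A, true_b = 2.0, 50.0, 0.1
small_runs = np.array([1e17, 1e18, 1e19, 1e20])
observed = true_e0 + true_A * small_runs ** (-true_b)

A, b = fit_power_law(small_runs, observed, e0=true_e0)
forecast = float(predict_loss(1e22, true_e0, A, b))
```

The design point is that the fit uses only cheap runs, while `forecast` is evaluated at a budget two orders of magnitude larger, which is how a scaling law lets one plan a large training run before paying for it.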

3. In-Context Learning (ICL)

Mechanism: AMix-1 unifies protein design by treating Multiple Sequence Alignments (MSAs) as prompts. Instead of fine-tuning, the model conditions on position-wise frequency profiles derived from MSAs.
Process: The MSA is compressed into a profile $P$ (a categorical distribution over amino acids at each position). AMix-1 then generates sequences $x$ conditioned on $P$, preserving evolutionary constraints, structural integrity, and functional relevance.
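As a rough illustration of the compression step, the sketch below turns a toy alignment into a position-wise frequency profile $P$. How the paper treats gaps and unknown residues is not stated here, so skipping them is an assumption of this sketch, as are the 20-letter alphabet, the `pseudocount` parameter, and the function name `msa_to_profile`.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet
AA_INDEX = {a: i for i, a in enumerate(AMINO_ACIDS)}


def msa_to_profile(msa, pseudocount=0.0):
    """Return an (L, 20) array of per-position amino-acid frequencies.

    Gaps ('-') and unknown symbols are skipped (an assumption of this
    sketch). Columns with no counted residues fall back to uniform.
    """
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all aligned sequences must have the same length")
    counts = np.full((length, len(AMINO_ACIDS)), pseudocount, dtype=float)
    for row in msa:
        for pos, aa in enumerate(row):
            if aa in AA_INDEX:
                counts[pos, AA_INDEX[aa]] += 1.0
    totals = counts.sum(axis=1, keepdims=True)
    safe_totals = np.where(totals > 0, totals, 1.0)   # avoid division by zero
    uniform = np.full_like(counts, 1.0 / len(AMINO_ACIDS))
    return np.where(totals > 0, counts / safe_totals, uniform)


toy_msa = ["MKTA", "MKSA", "MRTA", "M-TA"]  # hypothetical aligned family
profile = msa_to_profile(toy_msa)
```

In this toy family the first column is always `M`, so its row in `profile` puts all mass on `M`, while the second column splits between `K` and `R`. A conserved column therefore constrains generation strongly, and a variable one leaves room for diversity.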

4. Test-Time Scaling (TTS)

EvoAMix-1: To further enhance performance, the authors introduced an evolutionary test-time scaling algorithm. This framework operates as a proposer-verifier loop:
1. Propose: AMix-1 generates a batch of candidate sequences based on an initial MSA profile.
2. Verify: An external verifier (in silico metric or wet-lab assay) scores the candidates.
3. Update: The top-$k$ high-fitness variants are used to construct a refined profile (pseudo-MSA), which serves as the prompt for the next iteration.
Distinction: Unlike traditional methods that update model parameters (gradient-driven), EvoAMix-1 updates the condition (prompt) while keeping the model weights frozen (condition-driven), allowing for plug-and-play integration of diverse verification metrics.
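The propose, verify, update loop described above can be sketched in a few lines. This is a toy under stated assumptions, not the authors' implementation: the proposer simply samples each position independently from the profile as a stand-in for AMix-1, the verifier is a made-up matching score, and all names (`build_profile`, `propose`, `evolve`, `toy_verifier`) and default settings are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed 20-letter alphabet
K = len(AMINO_ACIDS)


def build_profile(seqs, pseudocount=0.5):
    """Position-wise frequency profile (pseudo-MSA) from equal-length sequences."""
    counts = np.full((len(seqs[0]), K), pseudocount)
    for s in seqs:
        for pos, aa in enumerate(s):
            counts[pos, AMINO_ACIDS.index(aa)] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def propose(profile, n, rng):
    """Stand-in for AMix-1: sample each position independently from the profile."""
    return ["".join(AMINO_ACIDS[rng.choice(K, p=row)] for row in profile)
            for _ in range(n)]


def evolve(seed_seqs, verifier, rounds=5, batch=50, top_k=5, seed=0):
    """Proposer-verifier loop that updates only the prompt, never any weights."""
    rng = np.random.default_rng(seed)
    profile = build_profile(seed_seqs)
    best_seq = max(seed_seqs, key=verifier)
    history = [verifier(best_seq)]
    for _ in range(rounds):
        candidates = propose(profile, batch, rng)                # 1. propose
        ranked = sorted(candidates, key=verifier, reverse=True)  # 2. verify
        elites = ranked[:top_k]
        if verifier(elites[0]) > verifier(best_seq):
            best_seq = elites[0]
        profile = build_profile(elites)                          # 3. update prompt
        history.append(verifier(best_seq))
    return best_seq, history


def toy_verifier(seq, target="MKTAYIAK"):
    """Hypothetical fitness: number of positions matching a fixed target."""
    return sum(a == b for a, b in zip(seq, target))


best, scores = evolve(["MKSAYLAR", "MRTAWIAK", "AKTAYIGK"], toy_verifier)
```

Because the verifier is only ever called as a black-box scoring function, it could in principle be swapped for a structure predictor or a wet-lab measurement without touching the proposer, which is the plug-and-play property the text describes. The number of verifier calls (`rounds * batch`) plays the role of the verification budget.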

Key Results

Scaling and Emergence

Model Scale: A 1.7-billion parameter model (AMix-1-1.7B) was successfully trained, demonstrating predictable scaling laws.
Emergence: Structural metrics (TM-score, pLDDT) showed a sharp increase (emergence) once cross-entropy loss dropped below a specific threshold, particularly at moderate noise levels ($\alpha=0.16$). Extreme noise levels suppressed this emergence.

In-Context Learning Performance

In Silico: AMix-1 successfully generated proteins with high structural consistency (TM-score > 0.9) and functional specificity (high EC number confidence via CLEAN, preserved catalytic activity via CLIPZyme) across diverse scenarios, including orphan proteins and random homologs, without fine-tuning.
Wet-Lab Validation: The model was used to design variants of the AmeR transcriptional repressor. The best AMix-1-designed variant achieved a 50-fold increase in activity (fold repression) compared to the wild type, outperforming both single-mutant libraries and the state-of-the-art directed evolution method EvoAI.

Test-Time Scaling (EvoAMix-1)

Scalability: Across six in silico directed evolution benchmarks (structural, biophysical, and functional tasks), EvoAMix-1 demonstrated monotonic performance gains as the verification budget (number of verifier calls) increased.
Comparison: EvoAMix-1 outperformed or matched state-of-the-art baselines (ALDE, EVOLVEpro, MLDE) in 5 out of 6 tasks. It avoided the performance plateaus seen in methods with restricted mutation schemes, enabling broader exploration of the sequence space.

Significance and Claims

The paper claims to establish a systematic paradigm for protein foundation models that mirrors the success of LLMs. Its primary contributions are:

Methodological Unification: It introduces a roadmap for scalable protein design grounded in scaling laws, emergent capability analysis, in-context learning, and test-time scaling.
BFN Characterization: It is the first to characterize scaling laws and emergent abilities specifically for Bayesian Flow Networks in the context of protein modeling, identifying the critical role of noise scheduling.
Unified Design Framework: By leveraging MSA-based in-context learning, AMix-1 unifies structure- and function-guided design into a single framework without requiring task-specific fine-tuning.
Real-World Impact: The successful engineering of a high-activity AmeR variant (50x improvement) demonstrates the model's potential for real-world functional protein engineering.
Lab-in-the-Loop Potential: The EvoAMix-1 algorithm lays the groundwork for "lab-in-the-loop" protein design, where computational models can iteratively improve designs based on experimental feedback without retraining.

The authors acknowledge limitations, noting that AMix-1 is currently sequence-based (ignoring explicit structural data during training) and that test-time scaling has only been validated on simulated tasks, though they view these as immediate targets for future work.

AMix-1: A Pathway to Test-Time Scalable Protein Foundation Model