Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.
Imagine you are trying to count every single possible Lego castle you could ever build. You might think, "Well, there are so many ways to snap bricks together that the number is basically infinite." Scientists have tried to guess this number before, often saying there are about 10 to the power of 60 (a 1 followed by 60 zeros) "drug-like" molecules. But those guesses had a flaw: they counted every possible combination of bricks, even the ones that would fall apart immediately or make no sense physically. They didn't ask, "How hard is it to actually build this?"
This paper introduces a new way to count the universe of possible molecules using a concept called Assembly Theory. Think of it not just as counting the final castle, but as counting the minimum number of steps required to build it.
Here is the breakdown of their findings using simple analogies:
1. The "Instruction Manual" Metric
Imagine you have a specific molecule. To build it, you need a set of instructions.
- The Old Way: Just count how many atoms are in the molecule.
- The New Way (Assembly Theory): Count the minimum number of "snap-together" moves needed to construct it from scratch.
- If you have a long chain of identical beads, you can build it quickly by making a small chunk once and then reusing it: join two chunks, then join two of those bigger pieces, and so on. This is a "low complexity" object.
- If you have a molecule where every single part is unique and you have to attach them one by one, that takes many more steps. This is a "high complexity" object.
The authors call this number of steps the Assembly Index. It's like a "difficulty rating" for building a molecule.
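To make the step-counting idea concrete, here is a minimal sketch that applies it to strings of letters instead of molecules. This is a toy analogy and not the paper's algorithm, which works on bonds in molecular structures. The function name `assembly_index` and the rule "a step joins any two pieces you already have" are assumptions chosen for illustration.

```python
from itertools import product


def assembly_index(target: str) -> int:
    """Toy string version: minimum number of join steps needed to build
    `target` from single letters, where every piece already built can be
    reused for free. Brute force, so only suitable for short strings."""
    if len(target) <= 1:
        return 0
    letters = frozenset(target)  # the basic "bricks"

    def reachable(pool: frozenset, steps_left: int) -> bool:
        if target in pool:
            return True
        if steps_left == 0:
            return False
        for left, right in product(pool, repeat=2):
            piece = left + right
            # A piece that never occurs inside the target is a wasted step.
            if piece not in pool and piece in target:
                if reachable(pool | {piece}, steps_left - 1):
                    return True
        return False

    # Try 1 step, then 2, ... so the first success is the minimum.
    steps = 1
    while not reachable(letters, steps):
        steps += 1
    return steps


print(assembly_index("aaaaaaaa"))  # 3: a+a, aa+aa, aaaa+aaaa
print(assembly_index("abcdef"))    # 5: nothing repeats, so nothing can be reused
```

Both strings are short, yet the repetitive eight-letter one needs fewer steps than the six-letter one with no repeats, which is the sense in which the index measures "difficulty" and not size.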
2. The "Lego Universe" vs. The "Real World"
The paper distinguishes between two spaces:
- The Assembly Universe: This is the theoretical space of every possible shape you could make with Lego bricks, even if the shape is unstable or impossible to hold together in real life.
- Chemical Space: This is the "Real World" subset. It only includes molecules that are physically stable and can actually exist (like the ones in the GDB-13 database, which contains nearly 1 billion computationally enumerated small, drug-like molecules of up to 13 heavy atoms).
The researchers used the GDB-13 database as a map to see how big the "Real World" chemical space actually is.
3. How Fast Does the Space Grow?
The big question was: As the "difficulty rating" (Assembly Index) goes up, how fast does the number of possible molecules explode?
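- The Finding: The number of possible molecules grows extremely fast as the Assembly Index rises.
- It grows faster than a standard exponential curve (like compound interest, where the total is multiplied by the same fixed factor at every step).
- It grows at a rate somewhere between "super-exponential" (the multiplier itself keeps getting bigger) and "double-exponential" (an exponential whose exponent is also growing exponentially).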
- The Analogy: If you imagine the number of molecules as a balloon, standard growth is like blowing it up slowly. This paper suggests the balloon is inflating so fast it's practically exploding.
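For a feel of how far apart these growth classes are, the sketch below prints how many digits three textbook functions have as n increases. The functions (2^n for exponential, n! for super-exponential, 2^(2^n) for double-exponential) are stand-ins picked for illustration and are not the curves fitted in the paper.

```python
import math


def log10_exponential(n: int) -> float:
    """log10 of 2**n (plain exponential growth)."""
    return n * math.log10(2)


def log10_factorial(n: int) -> float:
    """log10 of n! (a standard super-exponential example)."""
    return math.lgamma(n + 1) / math.log(10)


def log10_double_exponential(n: int) -> float:
    """log10 of 2**(2**n) (double-exponential growth)."""
    return (2 ** n) * math.log10(2)


print(" n   log10(2^n)   log10(n!)   log10(2^(2^n))")
for n in (5, 10, 15, 20, 25):
    print(
        f"{n:>2}  {log10_exponential(n):11.1f}  "
        f"{log10_factorial(n):10.1f}  {log10_double_exponential(n):15.1f}"
    )
```

Each column is a count of decimal digits, so a value of 25 means a 26-digit number. A growth rate "between" the last two columns still leaves an enormous range, which is why the exact count depends so strongly on the index chosen.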
4. The "Filter" Effect
The paper also looked at what happens when you put "filters" on the Lego set.
- No Rings: If you only allow chains and branching trees of atoms (no loops), the space grows in a specific way.
- With Rings: If you allow atoms to form loops (rings), the molecules tend to be more "symmetric" (easier to build by copying parts), which changes how the space grows.
- Specific Motifs: If you demand a molecule must have a specific shape (like a square ring), the space shrinks, but it's still astronomically huge.
5. The Final Count
When the researchers applied all the standard rules for "drug-like" molecules (things like: must be under a certain weight, must be stable, must have specific types of atoms) and looked at molecules with an Assembly Index of 25, they calculated the size of this space.
The Result: There are approximately 10 to the power of 117 possible molecules.
To put that in perspective:
- The previous estimate was 10^60.
- The new estimate is 10^117.
- That is a number so large it dwarfs the number of atoms in the entire observable universe, which is commonly estimated at around 10^80.
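The gap between these figures is easier to grasp as plain arithmetic. The short sketch below uses Python's exact integers. The 10^60 and 10^117 values come from the text above, while the 10^80 atom count is a commonly quoted rough estimate and is treated here as an assumption.

```python
previous_estimate = 10 ** 60   # earlier "drug-like" estimate quoted above
new_estimate = 10 ** 117       # estimate at Assembly Index 25 quoted above
atoms_in_universe = 10 ** 80   # assumption: common order-of-magnitude figure


def power_of_ten(n: int) -> int:
    """Exponent k for an exact power of ten n == 10**k."""
    return len(str(n)) - 1


ratio_to_previous = new_estimate // previous_estimate
ratio_to_atoms = new_estimate // atoms_in_universe

print(f"New vs. previous estimate: 10^{power_of_ten(ratio_to_previous)} times larger")
print(f"Molecules per atom in the universe: 10^{power_of_ten(ratio_to_atoms)}")
```

In other words, the new estimate is larger than the old one by a factor of 10^57, and there would be about 10^37 candidate molecules for every atom in the observable universe.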
Summary
The paper argues that the "universe of possible molecules" is not just big; it is mind-bogglingly vast, and it grows at a terrifyingly fast rate as complexity increases. By using a "step-counting" method (Assembly Theory) instead of just counting atoms, they found that even with strict rules for what makes a good drug, the number of possibilities is roughly 10^117. This suggests that finding a specific, useful molecule in this ocean of possibilities is an incredibly difficult task, simply because the ocean is so much bigger than we previously thought.