Imagine a language as a giant, bustling marketplace where every sound (like "b," "k," or "sh") is a different type of fruit. Some fruits are everywhere (like apples), while others are rare (like durians).
For a long time, linguists have noticed two strange rules about how these "fruits" are distributed in languages around the world:
- The "Rich Get Richer" Pattern: A few sounds are used constantly, while most are used very rarely. Rank the sounds from most to least used, and the graph shows a steep drop followed by a long tail.
- The "Balance" Mystery: Languages with many different sounds (a huge fruit market) tend to use them in a very predictable, orderly way. Languages with few sounds use them in a more chaotic, less predictable way. It's as if the universe is trying to "balance the books" between the number of sounds and how they are used.
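To make "predictable" versus "chaotic" a little more concrete, here is a minimal sketch of one common way to put a number on it: Shannon entropy, scaled so that 1.0 means "every sound is used equally often" and values near 0 mean "a few sounds dominate." Whether the paper uses exactly this measure is an assumption on our part, and the function name and the two toy markets are made up for illustration.

```python
import math

def normalized_entropy(counts):
    """Shannon entropy of a usage distribution, scaled to the range 0..1.

    This is an illustrative measure, not necessarily the paper's own.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single sound carries no uncertainty
    entropy = -sum(p * math.log2(p) for p in probs)
    # Divide by the maximum possible entropy for this many sounds.
    return entropy / math.log2(len(probs))

# Hypothetical toy markets: one where all fruits sell equally, one dominated by apples.
even_market = [25, 25, 25, 25]
skewed_market = [85, 5, 5, 5]

print(normalized_entropy(even_market))
print(normalized_entropy(skewed_market))
```

The evenly used market scores higher (more "chaotic" in this article's terms) than the one where a single sound dominates.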
The big question this paper asks is: Did these patterns happen because languages are smart and trying to optimize themselves? Or did they just happen by accident over thousands of years of history?
The authors, Fermín and Suchir, decided to find out by building a time machine for sounds.
The Time Machine Experiment
They created a computer simulation that acts like a "sound evolution" video game. They started with a bunch of imaginary languages, each with a standard set of sounds. Then, they let time run forward, introducing random "accidents" to the sounds, just like real history:
- Splitting: One sound accidentally breaks into two (like a cell dividing).
- Merging: Two sounds accidentally crash into one (like two cars merging into a single lane).
- Shifting: A sound changes slightly into another existing sound.
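The three accidents above can be sketched as small operations on a table of "how often is each sound used." This is only an illustration under our own assumptions: the function names, the `share` parameters, and the idea that a shift moves a fraction of one sound's usage to another are not taken from the paper.

```python
def split(inventory, sound, new_sound, share=0.5):
    """One sound breaks into two: a share of its usage goes to a brand-new sound."""
    moved = inventory[sound] * share
    inventory[sound] -= moved
    inventory[new_sound] = moved

def merge(inventory, loser, winner):
    """Two sounds collapse into one: the loser disappears, the winner absorbs its usage."""
    inventory[winner] += inventory.pop(loser)

def shift(inventory, source, target, share=0.1):
    """Some uses of one sound drift over to another sound that already exists."""
    moved = inventory[source] * share
    inventory[source] -= moved
    inventory[target] += moved

# Hypothetical toy language with three sounds and their usage counts.
market = {"b": 50.0, "k": 30.0, "sh": 20.0}

split(market, "k", "g")             # "k" splits, creating "g"
merge(market, "sh", "b")            # "sh" merges into "b"
shift(market, "b", "k", share=0.1)  # a little of "b" drifts into "k"

print(market)
```

Note that none of these operations changes the total amount of "speech" in the language. They only change how it is shared out among the sounds, and how many sounds there are.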
They ran three different versions of this game to see which one matched real human languages.
Level 1: The Naïve Chaos (The "Roll the Dice" Model)
The Setup: They let the sounds change completely at random. Every sound had an equal chance of splitting or merging, like rolling a die to decide what happens next.
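A rough sketch of what such a "roll the dice" run might look like is below. The 50/50 coin flip between splitting and merging, the starting size of 30, the random share handed over in a split, and the floor of two sounds are all our assumptions for the sake of a runnable example, not details confirmed by the paper.

```python
import random

def naive_step(counts, rng):
    """Apply one uniformly random split or merge to a list of usage counts."""
    if rng.random() < 0.5:
        # Split: a random sound hands a random share of its usage to a new sound.
        i = rng.randrange(len(counts))
        moved = counts[i] * rng.random()
        counts[i] -= moved
        counts.append(moved)
    elif len(counts) > 2:
        # Merge: a random sound is absorbed by another randomly chosen sound.
        loser = counts.pop(rng.randrange(len(counts)))
        counts[rng.randrange(len(counts))] += loser

def run_naive(n_sounds=30, steps=2000, seed=0):
    """Evolve one imaginary language and return its final usage counts."""
    rng = random.Random(seed)
    counts = [1.0] * n_sounds
    for _ in range(steps):
        naive_step(counts, rng)
    return counts

# Final inventory sizes for five imaginary languages with different random histories.
sizes = [len(run_naive(seed=s)) for s in range(5)]
print(sizes)
```

Because nothing here pulls the inventory size back toward any particular value, the number of sounds is free to wander like a random walk.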
The Result: The "fruit market" looked okay at first glance: it had that "rich get richer" curve. But it failed the second test. In this simulation, languages with more sounds ended up being more chaotic, not less, which is the opposite of real life. Also, the number of sounds in these languages kept growing or shrinking wildly, with no limit. It was like a market that kept adding new exotic fruits forever, or losing them until only two remained.
Level 2: The "Functional Load" Twist (The "Popular vs. Obscure" Model)
The Setup: The researchers added a rule based on reality: Rare sounds are more fragile. If a sound is rarely used, it's more likely to disappear or merge with another. Common sounds are "sturdy" and stick around.
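One simple way to express "rare sounds are more fragile" in code is to pick the sound that gets merged away with a weight that goes up as its usage goes down. The inverse-frequency weighting below is our own assumption, chosen because it is the simplest rule with that property. The paper may well use a different formula, and the function name is hypothetical.

```python
import random

def pick_sound_to_lose(counts, rng):
    """Pick the index of the sound to merge away, favoring rarely used sounds."""
    # Assumed rule: weight is inversely proportional to usage.
    # The tiny constant avoids dividing by zero for unused sounds.
    weights = [1.0 / (c + 1e-9) for c in counts]
    return rng.choices(range(len(counts)), weights=weights, k=1)[0]

rng = random.Random(0)
counts = [90.0, 9.0, 1.0]  # a common sound, a middling one, and a rare one

picks = [pick_sound_to_lose(counts, rng) for _ in range(10_000)]
print([picks.count(i) for i in range(len(counts))])
```

In this toy run the rare sound is chosen for merging far more often than the common one, which is exactly the "sturdy versus fragile" behavior described above.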
The Result: This made the "fruit market" even more skewed. The common fruits became super common, and the rare ones vanished. But it still failed the second test. The relationship between the number of sounds and the "chaos" of the distribution was still wrong, and the languages were still drifting apart in size without any limit.
Level 3: The "Goldilocks" Stabilizer (The Winning Model)
The Setup: The researchers realized that real languages don't have infinite sounds or just two sounds. They tend to hover around a "sweet spot" (about 30–40 sounds for most languages). They added a rule: If a language gets too big, it's harder to add new sounds. If it gets too small, it's harder to lose sounds. It's like a thermostat that tries to keep the room at a comfortable temperature.
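The thermostat idea can be sketched by making the chance of a split depend on the current number of sounds. In the sketch below, the target of 35 is simply the middle of the range quoted above. The linear formula, the `strength` value, and the function names are our assumptions, not the paper's actual rule.

```python
import random

def split_probability(size, target=35, strength=0.02):
    """Chance that the next event is a split; it falls as the inventory grows."""
    p = 0.5 - strength * (size - target)
    return min(1.0, max(0.0, p))  # keep the result a valid probability

def simulate_size(steps=5000, start=35, seed=0):
    """Track only the number of sounds over time under the thermostat rule."""
    rng = random.Random(seed)
    size = start
    history = []
    for _ in range(steps):
        if rng.random() < split_probability(size):
            size += 1   # a split adds a sound
        elif size > 2:
            size -= 1   # a merge removes one
        history.append(size)
    return history

history = simulate_size()
print(min(history), max(history))
```

Above the target, merges become more likely than splits, and below it the reverse is true, so the inventory size keeps getting pulled back toward the middle instead of wandering off.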
The Result: Bingo! This simple rule fixed everything.
- The languages settled into a realistic size range (no more infinite markets).
- The "fruit distribution" looked just like that of real languages.
- Crucially: It recreated the "Balance Mystery." Languages with more sounds naturally became more predictable, and languages with fewer sounds were more chaotic.
The Big Takeaway: Accidental Order
The most exciting part of this discovery is why it happened.
The authors found that they didn't need to program the languages to be "smart" or "efficient." They didn't need a rule saying, "Hey, if you have too many sounds, you must organize them better!"
Instead, the "balance" emerged naturally as a side effect. It's like a crowded dance floor:
- If the room is small (few sounds), people bump into each other constantly (high chaos).
- If the room is huge (many sounds), people spread out and move more predictably (low chaos).
The "compensation" we see in languages (the Balance Mystery from earlier) isn't necessarily a conscious effort by speakers to optimize their speech. It's just the natural result of sounds evolving randomly over time, while being gently nudged back toward a comfortable, average size.
In short: The universe didn't need a master planner to organize the sounds of human language. It just needed a little bit of random history and a gentle "thermostat" to keep things from getting too wild. The patterns we see are the natural footprints of time.