Overdispersed and Markovian Children

This paper challenges the common assumption that children's genders follow a simple, independent coin-toss model. Using data, it shows that birth genders exhibit a slight imbalance, family-specific variation, sequential dependence, and overdispersion, and it illustrates how sample size affects the power to detect these effects.

Nils Lid Hjort

Published 2026-04-14

The Big Picture: Are We Just Flipping Coins?

Imagine you are walking down the street or looking at your own family. You see boys and girls. It feels like nature is flipping a fair coin for every baby: Heads = Boy, Tails = Girl. If this were true, every birth would be an independent 50/50 toss. If you had 8 kids, you'd expect a mix, and families with only boys or only girls would be very rare.

But Nils Lid Hjort, a statistician from Oslo, took a giant magnifying glass to some very old data from 19th-century Germany (Saxony). He looked at nearly 38,500 families that had at least 8 children. What he found was that the "coin" nature flips isn't actually perfectly fair, and it doesn't behave the same way for every family.

Here are the four main discoveries from the paper, explained simply:


1. The Coin is Slightly Weighted (The "0.485" Secret)

The Analogy: Imagine a casino where the house always wins just a tiny bit.
The Reality: We often think the chance of having a girl is exactly 50/50. Hjort's data shows it's actually about 48.5%.
Why it matters: To show that this small difference (48.5% vs. 50%) isn't just a fluke, you need a very large amount of data. It's like trying to tell whether a coin is slightly bent. If you flip it 10 times and get 6 heads and 4 tails, you'd think, "No big deal." But if you flip it 15,000 times and the proportion still sits clearly away from 50%, you can be confident the coin is bent. The paper shows that with enough data, we can be very confident that the coin is slightly weighted toward boys.
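To put rough numbers on the bent-coin intuition, here is a minimal sketch. It assumes the observed share of girls stays at 48.5% no matter how many births you look at, and it uses a textbook normal approximation. The function name z_score and the sample sizes are illustrative choices, not taken from the paper.

```python
import math

def z_score(p_hat, n, p0=0.5):
    """How many standard errors the observed share p_hat lies from p0 after n births."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Same observed share of girls (48.5%), very different amounts of evidence.
for n in (10, 1_000, 15_000, 300_000):
    print(f"n = {n:>7}: z = {z_score(0.485, n):.2f}")
```

With 10 births the gap is a small fraction of one standard error, which is indistinguishable from luck. By 15,000 births it is more than three standard errors, and it keeps growing with the square root of the sample size.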

2. The "Family Personality" (Overdispersion)

The Analogy: Imagine a classroom of students taking a test.

  • The Simple View (Binomial): Every student has the exact same chance of getting a question right (say, 50%).
  • The Real View (Overdispersion): Each student has their own success rate, some higher and some lower. Likewise, in some families the "girl probability" is naturally higher (maybe 55%), and in others it's lower (maybe 40%).

The Discovery: The data showed way more "all-boy" and "all-girl" families than a simple coin flip would predict.

  • Simple Math: If you flip a fair coin 8 times, getting 8 heads is very rare.
  • Real Life: It happens more often than the simple model predicts.
    Why? Because some families seem to have a built-in tendency toward one gender. It's not that the coin changes during the family; it's that every family has its own unique coin that is slightly different from its neighbor's. Hjort calls this Overdispersion: the data is "spread out" more than the simple model allows.
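One common way to model "every family has its own coin" is a beta-binomial model, where each family's girl probability is drawn from a Beta distribution. The sketch below compares the chance of 8 girls out of 8 under the plain coin and under that model. The concentration values are made-up illustrations, not estimates from the paper, and the function names are hypothetical.

```python
def p_all_girls_binomial(p_girl, n_children=8):
    """Chance of all girls when every family shares the same girl probability."""
    return p_girl ** n_children

def p_all_girls_beta_binomial(mean, concentration, n_children=8):
    """Chance of all girls when each family's girl probability is Beta(a, b).

    a and b are set so the average girl probability is `mean`; a smaller
    `concentration` (a + b) means families differ more from one another.
    """
    a = mean * concentration
    b = (1 - mean) * concentration
    prob = 1.0
    for i in range(n_children):
        prob *= (a + i) / (a + b + i)
    return prob

print("same coin for everyone:", p_all_girls_binomial(0.485))
for concentration in (1000, 100, 30):  # illustrative values only
    print("concentration", concentration, "->",
          p_all_girls_beta_binomial(0.485, concentration))
```

The average coin is the same in every line of output, yet the all-girl probability rises as families are allowed to differ more. That is the overdispersion effect in miniature.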

3. The "Streak" Effect (Markovian Children)

The Analogy: Imagine a basketball player on a hot streak. If they make a shot, they are slightly more likely to make the next one.
The Reality: Hjort wondered: Does having a boy make it slightly more likely to have another boy?
The Discovery: Yes, but only a tiny bit, and it works in both directions. For example, if a family just had a girl, the chance of the next child being a girl goes up slightly (from 48.5% to maybe 49%).
The Catch: The data didn't tell him the order of the children (e.g., Girl-Boy-Girl), only the total count. So, he had to use a computer to simulate millions of possible birth orders to see if a "streak" model fit the data better than the "random family coin" model. It turned out the "streak" effect exists, but it's very subtle.
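To see what a "streak" model implies for the totals, here is a minimal sketch of a two-state Markov chain over birth order. Instead of simulating birth orders, it sums over them exactly with a small dynamic program, so it is a swapped-in technique and not a reproduction of Hjort's computation. The value 0.49 echoes the "maybe 49%" above; the function name is hypothetical, and the boy-to-girl probability is derived on the assumption that the long-run share of girls stays at 48.5%.

```python
def girl_count_distribution(n_children, p_girl, p_girl_after_girl):
    """Distribution of the number of girls when each birth depends on the previous one."""
    # Chosen so the long-run share of girls remains p_girl (an assumption of this sketch).
    p_girl_after_boy = p_girl * (1 - p_girl_after_girl) / (1 - p_girl)

    # state[(last_child_is_girl, girls_so_far)] = probability after the first child
    state = {(True, 1): p_girl, (False, 0): 1 - p_girl}
    for _ in range(n_children - 1):
        nxt = {}
        for (last_girl, girls), prob in state.items():
            p_next = p_girl_after_girl if last_girl else p_girl_after_boy
            nxt[(True, girls + 1)] = nxt.get((True, girls + 1), 0.0) + prob * p_next
            nxt[(False, girls)] = nxt.get((False, girls), 0.0) + prob * (1 - p_next)
        state = nxt

    # Forget the order: keep only the total number of girls.
    dist = [0.0] * (n_children + 1)
    for (_, girls), prob in state.items():
        dist[girls] += prob
    return dist

independent = girl_count_distribution(8, 0.485, 0.485)  # no memory at all
streaky = girl_count_distribution(8, 0.485, 0.49)       # slight streak
print("P(8 girls), no memory:", independent[8])
print("P(8 girls), streaky:  ", streaky[8])
```

Even a half-point nudge after each girl raises the all-girl probability a little, which is why totals alone can carry a faint trace of the birth order.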

4. The Power of Big Numbers (Sample Size)

The Analogy: Finding a needle in a haystack.
The Reality: The paper spends a lot of time talking about Sample Size.

  • If you look at a small group of families (say, 500), the "all-boy" families might just look like random luck. You can't prove anything.
  • But if you look at roughly 38,500 families (like in this study), those "all-boy" families stop looking like luck and start looking like a pattern.

Hjort explains that with small data, you might miss the truth. With huge data, you can detect tiny, almost invisible differences. It's like trying to hear a whisper in a noisy stadium vs. in a quiet room. With small data (the stadium), random noise drowns the whisper out. With huge data (the quiet room), even a faint whisper comes through, and once you hear it, you know it's real.
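Here is a rough sketch of how the same excess looks at different sample sizes. It assumes the all-girl rate reported in this article (161 out of about 38,500 families) would hold at every sample size, and it uses a simple normal approximation. The function name excess_z and the intermediate sample size are illustrative assumptions.

```python
import math

def excess_z(n_families, observed_rate, p_girl=0.485, n_children=8):
    """Standardized excess of all-girl families over the simple coin-toss prediction."""
    p0 = p_girl ** n_children                      # coin-toss chance of all girls
    expected = n_families * p0
    sd = math.sqrt(n_families * p0 * (1 - p0))     # binomial standard deviation
    observed = n_families * observed_rate
    return (observed - expected) / sd

rate = 161 / 38_500  # all-girl share taken from the figures in this article
for n in (500, 5_000, 38_500):
    print(f"{n:>6} families: z = {excess_z(n, rate):.2f}")
```

At 500 families the excess is well under one standard deviation, so it looks like ordinary luck. At the full sample size it is several standard deviations, which is hard to explain by chance.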

The "Royal Flush" Families

The paper notes something funny: There are more families with 8 girls or 8 boys than simple math predicts.

  • Simple Math: Predicts about 117 all-girl families.
  • Real Data: Found 161.
  • The Fix: When you account for the fact that some families are "girl-heavy" and some are "boy-heavy" (the Overdispersion), the math finally matches reality. The "Royal Flush" (all one gender) happens more often because some families are just naturally stacked that way.
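The "Simple Math" figure can be roughly reproduced in a few lines. This sketch uses rounded inputs from this article (about 38,500 families and a 48.5% girl probability), so it only lands close to the quoted prediction rather than matching it exactly. The function name is hypothetical.

```python
def expected_all_girl_families(n_families, p_girl, n_children=8):
    """Expected number of all-girl families under the independent coin-toss model."""
    return n_families * p_girl ** n_children

n_families = 38_500  # rounded from "nearly 38,500" in the text
p_girl = 0.485       # rounded girl probability from the text
observed = 161       # all-girl families reported in the text

expected = expected_all_girl_families(n_families, p_girl)
print(f"expected: {expected:.1f}, observed: {observed}")
print(f"observed / expected: {observed / expected:.2f}")
```

The ratio comes out well above 1, which is the gap the overdispersion model is meant to close.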

Conclusion: Nature is Messy (and Interesting)

The paper concludes that while the world of babies looks like a simple coin toss at first glance, it's actually a complex mix of:

  1. A slightly biased coin (slightly more boys).
  2. Different coins for different families (some families lean toward girls, others toward boys).
  3. Tiny streaks (having a boy makes the next one slightly more likely to be a boy).

The Takeaway: We are all the result of a "hierarchical cascade" of coin tosses, but the coins aren't perfect, they aren't all the same, and they sometimes remember what they just flipped. And to see these tiny secrets, you need to look at a lot of data.
