Concentration of the largest induced tree size of Gn,pG_{n,p} around the standard expectation threshold

This paper extends the known concentration of the largest induced tree size in random graphs Gn,pG_{n,p} to all vanishing edge probabilities pn1/2ln3/2np \gg n^{-1/2} \ln^{3/2} n and demonstrates that for sparser graphs where n1pn1/2n^{-1} \ll p \ll n^{-1/2}, this size fails to concentrate around the standard expectation threshold.

Jakob Hofstad

Published Tue, 10 Ma
📖 6 min read🧠 Deep dive

Imagine you are a city planner looking at a massive, randomly generated map of a city called Gn,pG_{n,p}.

In this city:

  • There are nn people (vertices).
  • Every pair of people has a chance to become friends (an edge). The chance is pp.
  • If pp is high, everyone is friends with everyone. If pp is low, friendships are rare.

Your goal is to find the largest "Tree Party" in this city.
A Tree Party is a group of people who are all connected to each other in a specific way:

  1. They form a single, connected group (no one is left out).
  2. There are no loops (if A is friends with B, and B with C, A cannot be friends with C, or else it's a circle, not a tree).
  3. Crucially: No one outside this party is friends with exactly one person inside the party. (If they were, that outsider could just join the party and make it bigger, so it wouldn't be the "largest" possible tree).

The paper asks: How big is the biggest Tree Party we can expect to find? And, more importantly, is the size of this party always very predictable, or does it jump around wildly?

The Big Question: Is the Size Predictable?

In math, we say a number is "concentrated" if, no matter how you roll the dice to build the city, the answer is almost always one of two very close numbers (like 100 or 101). It doesn't jump from 100 to 200.

For a long time, mathematicians knew that if the city is very "dense" (lots of friendships, high pp), the size of the Tree Party is very predictable. It's always either kk or k+1k+1.

What this paper does:
The author, Jakob Hofstad, pushes the boundaries. He asks: "What happens if the city gets very sparse? What if friendships are rare?"

He finds two main results:

1. The "Sweet Spot" (When pp is small, but not too small)

If the friendship probability pp is small, but still bigger than roughly $1/\sqrt{n}$ (think of it as a city where everyone has a few friends, but not a crowd), the answer is still predictable.

  • The Analogy: Imagine you are looking for the biggest circle of friends in a large town. Even if the town is quiet, you will almost always find a circle that is either 50 people or 51 people. It never jumps to 40 or 60.
  • The Math: The paper proves that for a wide range of sparse cities, the size of the largest tree is "concentrated" on two values. It's a very stable number.

2. The "Chaos Zone" (When pp is very, very small)

However, if the city gets extremely sparse (friendships are so rare that pp is less than $1/\sqrt{n}$), the predictability breaks down.

  • The Analogy: Imagine a ghost town where people barely know each other. In this scenario, the size of the "Tree Party" becomes unpredictable. It doesn't settle on 50 or 51. Instead, it might be 40 in one version of the city, and 65 in another.
  • The "Expectation Threshold": Usually, mathematicians guess the size of the biggest tree by calculating the "average" (expectation). They say, "On average, we expect a tree of size kk."
    • In the "Sweet Spot," the actual biggest tree is very close to this average.
    • In the "Chaos Zone," the actual biggest tree is much smaller than the average suggests. The "average" is a lie because the rare, huge trees that pull the average up simply don't exist in the real world when the city is too sparse.

How Did They Prove This? (The Detective Work)

To solve this, the author used three different "counting tools" (random variables):

  1. Tool X (The Simple Counter): Just counts every possible tree.

    • Result: Good for finding the upper limit (the maximum possible size), but it overestimates things because it counts trees that aren't "maximal" (trees that could easily grow bigger).
  2. Tool Y (The Strict Counter): Counts trees where everyone outside the tree has at least 3 friends inside the tree.

    • Why? If someone outside has only 1 or 2 friends inside, they might be able to join. If they have 3, they are "locked out."
    • Result: This tool worked perfectly for the "Sweet Spot." It proved that the tree size is stable and predictable.
  3. Tool W (The Maximal Counter): Counts trees where no one outside has exactly 1 friend inside.

    • Result: This tool was used for the "Chaos Zone." It showed that when the city is too sparse, the "Maximal" trees are actually much smaller than the simple average predicted. The "drift" (the gap between the average and reality) becomes huge.

The "Bottleneck" Explained Simply

The paper identifies a specific "tipping point" in the density of the city.

  • Above the line: The city is dense enough that the "Strict Counter" (Tool Y) works. The math holds up, and the tree size is stable.
  • Below the line: The city is too sparse. The "Strict Counter" fails because the "drift" in the math becomes too large. The tree size becomes erratic and smaller than expected.

Why Does This Matter?

This isn't just about trees in graphs. It's about understanding complex systems (like social networks, the internet, or biological networks).

  • Stability: It tells us when a system is stable and predictable (you can rely on the average).
  • Instability: It tells us when a system is fragile and the "average" is a dangerous lie.
  • The Method: The author shows that to understand very sparse networks, you can't just count the obvious things (like independent trees). You have to look for "clusters" or "almost-independent" groups, a technique that is much harder but necessary for the "Chaos Zone."

In summary: The paper maps out exactly how sparse a network can get before the size of its largest "tree-like" structure stops being predictable and starts behaving erratically. It confirms that for a surprisingly wide range of sparse networks, things are still orderly, but once you cross a specific threshold, chaos takes over.