Bayesian Modeling of Collatz Stopping Times: A Probabilistic Machine Learning Perspective

This paper applies a probabilistic machine learning framework to analyze Collatz stopping times up to $10^7$, demonstrating that a Bayesian hierarchical Negative Binomial regression outperforms a mechanistic odd-block generator in predictive likelihood while revealing that low-order modular structure significantly drives the observed heterogeneity.

Nicolò Bonacorsi, Matteo Bordoni

Published 2026-03-06

Here is an explanation of the paper in everyday language, with a few creative metaphors along the way.

The Big Picture: The Collatz Game

Imagine a simple game played with numbers. You pick a number, and you follow two rules:

  1. If it's even, cut it in half.
  2. If it's odd, triple it and add one.

You keep doing this until you reach the number 1. The "stopping time" is simply how many steps it takes to get there.
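The two rules above take only a few lines of code. Here is a minimal sketch (the function name is ours, not the paper's):

```python
def collatz_stopping_time(n: int) -> int:
    """Count how many steps it takes n to reach 1 under the Collatz rules."""
    steps = 0
    while n != 1:
        # Even: halve it. Odd: triple it and add one.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_stopping_time(27))  # → 111: even a small start can take a long ride
```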

The big mystery (the Collatz Conjecture) is whether every number eventually reaches 1. Nobody has proven it yet. But this paper doesn't try to prove the math; instead, it asks: "If we treat these numbers like a random crowd, what does the 'time to finish' look like, and can we predict it?"

The authors looked at the first 10 million numbers and tried to build two different "crystal balls" to predict how long the game lasts for any given number.


The Problem: It's Messy and Unpredictable

If you plot the stopping times for 10 million numbers, it looks like a chaotic mess.

  • The "Long Tail": Most numbers finish quickly, but a few take forever. It's like a race where most people finish in 10 minutes, but a few run for 10 hours.
  • The "Stripes": When you look closely, the numbers aren't random. They form invisible "stripes" or bands. Some numbers always take longer than others just because of their specific "remainder" when divided by 8 (like how some days of the week are busier than others).

The authors wanted to build models to explain this mess.
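You can see the "stripes" directly by averaging stopping times within each residue class mod 8. A small-scale sketch (the paper goes up to $10^7$; we use a much smaller range so it runs in seconds):

```python
from collections import defaultdict

def collatz_stopping_time(n):
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

# Average stopping time per residue class n mod 8 over a small range.
totals, counts = defaultdict(float), defaultdict(int)
for n in range(1, 50_000):
    totals[n % 8] += collatz_stopping_time(n)
    counts[n % 8] += 1

for r in range(8):
    print(f"n = {r} (mod 8): mean stopping time = {totals[r] / counts[r]:.1f}")
```

The eight averages come out noticeably different, which is exactly the band structure the paper's models try to capture.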


Model 1: The "Statistical Weather Forecaster" (Bayesian Regression)

The Analogy: Imagine you are trying to predict how long a commute will take. You know that traffic is usually worse at 5 PM (time of day) and worse on rainy days (weather). You don't need to know the physics of every car engine; you just need the patterns.

How it works:
The authors built a Bayesian Negative Binomial regression: a model for count data whose variance is allowed to exceed its mean, which is exactly what you need for messy, over-dispersed data like this.

  • The Inputs: They fed the model two simple clues:
    1. Log(n): How big the number is (bigger numbers generally take longer, but not in a straight line).
    2. n mod 8: The remainder when you divide the number by 8. This captures those "stripes" we saw earlier.
  • The Magic: The model doesn't just give one answer. It gives a range of probabilities. It says, "For this number, there's a 90% chance the game lasts between 100 and 200 steps."
  • The Result: This model was the champion. It predicted the stopping times better than anything else. It admitted, "I'm not perfect, but I'm very good at guessing the average and the uncertainty."
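In symbols, the model says the expected number of steps is $\log \mu(n) = \beta_0 + \beta_1 \log n + \alpha_{n \bmod 8}$, with the actual count drawn from a Negative Binomial around that mean. A pure-Python sketch of that likelihood follows; the parameter values are made up for illustration, since the paper infers them with Bayesian methods:

```python
import math

def nb_log_pmf(y: int, mu: float, alpha: float) -> float:
    """Negative Binomial log-probability with mean mu and dispersion alpha
    (variance = mu + alpha * mu**2, so it can be wider than a Poisson)."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))

def predicted_mean(n: int, beta0: float, beta1: float, residue_effects: list) -> float:
    """Model mean: log(mu) = beta0 + beta1 * log(n) + effect of (n mod 8)."""
    return math.exp(beta0 + beta1 * math.log(n) + residue_effects[n % 8])

# Made-up illustrative parameters (the paper learns these from the data).
beta0, beta1 = 2.0, 0.55
residue_effects = [0.0, 0.1, -0.05, 0.08, -0.02, 0.12, -0.04, 0.15]

mu = predicted_mean(27, beta0, beta1, residue_effects)
print(f"predicted mean steps for n=27: {mu:.1f}")
print(f"log-probability of observing 111 steps: {nb_log_pmf(111, mu, alpha=0.1):.2f}")
```

Because the model outputs a full distribution rather than a point guess, "a 90% chance of 100 to 200 steps" falls out naturally from the Negative Binomial's quantiles.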

Model 2: The "Mechanical Toy" (Odd-Block Generator)

The Analogy: Imagine a Rube Goldberg machine. Instead of guessing the outcome, you try to rebuild the machine itself. You know the gears turn, the balls drop, and the levers flip. You want to simulate the machine step-by-step to see where the ball lands.

How it works:
The Collatz game has a hidden rhythm. When you hit an odd number, you do 3n + 1, which makes it even. Then you divide by 2 repeatedly until it's odd again.

  • The authors realized this is like a block of steps. You jump from one odd number to the next, and the "distance" of the jump depends on how many times you have to divide by 2.
  • The Twist: Instead of calculating the exact math for every step (which is slow), they turned it into a dice game. They said, "Let's pretend the length of these jumps is random, but follows a specific pattern."
  • The Refinement: At first, they used the same simple rule for every jump: flip a fair coin until it lands tails, and the jump length is the number of flips (a geometric distribution). It was okay, but not great. Then, they realized the "stripes" mattered. So, they made the coin conditional. If the number is a certain "type" (based on the mod 8 rule), they used a differently weighted coin.
  • The Result: This model is mechanically faithful. It explains why the game behaves the way it does. However, as a pure predictor, it was less accurate than the statistical "Weather Forecaster." It was like a detailed simulation of a car engine that was slightly less accurate at predicting traffic than a simple app that just looks at the time of day.
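Here is a minimal Monte Carlo sketch of the unconditional version of this idea (our own simplified rendering, not the authors' exact generator): instead of computing the halvings after each $3n+1$ step, draw them geometrically and track the number's logarithm until it drifts down to zero.

```python
import math
import random

def simulated_stopping_time(n: int, rng: random.Random) -> int:
    """Odd-block sketch: after each 3n+1 step, the number of halvings k is
    drawn geometrically (P(k) = 2**-k, k >= 1) instead of computed exactly.
    We track log(n) and stop when it drifts down to zero."""
    steps = 0
    while n % 2 == 0:               # clear the initial even part exactly
        n //= 2
        steps += 1
    log_n = math.log(n)
    while log_n > 0:
        k = 1
        while rng.random() < 0.5:   # each additional halving with prob 1/2
            k += 1
        steps += 1 + k              # one 3n+1 step plus k halvings
        log_n += math.log(3) - k * math.log(2)  # average drift: log(3/4) < 0
    return steps

rng = random.Random(0)
sims = [simulated_stopping_time(10**6 + 1, rng) for _ in range(1000)]
print(f"simulated mean stopping time near 10^6: {sum(sims) / len(sims):.0f}")
```

The refined model in the paper makes those geometric weights depend on the residue class mod 8, which is where the "weighted coin" comes in.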

The Showdown: Which is Better?

The authors put the two models head-to-head on a test set of numbers they hadn't seen before.

  1. The Statistical Model (Weather Forecaster): Won easily. It assigned the highest probability to the actual stopping times on the held-out numbers, i.e., the best predictive likelihood. It's the best tool if you just want to know "How long will this take?"
  2. The Mechanical Model (Rube Goldberg Machine): Lost on pure accuracy, but won on understanding. It showed us that the "stripes" (the mod 8 rule) are real and important. It proved that the randomness isn't just noise; it's structured by the number's shape.

The Takeaway

The paper teaches us two things:

  1. Sometimes, simple statistics beat complex simulations. If you just want to predict the outcome, a smart statistical model that learns from the data is often better than trying to simulate every single rule of the universe.
  2. Structure hides in the noise. Even in a chaotic system like the Collatz game, there are hidden patterns (like the mod 8 stripes) that drive the behavior. By combining the statistical power with the mechanical understanding, we get the full picture.

In short: The authors didn't solve the Collatz Conjecture, but they built a very good map of the territory, showing us where the mountains are and how the rivers flow, using both a satellite view (statistics) and a ground-level tour (mechanics).