A Rigorous, Tractable Measure of Model Complexity

This paper introduces a rigorous and computationally efficient measure of model complexity based on input gradient similarities that unifies various existing metrics and provides new insights into the double descent phenomenon across diverse model architectures.

Original authors: Oskar Allerbo, Thomas B. Schön

Published 2026-05-21 · Author reviewed

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

The Big Problem: How "Complicated" is Your Model?

Imagine you are a chef trying to judge how complex a recipe is.

  • The Old Way: You might just count the number of ingredients (parameters). But a recipe with 50 spices might actually be a simple dish if all the spices taste the same. Conversely, a recipe with only 3 ingredients could be incredibly complex if the chef has to juggle them in a very specific, delicate way.
  • The Current Mess: In machine learning, scientists have tried to measure "complexity" using things like the number of parameters, the "Vapnik-Chervonenkis dimension" (a very hard math concept), or "effective degrees of freedom." The problem is that these methods are either too rough (like just counting ingredients) or so hard to calculate that they are useless in practice.

The authors of this paper, Oskar Allerbo and Thomas B. Schön, want to fix this. They propose a new, easy-to-calculate, and mathematically solid way to measure complexity called Gradient Alignment Complexity (GAC).

The New Idea: The "Dance Floor" Analogy

To understand GAC, imagine the model is a dancer, and the "gradients" are the directions the dancer is facing when they move.

  • The Setup: The model looks at different inputs (different songs on the dance floor). For every song, the model has a specific "direction" it wants to move in to learn the data.
  • Simple Model (Low Complexity): If the model is very simple, it reacts to every song in the exact same way. It faces the same direction no matter what music plays. All its "dance moves" are perfectly aligned. It has very little freedom.
    • Analogy: A robot that only knows one dance move. No matter the song, it does the same thing. It's simple, but not very flexible.
  • Complex Model (High Complexity): If the model is very complex, it reacts differently to every song. For one song, it faces North; for another, it faces South; for a third, it spins wildly. Its "dance moves" are all over the place and point in totally different directions.
    • Analogy: A jazz improviser who changes their style completely for every note. They have total freedom to move anywhere.

The GAC Measure: The authors simply measure how much these "dance moves" (gradients) align with each other.

  • If they all point the same way (high alignment) → Low Complexity.
  • If they point in random, independent directions (low alignment) → High Complexity.
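To make the "alignment" idea concrete, here is a minimal sketch in Python. It is not the paper's GAC definition, which this explanation does not reproduce. As a stand-in, it assumes that "alignment" can be approximated by the average cosine similarity between input gradients taken at different data points. The helper names (input_gradients, mean_gradient_alignment) and the two toy models are hypothetical and exist only for illustration.

```python
import numpy as np

def input_gradients(f, X, eps=1e-5):
    """Central finite-difference gradient of a scalar model f with
    respect to its input, evaluated at every row of X."""
    n, d = X.shape
    G = np.zeros((n, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        G[:, j] = (f(X + step) - f(X - step)) / (2 * eps)
    return G

def mean_gradient_alignment(G):
    """Average cosine similarity over all distinct pairs of rows of G.
    ASSUMPTION: an illustrative alignment score, not the paper's formula."""
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    U = G / np.maximum(norms, 1e-12)   # unit-length gradient directions
    S = U @ U.T                        # pairwise cosine similarities
    n = len(G)
    return (S.sum() - np.trace(S)) / (n * (n - 1))

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))

# Hypothetical toy models: one rigid, one flexible.
linear = lambda X: X @ np.array([2.0, -1.0])
wiggly = lambda X: np.sin(5 * X[:, 0]) * np.cos(5 * X[:, 1])

align_linear = mean_gradient_alignment(input_gradients(linear, X))
align_wiggly = mean_gradient_alignment(input_gradients(wiggly, X))
print(f"linear model alignment: {align_linear:.3f}")
print(f"wiggly model alignment: {align_wiggly:.3f}")
```

The linear model has the same input gradient everywhere, so its directions are perfectly aligned (the "one dance move" robot). The wiggly model's gradients point in many different directions, so its average alignment is much lower (the jazz improviser).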

Why This is a Big Deal

The paper claims this new measure is special for three main reasons:

  1. It Works for Everyone: Whether you are using a simple polynomial equation, a decision tree, a random forest, or a neural network, this measure works. It doesn't care what "flavor" of model you are using.
  2. It Measures the "Machine," Not Just the "Output": Sometimes a complex machine (like a super-computer) is used to do a very simple task (like adding 2+2). Old measures might say the machine is simple because the result is simple. The GAC looks at the machine itself. It says, "Hey, even though you're doing a simple task right now, you have the potential to do very complex things because your internal parts are so flexible."
  3. It Generalizes Old Rules: The authors prove that their new measure naturally turns into the old, familiar rules when you apply it to specific models:
    • For Polynomials, it acts like the "degree" (how high the power goes).
    • For Decision Trees, it acts like the "number of splits" (how many branches).
    • For Random Forests, it acts like the "number of trees."
    • For K-Nearest Neighbors, it acts like the "number of neighbors."

Solving the "Double Descent" Mystery

There is a famous phenomenon in AI called Double Descent. Usually, as you make a model more complex, its error on new data first falls, then rises (overfitting), and then, surprisingly, falls again if you make the model even more complex.

Scientists have been arguing about why this happens. Some say it's because the model is getting too big; others say it's an illusion caused by how we measure complexity.

The authors used their new GAC measure to re-test these experiments:

  • For "Static" Models: (Models where the structure doesn't change during training, like Random Forests or Random Fourier Features). The GAC confirmed that Double Descent is real. As you add more trees or features, the complexity goes up, and the "second descent" (getting better again) happens exactly when the complexity hits a certain point.
  • For "Dynamic" Models: (Models like Neural Networks where the features change as they learn). The authors found that Double Descent often disappears when measured with GAC. Why? Because as these models get bigger, they actually become less complex in terms of how they align their gradients. They learn to adapt so well that they stop using their full "complexity potential."
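For readers who want to see what a "static model" experiment of this kind looks like, here is a rough sketch in Python. It is not the authors' experimental setup. It assumes a made-up target function, Random Fourier Features with a hypothetical frequency scale of 2.0, and a minimum-norm least-squares fit, and it simply records training and test error as the number of features grows. It does not compute GAC.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test = 5, 40, 500

def target(X):
    # Hypothetical ground-truth function, chosen only for illustration.
    return np.sin(X.sum(axis=1))

X_train = rng.uniform(-1, 1, (n_train, d))
X_test = rng.uniform(-1, 1, (n_test, d))
y_train = target(X_train) + 0.1 * rng.standard_normal(n_train)  # noisy labels
y_test = target(X_test)

def rff(X, W, b):
    """Random Fourier feature map: cos(XW + b). W and b stay fixed after
    sampling, which is what makes this a 'static' model."""
    return np.cos(X @ W + b)

results = []
for p in [5, 10, 20, 40, 80, 400]:      # number of random features
    W = 2.0 * rng.standard_normal((d, p))
    b = rng.uniform(0, 2 * np.pi, p)
    Phi_train, Phi_test = rff(X_train, W, b), rff(X_test, W, b)
    # lstsq returns the minimum-norm solution once p exceeds n_train.
    coef, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
    train_mse = np.mean((Phi_train @ coef - y_train) ** 2)
    test_mse = np.mean((Phi_test @ coef - y_test) ** 2)
    results.append((p, train_mse, test_mse))

for p, train_mse, test_mse in results:
    print(f"features={p:4d}  train MSE={train_mse:.2e}  test MSE={test_mse:.2e}")
```

In experiments like this, the horizontal axis is usually the raw feature count. The paper's point is that the picture can change once that axis is replaced by a complexity measure such as GAC.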

The Takeaway

The authors have built a new "ruler" for measuring machine learning models.

  • Old Rulers: Were either too blunt (counting parts) or too hard to use (requiring impossible math).
  • The New GAC Ruler: Looks at how the model's internal "muscles" (gradients) move together. If they move in lockstep, the model is simple. If they move independently, the model is complex.

This tool helps scientists understand why models behave the way they do, particularly the confusing "Double Descent" curve, by providing a clear, consistent definition of what "complexity" actually means across different types of AI.
