Scaling of learning time for high dimensional inputs

This paper presents a theoretical analysis demonstrating that learning time for high-dimensional inputs in Hebbian learning models scales supralinearly due to reduced learning gradients, revealing a fundamental limitation that constrains the optimal design of both artificial and biological neural networks.

Carlos Stein Brito

Published 2026-03-03

This is an AI-generated explanation of a preprint that has not been peer-reviewed. It is not medical advice. Do not make health decisions based on this content.

The Big Picture: The "Too Many Choices" Problem

Imagine you are trying to find a specific, hidden treasure in a giant room.

  • Low-dimensional room (Small N): The room is a small closet. You can see the corners clearly. If you start looking in a random spot, you are likely to be standing pretty close to the treasure. It's easy to find your way.
  • High-dimensional room (Large N): Now, imagine the room is a massive, multi-story warehouse with thousands of dimensions (up, down, left, right, forward, backward, and hundreds of other directions you can't even visualize).

The paper argues that as this "warehouse" grows (more input dimensions), finding the treasure gets harder much faster than the room itself grows: doubling the size more than doubles the difficulty. Push it far enough, and learning becomes practically impossible within a human lifetime.

The Core Metaphor: The "Flat Fog" vs. The "Steep Hill"

To understand why this happens, we need to look at how the "learning" works. Think of learning as a hiker descending an error landscape, trying to reach the lowest valley (the correct answer).

  1. The Landscape of Mistakes:
    In a small room, if you start in the wrong place, you are usually on a steep slope, and gravity pulls you down quickly toward the valley floor (the solution).
    In a huge, high-dimensional room, the landscape is different: most of it is dominated by saddle points.

    • What is a saddle point? Imagine sitting on a horse saddle. Lean forward or backward and you slide down; lean left or right and the surface curves up. Right at the centre, the ground is locally flat: the slope is zero, so there is no signal telling you which way to move.
    • The Problem: In high dimensions, the number of these saddle points grows enormously. If you start from a random position (as learning algorithms typically do), you are almost guaranteed to land near one of these flat spots.
  2. The "Ghost" of the Treasure:
    The paper uses a concept called overlap. Imagine the "treasure" is a specific direction (a hidden feature).

    • In a small room, a random starting direction might be 45 degrees away from the treasure. That's a good start.
    • In a high-dimensional room, a randomly chosen direction is almost exactly 90 degrees (perpendicular) to the treasure: the typical overlap between two random directions shrinks like 1/√N. You start out looking in an essentially unrelated direction.
    • The Analogy: Imagine trying to find a specific needle in a haystack. In a small haystack, you might be standing right next to it. In a giant haystack, a random spot is almost certainly nowhere near the needle, and you have to travel a long way before you even get close enough to see it.
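The "almost 90 degrees" claim is easy to check numerically. Here is a small sketch (not from the paper itself) that draws random unit vectors in N dimensions and measures their average overlap with a fixed hidden "treasure" direction; the overlap shrinks roughly like 1/√N:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_abs_overlap(n_dims, n_trials=2000):
    """Average |cosine overlap| between a random starting direction
    and a fixed hidden 'treasure' direction in n_dims dimensions."""
    target = np.zeros(n_dims)
    target[0] = 1.0  # the hidden feature direction
    vecs = rng.standard_normal((n_trials, n_dims))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # unit vectors
    return np.mean(np.abs(vecs @ target))

for n in (2, 10, 100, 1000):
    # For large N this approaches sqrt(2 / (pi * N)), i.e. ~1/sqrt(N)
    print(f"N={n:5d}  mean overlap ~ {mean_abs_overlap(n):.3f}")
```

In two dimensions a random direction still points noticeably toward the target; by a thousand dimensions the overlap is a few percent, which is the "you start out nearly perpendicular" problem in numbers.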

The "Silent Gradient" (Why it takes so long)

When you are far away from the treasure (low overlap), the "slope" that guides you toward it becomes incredibly flat.

  • The Gradient: This is the signal that tells the computer, "Go this way!"
  • The Issue: When you are almost perpendicular to the answer, the signal is so weak it's like trying to hear a whisper in a hurricane. The computer takes tiny, tiny steps because it doesn't know which way to go.
  • The Result: The time it takes to learn doesn't just grow linearly (1x, 2x, 3x). It grows supralinearly.
    • Simple Math: If you double the number of inputs, the learning time doesn't double; it might quadruple (scaling like N²) or grow even faster. It explodes.
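A back-of-the-envelope sketch can show how supralinear scaling arises. This is an illustration, not the paper's actual derivation, and it rests on two labeled assumptions: (a) the starting overlap of a random direction is about 1/√N, as above, and (b) the largest stable learning rate shrinks like 1/N, which is a common constraint for noisy Hebbian-style updates. Plugging both into the averaged Oja-style flow dm/dt = η·m·(1 − m²) gives a step count that grows like N·log N, faster than linear:

```python
import math

def steps_to_learn(n_dims, eta0=1.0, target_overlap=0.9):
    """Rough estimate of learning steps under two assumptions:
      * starting overlap m0 ~ 1/sqrt(N)   (random initial direction)
      * stable learning rate eta ~ eta0/N (noise-limited updates)
    The averaged flow dm/dt = eta * m * (1 - m^2) takes time
    (1 / (2*eta)) * ln[(m1^2/(1-m1^2)) / (m0^2/(1-m0^2))]
    to move from overlap m0 to overlap m1."""
    m0_sq = 1.0 / n_dims            # squared starting overlap
    m1_sq = target_overlap ** 2     # squared target overlap
    eta = eta0 / n_dims             # assumed stable learning rate
    ratio = (m1_sq / (1 - m1_sq)) / (m0_sq / (1 - m0_sq))
    return 0.5 / eta * math.log(ratio)

for n in (10, 100, 1000):
    print(f"N={n:5d}  estimated steps ~ {steps_to_learn(n):,.0f}")
```

With these assumptions, multiplying the input dimension by 10 multiplies the estimated learning time by well over 10, which is the "it explodes" behavior described above. The names `steps_to_learn` and `eta0` are illustrative, not taken from the paper.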

Why Do We Have "Small" Brains and "Small" Receptive Fields?

You might wonder: If high dimensions are so bad, why do our brains and AI models (like those that recognize faces) work at all?

The paper suggests a brilliant evolutionary and engineering solution: Limit the view.

  • Biological Brains: A neuron in your brain doesn't look at the whole world at once. It has a "receptive field." It only looks at a tiny patch of your retina or a small group of inputs.
  • AI (Convolutional Neural Networks): When an AI looks at an image, it doesn't analyze the whole 1000x1000 pixel image with one giant neuron. It uses many small neurons, each looking at a tiny 3x3 patch.
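The "limit the view" idea can be made concrete with a toy sketch (hypothetical code, not from the paper): slide a small window over an image and count how many inputs each unit actually sees. The per-unit dimensionality stays fixed at patch² no matter how large the image grows, so each unit faces a small, learnable problem:

```python
import numpy as np

def patch_input_counts(image, patch=3):
    """Slide a patch x patch window over an image. Each 'neuron'
    covers one window position and receives only patch*patch inputs,
    regardless of the total image size."""
    h, w = image.shape
    n_units = (h - patch + 1) * (w - patch + 1)  # one unit per window position
    inputs_per_unit = patch * patch              # fixed, small dimensionality
    return n_units, inputs_per_unit

small = np.zeros((10, 10))
large = np.zeros((1000, 1000))
print(patch_input_counts(small))   # few units, 9 inputs each
print(patch_input_counts(large))   # vastly more units, still 9 inputs each
```

Scaling the image from 10x10 to 1000x1000 multiplies the number of units by thousands, but each unit's input dimension stays at 9, keeping every individual learning problem low-dimensional.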

The Takeaway: Nature and engineers didn't just "get lucky" with these designs. They are forced into them: if a single neuron tried to process too many inputs at once, its learning time would blow up to something impractically long. By breaking the problem into small, manageable chunks (low-dimensional sub-problems), the saddle points lose their grip, the slopes stay steep, and learning happens quickly.

Summary in One Sentence

The paper proves that as the number of inputs to a learning system grows, the system gets lost in a vast, flat landscape of "almost-right" answers, making learning impossibly slow unless the system is designed to look at only a few inputs at a time.
