Activation Function Design Sustains Plasticity in Continual Learning

This paper demonstrates that thoughtful activation function design can sustain model plasticity in continual learning. Its two proposed nonlinearities, Smooth-Leaky and Randomized Smooth-Leaky, serve as a lightweight, architecture-agnostic way to prevent loss of adaptability without requiring additional capacity or task-specific tuning.

Lute Lillo, Nick Cheney

Published 2026-03-02

Imagine you are training a dog to do tricks. First, you teach it to sit. Then, you teach it to shake hands. Finally, you teach it to roll over.

In the world of Artificial Intelligence (AI), this is called Continual Learning. The goal is for the AI to keep learning new things without forgetting the old tricks.

However, AI has a problem. Sometimes, after learning a few new tricks, the AI gets "stuck." It remembers the old tricks perfectly, but it loses the ability to learn new ones. It becomes rigid, like a statue that can't move its joints. Scientists call this "Loss of Plasticity."

This paper argues that the secret to keeping an AI flexible isn't just about giving it more brain power or better training methods. It's about changing the activation function.

What is an "Activation Function"?

Think of an AI as a massive team of tiny workers (neurons) passing notes to each other.

  • The Input: A note arrives at a worker's desk.
  • The Activation Function: This is the worker's decision rule. It decides: "Do I pass this note along? Do I shout it out? Or do I throw it in the trash?"

If the decision rule is too strict, the worker throws away too many notes (the AI stops learning). If the rule is too chaotic, the worker shouts everything (the AI gets confused).

The Problem: The "Dead Zone"

The most common decision rule used in AI today is called ReLU. Imagine a worker who says:

"If the note is positive (good news), I'll pass it on. If the note is negative (bad news), I'll throw it in the trash and never look at it again."

In a stable world, this works fine. But in a changing world (Continual Learning), things get tricky. Sometimes, the "bad news" (negative numbers) actually contains the key to learning a new trick. If the worker throws it away, the AI loses that information forever. The worker becomes a "dead unit"—a zombie neuron that never fires again. The AI's brain fills up with these zombies, and it can't learn anything new.
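The "dead zone" described above can be sketched in a few lines of Python (a minimal illustration, not code from the paper). The key point is that a negative input gives not only zero output but also a zero gradient, so the neuron receives no learning signal at all:

```python
def relu(x):
    # Pass positive "notes" through unchanged; discard negative ones.
    return x if x > 0 else 0.0

def relu_grad(x):
    # The learning signal (gradient): 1 for positive inputs, 0 for negative.
    # A neuron whose inputs stay negative gets a zero gradient every step --
    # this is the "dead unit" problem.
    return 1.0 if x > 0 else 0.0

print(relu(2.0), relu_grad(2.0))    # positive input: passed on, can still learn
print(relu(-3.0), relu_grad(-3.0))  # negative input: silenced, no learning signal
```

Because the gradient is exactly zero on the negative side, nothing in training can ever nudge a stuck neuron back to life, which is why dead units accumulate as tasks change.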

The Solution: The "Goldilocks" Zone

The authors of this paper discovered that the best decision rule isn't "all or nothing." It needs to be just right.

They found a "Goldilocks Zone" for how workers should handle bad news:

  1. Don't throw it away completely: Even if the note is negative, the worker should still pass a tiny version of it along. This keeps the worker "alive" and ready to learn.
  2. Don't scream it too loud: If the worker amplifies the negative note too much, it causes chaos and instability.
  3. Be smooth: The transition from "passing good news" to "passing bad news" should be a smooth curve, not a sharp, jagged cliff.

The New Tools: Smooth-Leaky & Randomized Smooth-Leaky

Based on this, the authors invented two new "decision rules" (activation functions):

  1. Smooth-Leaky: Imagine a worker who usually passes good news, but when bad news comes, they don't throw it away. Instead, they gently leak a little bit of it through a small crack in the door. This keeps the door from jamming shut.
  2. Randomized Smooth-Leaky: This is like having a team of workers where, every time a note arrives, they randomly decide how much of the bad news to leak. Sometimes a little, sometimes a bit more. This randomness keeps the team on their toes and prevents them from getting stuck in a rut.
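One plausible way to realize these two rules in code is sketched below. This is our own illustrative construction, not the paper's exact formulas: it blends a small linear "leak" with a smooth softplus curve, and the randomized variant resamples the leak strength on every call:

```python
import math
import random

def softplus(x):
    # Numerically stable softplus: log(1 + e^x), a smooth version of max(x, 0).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def smooth_leaky(x, alpha=0.1):
    # Illustrative smooth-leaky curve (an assumption, not the paper's definition):
    # ~x for large positive inputs, ~alpha * x for large negative ones,
    # with a smooth transition in between -- bad news leaks through quietly.
    return alpha * x + (1.0 - alpha) * softplus(x)

def randomized_smooth_leaky(x, low=0.05, high=0.3, rng=random):
    # Resample the leak strength each call, keeping units "on their toes".
    return smooth_leaky(x, alpha=rng.uniform(low, high))
```

Note how this satisfies all three Goldilocks rules: negative inputs are never fully discarded (the `alpha * x` leak), the leak stays small (alpha well below 1), and the transition around zero is a smooth curve rather than ReLU's sharp corner.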

Why Does This Matter?

The authors tested these new rules in two very different worlds:

  • The Classroom (Supervised Learning): Teaching the AI to recognize different types of images one after another.
  • The Video Game (Reinforcement Learning): Teaching an AI to walk, run, and jump in a physics simulation that changes over time.

The Result?
In both cases, the AI using the new "Smooth-Leaky" rules kept learning new tricks for much longer. It didn't get "stuck" or forget how to adapt. It remained flexible, like a gymnast, rather than rigid like a statue.

The Big Takeaway

For a long time, scientists thought the way to fix AI learning problems was to build bigger brains or use smarter training algorithms. This paper says: "Stop overcomplicating it."

Sometimes, the solution is as simple as changing the personality of the neurons. By making them slightly more open to "bad news" (negative inputs) and keeping their decision-making process smooth, we can keep AI flexible and ready to learn forever.

In short: To keep an AI young and adaptable, don't let its neurons go to sleep. Give them a gentle nudge to keep working, even when things get tough.
