WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention

This paper introduces WaveSSM, a novel family of State-Space Models built on wavelet frames that leverage localized temporal support to outperform traditional polynomial-based SSMs like S4 in modeling non-stationary signals with transient dynamics, such as physiological data and raw audio.

Ruben Solozabal, Velibor Bojkovic, Hilal Alquabeh, Klea Ziu, Kentaro Inui, Martin Takac

Published 2026-02-27

Imagine you are trying to understand a long, complex story told by a friend. Sometimes, the story has a slow, steady rhythm (like a calm description of a landscape). Other times, it has sudden, loud outbursts or quick, sharp changes (like a car crash or a sudden laugh).

For a long time, AI models trying to understand these "stories" (which are just sequences of data like audio, heartbeats, or text) have been using a specific tool: State-Space Models (SSMs). Think of these models as a super-efficient note-taker who can remember a whole book without running out of paper.

However, the current version of this note-taker has a flaw: it builds its memory from a mathematical tool called polynomial bases, smooth curves that stretch across the entire timeline.

The Problem: The "Global" Note-Taker

Imagine your note-taker tries to summarize a story by writing one giant, smooth sentence that covers the entire book from start to finish.

  • The Issue: If your friend suddenly screams "Fire!" in the middle of a calm story, this note-taker smears that scream across the whole sentence. They can't easily point to exactly when the scream happened or isolate it from the rest of the story. They treat the whole timeline as one big, blurry blob.
  • The Result: They are great at smooth, slow stories, but terrible at spotting sudden, sharp events (transients) or non-stationary signals (things that change their nature over time).

The Solution: WaveSSM (The "Zoom Lens" Note-Taker)

The authors of this paper decided to give the note-taker a new tool: Wavelets. The result is WaveSSM.

Think of Wavelets as a Zoom Lens or a Flashlight.

  • Instead of writing one giant sentence for the whole book, the note-taker can now shine a flashlight on a specific paragraph, then zoom in on a specific sentence, then zoom in on a specific word.
  • They can capture the "big picture" (the slow rhythm) and the "tiny details" (the sudden scream) without mixing them up.

How It Works (The Analogy)

  1. Old Way (Polynomials): Imagine trying to describe a jagged mountain range using only smooth, rolling hills. You can get close, but you'll never capture the sharp peaks and deep valleys accurately. You need a lot of "smooth hills" to fake a sharp peak, and it's inefficient.
  2. New Way (Wavelets): Imagine using a set of different-sized building blocks. You use big blocks for the flat plains and tiny, sharp blocks for the mountain peaks. You can build the exact shape of the landscape with far fewer blocks, and you know exactly where the peak is.
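The building-block analogy can be made concrete with a small numerical sketch (my own illustration, not code from the paper): approximate a flat signal containing one sharp spike using a fixed budget of coefficients, first with a global Legendre polynomial basis, then with localized Haar wavelets.

```python
import numpy as np

# Toy signal: flat everywhere except one sharp spike (the "sudden scream").
n = 64
t = np.linspace(-1, 1, n)
signal = np.zeros(n)
signal[30:34] = 1.0

k = 8  # coefficient budget for each representation

# Global "smooth hills": least-squares fit with k Legendre polynomials.
poly_coeffs = np.polynomial.legendre.legfit(t, signal, deg=k - 1)
poly_recon = np.polynomial.legendre.legval(t, poly_coeffs)

# Local "building blocks": orthonormal Haar transform and its inverse.
def haar_fwd(x):
    coeffs = []
    while len(x) > 1:
        coeffs.append((x[0::2] - x[1::2]) / np.sqrt(2))  # details at this scale
        x = (x[0::2] + x[1::2]) / np.sqrt(2)             # averages carried up
    coeffs.append(x)                                     # coarsest average
    return coeffs

def haar_inv(coeffs):
    x = coeffs[-1]
    for diff in reversed(coeffs[:-1]):
        out = np.empty(2 * len(x))
        out[0::2] = (x + diff) / np.sqrt(2)
        out[1::2] = (x - diff) / np.sqrt(2)
        x = out
    return x

coeffs = haar_fwd(signal)
flat = np.concatenate(coeffs)
threshold = np.sort(np.abs(flat))[-k]      # k-th largest magnitude
flat[np.abs(flat) < threshold] = 0.0       # discard everything smaller
kept, i = [], 0                            # unflatten back into scales
for c in coeffs:
    kept.append(flat[i:i + len(c)])
    i += len(c)
wave_recon = haar_inv(kept)

poly_err = np.linalg.norm(signal - poly_recon)
wave_err = np.linalg.norm(signal - wave_recon)
print(f"polynomial error: {poly_err:.3f}, wavelet error: {wave_err:.3f}")
```

With the same budget, the polynomial fit smears the spike into ripples across the whole interval, while the handful of Haar blocks sitting directly on the spike reconstruct it almost exactly.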

In technical terms, WaveSSM breaks the signal down into localized atoms.

  • Global Support (Old): One piece of data affects the whole memory.
  • Local Support (New): One piece of data only affects a small, specific part of the memory. This allows the AI to "pay attention" to exactly where the interesting stuff is happening.
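A quick way to see the difference between global and local support (again a toy sketch under my own assumptions, not the paper's code): push a single impulse through each representation and count how many coefficients it touches.

```python
import numpy as np

# One piece of data arriving at time step 30.
n = 64
impulse = np.zeros(n)
impulse[30] = 1.0
t = np.linspace(-1, 1, n)

# Global support: the lone sample leaks into essentially every
# Legendre coefficient of a least-squares fit.
poly = np.polynomial.legendre.legfit(t, impulse, deg=31)
poly_touched = int(np.sum(np.abs(poly) > 1e-10))

# Local support: in an orthonormal Haar transform the sample touches
# only one detail coefficient per scale, i.e. O(log n) slots.
def haar_fwd(x):
    out = []
    while len(x) > 1:
        out.append((x[0::2] - x[1::2]) / np.sqrt(2))  # details at this scale
        x = (x[0::2] + x[1::2]) / np.sqrt(2)          # averages carried up
    out.append(x)                                     # coarsest average
    return np.concatenate(out)

wave_touched = int(np.sum(np.abs(haar_fwd(impulse)) > 1e-10))
print(f"polynomial coefficients touched: {poly_touched} of 32")
print(f"Haar coefficients touched: {wave_touched} of {n}")
```

For n = 64 the impulse touches only log2(64) + 1 = 7 Haar coefficients, while it bleeds into nearly all of the polynomial coefficients; that sparsity is what lets a local model "point at" where something happened.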

Why Does This Matter?

The paper tested this new "Zoom Lens" note-taker on real-world problems where things change quickly and unpredictably:

  1. Heartbeats (ECG): A heart rhythm is usually steady, but abnormal beats show up as sudden, sharp spikes. WaveSSM spotted these spikes better than the old models, leading to more accurate medical diagnoses.
  2. Voice Commands: When you say "Hey Siri," there's a sudden burst of sound. WaveSSM understood these sharp audio bursts better, even when the voice was recorded at different speeds.
  3. Weather & Energy: Predicting the weather involves sudden storms. WaveSSM handled these sudden changes better than the previous "smooth hill" models.

The "Magic" Trick: Addressable Memory

The coolest part of WaveSSM is how it stores information.

  • Old Model: If you ask the model to remember two different events that happened at different times, it mashes them together into one big, confusing soup.
  • WaveSSM: Because it uses "local" blocks, it can store Event A in one corner of its memory and Event B in a completely different corner. It's like having a filing cabinet where every file has its own specific drawer. You can pull out just the "Fire" file without accidentally grabbing the "Calm Morning" file.
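The filing-cabinet picture can be checked directly with the same toy Haar transform (my illustration, not the paper's implementation): place two events far apart in time and count how many coefficient "drawers" they share.

```python
import numpy as np

# Orthonormal Haar transform, returning all coefficients as one flat vector.
def haar_fwd(x):
    out = []
    while len(x) > 1:
        out.append((x[0::2] - x[1::2]) / np.sqrt(2))  # details at this scale
        x = (x[0::2] + x[1::2]) / np.sqrt(2)          # averages carried up
    out.append(x)                                     # coarsest average
    return np.concatenate(out)

n = 64
event_a = np.zeros(n); event_a[5] = 1.0    # the "Fire!" file, early on
event_b = np.zeros(n); event_b[55] = 1.0   # a separate event much later

slots_a = np.abs(haar_fwd(event_a)) > 1e-12  # which drawers each event uses
slots_b = np.abs(haar_fwd(event_b)) > 1e-12
shared = int((slots_a & slots_b).sum())

print(f"drawers used by A: {int(slots_a.sum())} of {n}")
print(f"drawers used by B: {int(slots_b.sum())} of {n}")
print(f"shared drawers: {shared}")
```

Each event occupies 7 drawers, and they collide only in the 2 coarsest-scale ones (the "whole book" summary that both events inevitably contribute to). A global polynomial basis would instead mix both events into nearly every coefficient, which is the "confusing soup" the old models suffer from.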

Summary

WaveSSM is like upgrading from a blurry, wide-angle lens to a high-definition camera with a zoom function. It allows AI to understand long sequences of data (like time series) much better, especially when that data has sudden, sharp, or changing moments. It's faster, more accurate, and much better at spotting the "needle in the haystack" than the models that came before it.
