BriMA: Bridged Modality Adaptation for Multi-Modal Continual Action Quality Assessment

This paper introduces BriMA, a framework for multi-modal continual Action Quality Assessment that handles the missing modalities common in real-world data. It combines a memory-guided bridging imputation module with a modality-aware replay mechanism, and the authors report improved performance across multiple datasets.

Kanglei Zhou, Chang Li, Qingyi Pan, Liyuan Wang

Published 2026-02-24

Imagine you are a judge at a gymnastics competition. Your job is to watch a routine and give it a score based on how well the athlete performs. To do this perfectly, you usually rely on three things:

  1. Video: Seeing the athlete move.
  2. Audio: Hearing the music and the rhythm of their movements.
  3. Text/Notes: Reading the official rules or commentary.

In a perfect world, you always have all three. But in the real world, things go wrong. Sometimes the camera glitches (no video), the microphone fails (no audio), or the notes get lost (no text). This is called Modality Imbalance.

Even worse, these problems aren't static. Today the camera works but the audio is bad; tomorrow the audio is fine but the camera is broken. This is Non-Stationary Imbalance.

The Problem: Why Old Judges Fail

Existing computer programs (AI) trained to judge these routines are like students who only studied for a test using a perfect textbook.

  • The "Forgetting" Problem: When you teach an AI a new routine (Task 2), it often forgets how to judge the old routine (Task 1). This is called "Catastrophic Forgetting."
  • The "Missing Data" Problem: If you feed the AI a video with no audio, it panics. It tries to guess the missing audio, but it usually guesses wrong, leading to a terrible score.
  • The "Drift" Problem: As the AI learns new things, its internal "rules" for scoring shift. A routine that used to get a 15 might suddenly get a 12 just because the AI's perspective changed, even if the performance was the same.

The Solution: BriMA (The Smart, Adaptable Judge)

The paper introduces BriMA (Bridged Modality Adaptation). Think of BriMA not as a rigid robot, but as a seasoned, adaptable coach who has a special toolkit to handle chaos.

BriMA uses two main tricks to stay calm and accurate:

1. The "Memory Bridge" (Filling in the Gaps)

Imagine you are trying to guess what a song sounds like, but the audio track is missing.

  • Old AI: Tries to invent a completely new song from scratch. It often sounds nothing like the original.
  • BriMA: It looks at its Memory Bank. It says, "I remember a similar routine from last week where the audio was missing. I know what the music should sound like based on the video."
  • The Magic: Instead of inventing the whole song, BriMA only calculates the tiny difference (the "residual") needed to fix the missing piece. It's like a carpenter who doesn't rebuild the whole table when a leg is broken; they just carve a perfect new leg to fit the existing table. This keeps the score accurate and consistent.
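The memory-bridge idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names (`bridge_impute`, `residual_fn`), the cosine-similarity lookup, and the weighted-average prior are all hypothetical choices made for the example. In a trained system, the residual would presumably come from a small learned network rather than a hand-written function.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bridge_impute(video_feat, memory, residual_fn, k=2):
    """Estimate a missing audio feature from the available video feature.

    memory: list of (video_feat, audio_feat) pairs from complete past samples
            (a hypothetical memory-bank layout).
    residual_fn: callable(video_feat, prior) -> small correction vector
                 (stands in for a learned residual predictor).
    """
    if not memory:
        raise ValueError("memory bank is empty")
    # 1. Retrieve the k stored samples whose video looks most similar.
    ranked = sorted(memory, key=lambda pair: cosine(video_feat, pair[0]),
                    reverse=True)[:k]
    # 2. Build a prior for the missing audio as a similarity-weighted average.
    weights = [max(cosine(video_feat, v), 0.0) for v, _ in ranked]
    total = sum(weights) or 1.0
    dim = len(ranked[0][1])
    prior = [sum(w * a[i] for w, (_, a) in zip(weights, ranked)) / total
             for i in range(dim)]
    # 3. Only the small residual is predicted, then added to the prior.
    residual = residual_fn(video_feat, prior)
    return [p + r for p, r in zip(prior, residual)]


# Toy usage: two remembered routines, and a fixed residual for illustration.
memory_bank = [([1.0, 0.0], [2.0, 2.0]), ([0.0, 1.0], [4.0, 0.0])]
imputed = bridge_impute([1.0, 0.0], memory_bank,
                        residual_fn=lambda v, prior: [0.5, -0.5])
```

The design point is step 3: the memory supplies most of the answer, so the model only has to learn a small correction instead of generating the whole missing feature.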

2. The "Smart Replay" (Learning Without Forgetting)

When a student studies for a new exam, they shouldn't just throw away their old notes.

  • Old AI: Replays old data randomly, like flipping through a textbook at random pages. It might waste time on easy examples and miss the hard ones.
  • BriMA: It acts like a strict tutor. It looks at its memory bank and asks:
    • "Which old examples did I get wrong because the data was messy?"
    • "Which examples am I likely to forget?"
    • "Which examples are most important to keep the scoring rules stable?"
  • It then prioritizes these specific examples for review. It's like a teacher saying, "We aren't reviewing the whole chapter; we are only reviewing the three problems you keep getting wrong." This ensures the AI doesn't forget the past while learning the future.
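The three questions above can be turned into a simple priority score. The sketch below is an assumption-laden illustration, not the paper's replay rule: the `MemoryItem` fields, the equal weighting of the three terms, and the score-bin rarity heuristic are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class MemoryItem:
    sample_id: str
    score: float              # ground-truth quality score of the routine
    missing_ratio: float      # fraction of modalities missing, in [0, 1]
    error_when_stored: float  # prediction error when the sample was saved
    error_now: float          # prediction error under the current model


def select_for_replay(memory, budget, bin_width=5.0):
    """Pick the `budget` stored samples with the highest replay priority."""
    # Count how many stored samples fall into each score range.
    bins = Counter(int(m.score // bin_width) for m in memory)

    def priority(m):
        # "Which examples did I get wrong because the data was messy?"
        messy = m.error_now * m.missing_ratio
        # "Which examples am I likely to forget?" (error grew since storage)
        forgetting = max(m.error_now - m.error_when_stored, 0.0)
        # "Which examples keep the scoring rules stable?" (rare score ranges)
        rarity = 1.0 / bins[int(m.score // bin_width)]
        return messy + forgetting + rarity

    return sorted(memory, key=priority, reverse=True)[:budget]


bank = [
    MemoryItem("a", score=10.0, missing_ratio=0.0, error_when_stored=0.5, error_now=0.5),
    MemoryItem("b", score=11.0, missing_ratio=0.5, error_when_stored=0.5, error_now=2.0),
    MemoryItem("c", score=18.0, missing_ratio=0.0, error_when_stored=1.0, error_now=1.0),
]
chosen = [m.sample_id for m in select_for_replay(bank, budget=2)]
```

Here sample `b` ranks first because it is both messy and drifting, and `c` ranks second because it is the only example in its score range.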

Why This Matters

The researchers tested BriMA on three different datasets (Rhythmic Gymnastics, Figure Skating, and a large skating dataset). They simulated a world where sensors fail randomly (10%, 25%, or even 50% of the time).
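To make the simulated sensor failures concrete, here is one way such a setup could be written. It is a sketch under assumptions: the modality names follow the judge analogy earlier in this summary, each modality is dropped independently, and at least one modality is always kept. The paper's exact masking protocol may differ.

```python
import random

MODALITIES = ("video", "audio", "text")


def simulate_missing(sample, rate, rng):
    """Return a copy of `sample` with each modality dropped with probability `rate`.

    Dropped modalities are set to None. Keeping at least one modality is an
    assumption of this sketch, so every sample stays scoreable.
    """
    kept = {m: (None if rng.random() < rate else sample[m]) for m in MODALITIES}
    if all(value is None for value in kept.values()):
        survivor = rng.choice(MODALITIES)
        kept[survivor] = sample[survivor]
    return kept


rng = random.Random(0)  # fixed seed so a run is repeatable
routine = {"video": "frames", "audio": "waveform", "text": "commentary"}
for rate in (0.10, 0.25, 0.50):
    corrupted = simulate_missing(routine, rate, rng)
```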

The Results:

  • Higher Accuracy: BriMA's scores were closer to those of human judges than the scores of the methods it was compared against, even when half the data was missing.
  • Less Forgetting: It remembered how to judge old routines while learning new ones.
  • Real-World Ready: Although the sensor failures in these tests were simulated, the results suggest we can build AI systems for sports, rehabilitation, and skill training that won't crash just because a camera flickers or a sensor disconnects.
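"Less forgetting" is usually quantified in continual learning by how far performance on earlier tasks drops by the end of training. The sketch below shows that common measure; whether the paper uses exactly this definition is an assumption, and the numbers are made up for illustration.

```python
def average_forgetting(perf):
    """Average drop on earlier tasks after all tasks have been learned.

    perf[i][j] is the performance on task j measured after training on
    task i (defined for j <= i). Higher is better.
    """
    n = len(perf)
    if n < 2:
        return 0.0
    drops = []
    for j in range(n - 1):
        # Best performance task j ever reached before the final stage.
        best = max(perf[i][j] for i in range(j, n - 1))
        drops.append(best - perf[n - 1][j])
    return sum(drops) / len(drops)


# Illustrative values only: three tasks learned one after another.
history = [
    [0.80],
    [0.70, 0.90],
    [0.60, 0.85, 0.90],
]
forgetting = average_forgetting(history)
```

A value near zero means the model judges old routines about as well as it ever did; a large positive value means it has forgotten them.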

The Bottom Line

BriMA is the difference between a robot that breaks when the lights go out, and a human expert who can still judge the performance by feeling the rhythm and remembering past experiences. It bridges the gap between "perfect data" and "messy reality," making AI scoring reliable for the real world.
