UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization

This paper proposes UniRain, a unified image deraining framework that combines a RAG-based dataset distillation pipeline for selecting high-quality training samples and a multi-objective reweighted optimization strategy within an asymmetric MoE architecture to effectively restore images degraded by diverse rain streaks and raindrops across both daytime and nighttime conditions.

Qianfeng Yang, Qiyuan Guan, Xiang Chen, Jiyu Jin, Guiyue Jin, Jiangxin Dong

Published 2026-03-05

Imagine you are trying to clean a window after a storm. Sometimes it's just a few long streaks of rain in daylight. Sometimes it's a messy splash of droplets (like on a car windshield). And sometimes all of this happens at night, when the scene is dark and the rain is much harder to pick out.

For years, computer scientists built "cleaning robots" that were experts at only one of these specific problems. If you had a robot trained only on daytime streaks, it would fail miserably if you showed it a night scene with raindrops. You'd need a different robot for every single weather scenario, which is slow, expensive, and clumsy.

This paper introduces UniRain, a new "Super Cleaning Robot" that can handle all these messy rain conditions at once. Here is how it works, explained through simple analogies:

1. The Problem: The "Bad Data" Buffet

Imagine you want to teach a chef how to cook a perfect steak.

  • The Old Way: You give the chef a giant buffet containing 2,000,000 plates of food. Some are perfect steaks, but many are burnt, raw, or covered in dirt. Because the buffet is so huge but so messy, the chef gets confused. They might learn to cook the easy stuff (burnt toast) perfectly but fail at the hard stuff (perfect steak) because the bad data distracts them.
  • The UniRain Solution: Instead of feeding the chef the whole messy buffet, UniRain uses a Smart Filter (RAG-based Distillation).
    • Think of this filter as a team of expert food critics (AI models) who look at every single plate in the buffet.
    • They ask: "Is this a real, high-quality rainy scene? Or is it a fake, blurry, or low-quality mess?"
    • They throw away the bad plates and keep only the best, most realistic samples.
    • Result: The chef (the AI model) now learns from a small, high-quality menu instead of a massive, confusing pile of junk. This makes the chef much smarter and more adaptable.
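
To make the "food critics" idea concrete, here is a minimal sketch of the retrieve-then-score step that a retrieval-based filter could use. It is not the paper's pipeline: the function names (`cosine`, `distill`), the toy embeddings, the `top_k` value, and the `threshold` are all assumptions made for illustration. Each candidate image is represented by a hypothetical precomputed feature vector and is kept only if it sits close to a small set of trusted reference images.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def distill(candidates, references, top_k=2, threshold=0.8):
    """Keep candidates that resemble the trusted references.

    candidates: dict of name -> embedding (hypothetical precomputed features)
    references: list of embeddings of trusted, realistic rainy images
    A candidate's score is its mean similarity to its top_k nearest references.
    """
    if not references:
        raise ValueError("at least one reference embedding is required")
    kept = []
    for name, emb in candidates.items():
        # Retrieval step: rank the references by similarity to this candidate.
        sims = sorted((cosine(emb, r) for r in references), reverse=True)
        nearest = sims[:top_k]
        score = sum(nearest) / len(nearest)
        # Filtering step: discard candidates that match nothing trusted.
        if score >= threshold:
            kept.append((name, score))
    # Best-scoring samples first.
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Toy data: three candidate "plates" and two trusted references.
references = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.3]]
candidates = {
    "realistic_streaks": [0.95, 0.05, 0.25],
    "blurry_synthetic": [0.0, 1.0, 0.0],
    "night_raindrops": [0.8, 0.3, 0.3],
}
selected = distill(candidates, references)
selected_names = [name for name, _ in selected]
print(selected_names)
```

In this toy run the off-distribution `blurry_synthetic` sample is dropped and the other two are kept. A real system would replace the hand-written vectors with features from an image encoder and would likely add a model-based quality judgment on top of the similarity score.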

2. The Training: The "Fair Coach"

Now that the chef has good ingredients, they need to learn how to cook them. But here's the catch: some dishes are easy to learn (like frying an egg), while others are very hard (like making a soufflé).

  • The Problem: If you use the same training schedule for everything, the chef gets bored with the easy stuff and stops trying to learn the hard stuff. They become great at eggs but terrible at soufflés.
  • The UniRain Solution: They use a Multi-Objective Reweighted Optimization strategy.
    • Imagine a coach who watches the chef's progress.
    • If the chef is getting really good at "Daytime Rain" (the easy task) too fast, the coach says, "Okay, you're doing great, let's slow down on that and focus more energy on 'Nighttime Raindrops' (the hard task)."
    • The coach constantly adjusts how much attention each task gets, so the chef neither coasts on the easy dishes nor neglects the hard ones. This keeps the learning balanced and strong across all scenarios.
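
The "coach" can be sketched as a rule that turns recent per-task losses into per-task weights. The version below is in the spirit of dynamic weight averaging, a known multi-task technique, and is swapped in for illustration. The paper's exact reweighting rule is not given here, so the function name `reweight`, the task names, the loss values, and the `temperature` parameter are all assumptions.

```python
import math

def reweight(prev_losses, curr_losses, temperature=1.0):
    """Return one weight per task; the weights sum to the number of tasks.

    A task whose loss dropped quickly (small curr/prev ratio) gets a smaller
    weight, and a task that is stalling gets a larger one.
    """
    ratios = {t: curr_losses[t] / prev_losses[t] for t in curr_losses}
    # Softmax over the ratios, scaled so the average weight is 1.
    exps = {t: math.exp(r / temperature) for t, r in ratios.items()}
    total = sum(exps.values())
    n = len(exps)
    return {t: n * e / total for t, e in exps.items()}

# Hypothetical losses from two consecutive epochs.
prev = {"day_streak": 0.40, "night_raindrop": 0.90}
curr = {"day_streak": 0.20, "night_raindrop": 0.85}

weights = reweight(prev, curr)
# The weighted sum is what the optimizer would minimize at the next step.
total_loss = sum(weights[t] * curr[t] for t in curr)
print(weights, total_loss)
```

Here the daytime loss halved while the nighttime loss barely moved, so the nighttime task receives the larger weight. A lower `temperature` makes the coach react more sharply; a higher one keeps the weights closer to equal.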

3. The Brain: The "Specialized Team" (MoE)

Finally, the actual cleaning robot needs a brain architecture that can handle different types of mess.

  • The Old Way: Using one giant brain to try to do everything at once. It's like asking one person to be a painter, a sculptor, and an architect simultaneously. They might get overwhelmed.
  • The UniRain Solution: They built an Asymmetric Mixture-of-Experts (MoE) system.
    • The Encoder (The "Scanner"): This part uses a Soft-MoE. Imagine a team of detectives who all look at the crime scene (the rainy image) and share their thoughts gently. They all contribute a little bit to understand the general vibe of the rain.
    • The Decoder (The "Fixer"): This part uses a Hard-MoE. Imagine a team of specialized repairmen. When they see a specific type of damage (like a big raindrop), they don't ask everyone to help. Instead, they instantly pick the top 1 or 2 experts who are best at fixing that specific problem and let them do the heavy lifting.
    • Result: The robot is flexible enough to understand the whole scene but efficient enough to call in the exact specialist needed to fix the specific problem.
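
The difference between the two routing styles can be shown with a toy example. This is a simplified sketch, not the paper's architecture: the "experts" are plain functions on a single number rather than neural network layers, the gate scores are hard-coded rather than learned, and the names `soft_moe` and `hard_moe` are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_moe(x, experts, gate_scores):
    """Soft routing: every expert runs, and outputs are blended by the gates."""
    gates = softmax(gate_scores)
    return sum(g * expert(x) for g, expert in zip(gates, experts))

def hard_moe(x, experts, gate_scores, k=1):
    """Hard routing: only the k highest-scoring experts run."""
    order = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    top = order[:k]
    # Renormalize the gates over the selected experts only.
    gates = softmax([gate_scores[i] for i in top])
    return sum(g * experts[i](x) for g, i in zip(gates, top))

# Three toy experts acting on a scalar "feature".
experts = [lambda x: x + 1.0, lambda x: x * 2.0, lambda x: x - 3.0]
scores = [0.1, 2.0, -1.0]  # hypothetical router outputs for this input

soft_out = soft_moe(4.0, experts, scores)        # blend of all three experts
hard_out = hard_moe(4.0, experts, scores, k=1)   # only the "x * 2" expert runs
print(soft_out, hard_out)
```

With `k=1` the hard router runs only the top-scoring expert, so the other two cost nothing at inference time. The soft router pays for all three but lets every expert nudge the result, which matches the "everyone shares their thoughts" picture of the encoder.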

Why This Matters

Before UniRain, if you wanted to clean a rainy video for a self-driving car, you might need to switch between different software models depending on whether it was day or night, or whether the problem was rain streaks in the air or raindrops on the lens.

UniRain is the "Swiss Army Knife" of image cleaning.

  1. It filters the junk out of its training data to learn only from the best examples.
  2. It balances its training so it doesn't ignore difficult weather conditions.
  3. It uses a smart team of specialists to fix the image efficiently.

The result, according to the paper, is a single model that can take a messy, rainy, or dark photo and restore a clear image, outperforming the previous "specialized" robots. It's like having one master cleaner who can handle rain streaks and raindrops, day or night.