S2R-HDR: A Large-Scale Rendered Dataset for HDR Fusion

This paper introduces S2R-HDR, a large-scale synthetic dataset of 24,000 high-quality HDR images generated via Unreal Engine 5, along with a domain adaptation method called S2R-Adapter, to overcome data scarcity and improve the generalization of learning-based HDR fusion models.

Yujin Wang, Jiarui Wu, Yichen Bian, Fan Zhang, Tianfan Xue

Published 2026-02-17

Imagine you are trying to teach a robot how to take the perfect photo of a busy city street at sunset. The street is full of moving cars and people walking, the sun is blindingly bright in the sky, and the shadows are pitch black.

To teach the robot, you need to show it thousands of examples of what a "perfect" photo looks like in these tricky situations. But here's the problem: taking real photos like this is a nightmare. You'd need special cameras, perfect weather, and you'd have to freeze time to get the "perfect" shot. It's too expensive, too slow, and often impossible to do perfectly.

This is where the paper "S2R-HDR" comes in. It's like a master chef who decided, "If we can't get enough real ingredients, let's build the best possible fake kitchen."

Here is the story of their solution, broken down into three simple parts:

1. The "Magic Kitchen": S2R-HDR (The Dataset)

Instead of going out to take 24,000 real photos, the researchers built a massive, hyper-realistic video game world using a powerful engine called Unreal Engine 5.

  • The Analogy: Think of this like a video game level designer. They didn't just build one room; they built 24,000 different scenes. They put in moving cars, running dogs, people walking, and even direct sunlight hitting a shiny car.
  • Why it's special: In the real world, if you want to see what a photo looks like with more or less light, you have to take a new picture. In their "Magic Kitchen," they can instantly generate every possible version of a photo (bright, dark, super-bright) because they control the "sun" and the "camera" with a computer (see the sketch after this list).
  • The Result: They created a library of 24,000 perfect training examples. This is huge! Previous libraries only had about 100 to 150 examples. It's like going from teaching a student with one textbook to giving them a whole library.
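To make the "every possible version" idea concrete, here is a minimal Python sketch of how one linear HDR render can be turned into a whole exposure bracket in software. The function name, the simple gamma curve, and the EV values are illustrative assumptions, not the paper's exact rendering pipeline.

```python
import numpy as np

def simulate_ldr_exposure(hdr, ev, gamma=2.2):
    """Simulate one LDR capture of a linear HDR render.

    hdr : linear radiance, float array of shape (H, W, 3)
    ev  : exposure-value offset; +1 doubles the captured light
    """
    exposed = hdr * (2.0 ** ev)           # scale radiance by the exposure
    clipped = np.clip(exposed, 0.0, 1.0)  # sensor saturation: highlights blow out
    return clipped ** (1.0 / gamma)       # simple camera response (gamma curve)

# One rendered HDR frame yields an entire exposure bracket "for free".
hdr = np.random.rand(256, 256, 3) * 8.0  # stand-in for a UE5 render
bracket = [simulate_ldr_exposure(hdr, ev) for ev in (-2, 0, 2)]
```

Because the clipping happens in software rather than in a sensor, the ground-truth HDR frame and every LDR input stay perfectly aligned, which is exactly what real-world capture with moving subjects cannot guarantee.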

2. The "Translator": S2R-Adapter (The Domain Adaptation)

There is a catch. Even though their video game world looks amazing, it's still a simulation. A robot trained only on video games might look at a real tree and think, "That texture looks too smooth; it's fake." This is called the "Domain Gap."

  • The Analogy: Imagine you learned to drive in a driving simulator. You know the rules, but when you get in a real car, the steering feels different, and the tires make real noise. You might crash because the simulator didn't feel real enough.
  • The Solution: The researchers built a little "translator" tool called S2R-Adapter, which has two parts (see the toy sketch after this list):
    • The "Share Branch": This part remembers everything the robot learned in the video game (like "cars move fast," "sun is bright"). It makes sure the robot doesn't forget its training.
    • The "Transfer Branch": This part is the translator. It teaches the robot, "Okay, real trees look a bit rougher, and real wind shakes the leaves differently."
  • The Magic: This tool allows the robot to take its "video game brain" and instantly upgrade it to handle "real world" photos without needing thousands of new real-world photos to relearn everything.
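To show the shape of this two-branch idea in code, here is a toy PyTorch sketch: a frozen "share" branch that keeps the synthetic-data knowledge, plus a small trainable low-rank "transfer" branch that learns the real-world correction. The class name, the LoRA-style low-rank design, and every hyperparameter are my illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class S2RAdapterLayer(nn.Module):
    """Toy two-branch adapter layer (names and design are illustrative)."""

    def __init__(self, dim, rank=4):
        super().__init__()
        # Share branch: pretrained on synthetic data, then frozen so that
        # nothing learned in the "video game" is forgotten.
        self.share = nn.Linear(dim, dim)
        self.share.requires_grad_(False)
        # Transfer branch: a tiny low-rank update that adapts to real data.
        self.down = nn.Linear(dim, rank, bias=False)
        self.up = nn.Linear(rank, dim, bias=False)
        nn.init.zeros_(self.up.weight)  # starts as a no-op correction

    def forward(self, x):
        return self.share(x) + self.up(self.down(x))

layer = S2RAdapterLayer(dim=64)
features = torch.randn(8, 64)
out = layer(features)  # synthetic knowledge + learned real-world shift
```

Because the transfer branch starts at zero, the adapted model initially behaves exactly like the synthetic-trained one and only drifts as far as the real data pushes it.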

3. The "Self-Correcting GPS": Test-Time Adaptation

Sometimes, you don't even have the "answer key" (the perfect photo) to check if the robot is right. You just have the messy real-world photo.

  • The Analogy: Imagine you are driving in a foggy city you've never seen. You don't have a map. But, your car has a smart system that says, "Hey, that looks like a tree, but the fog is making it look weird. Let me adjust my view slightly to see if it's really a tree."
  • How it works: The researchers taught their system to look at a photo, guess what it sees, and then ask itself, "Am I confused?" If the system is confused (high uncertainty), it automatically tweaks its "Translator" (the Adapter) to fit that specific scene better, right on the spot (a minimal version of this loop is sketched below).
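Below is a minimal PyTorch sketch of such a test-time loop. The uncertainty signal here (how much the prediction wobbles under a tiny input perturbation) is a stand-in chosen for illustration; the paper's actual uncertainty measure and update rule may differ, and the model and tensor shapes are hypothetical.

```python
import torch

def test_time_adapt(model, adapter_params, ldr_stack, steps=5, lr=1e-4):
    """Tweak only the adapter weights on one unlabeled scene.

    The loss falls when predictions are stable (the model is "not
    confused"), so minimizing it fits the adapter to this specific scene.
    """
    opt = torch.optim.Adam(adapter_params, lr=lr)
    for _ in range(steps):
        noisy = ldr_stack + 0.01 * torch.randn_like(ldr_stack)
        pred_clean, pred_noisy = model(ldr_stack), model(noisy)
        loss = (pred_clean - pred_noisy).pow(2).mean()  # unstable = uncertain
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Hypothetical usage: three 3-channel exposures stacked into 9 channels.
# In the real method, only the adapter's parameters would be updated;
# here the toy model's full parameter list stands in for them.
model = torch.nn.Sequential(
    torch.nn.Conv2d(9, 16, 3, padding=1),
    torch.nn.ReLU(),
    torch.nn.Conv2d(16, 3, 3, padding=1),
)
ldr_stack = torch.rand(1, 9, 64, 64)
test_time_adapt(model, list(model.parameters()), ldr_stack)
```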

The Grand Finale: Why Does This Matter?

The researchers tested their robot on real-world photos.

  • Without their help: Other robots (trained on small, old datasets) made mistakes. They left "ghosts" behind moving cars or blew out the bright sun, turning it into a white blob.
  • With S2R-HDR: Their robot produced crystal-clear photos. It handled moving cars without ghosts and recovered details in the blinding sun that other robots missed.

In a nutshell:
They realized that taking real photos for training is too hard, so they built a giant, perfect video game world to teach the AI. Then, they built a smart translator to help the AI understand that the real world is a little different from the game. The result? A camera system that can take amazing photos in any crazy, moving, bright, or dark situation, even if it was trained mostly on a computer.
