Real-Time Long Horizon Air Quality Forecasting via Group-Relative Policy Optimization

This paper introduces a real-time long-horizon air quality forecasting framework for East Asia that combines a newly released high-resolution dataset with Group-Relative Policy Optimization (GRPO) to significantly reduce false alarm rates and align predictions with operational public health priorities.

Inha Kang, Eunki Kim, Wonjeong Ryu, Jaeyo Shin, Seungjun Yu, Yoon-Hee Kang, Seongeun Jeong, Eunhye Kim, Soontae Kim, Hyunjung Shim

Published 2026-03-24

The Big Problem: The "Crystal Ball" That Cracks

Imagine you are a mayor trying to protect your city from a toxic fog (pollution). You need to know 48 to 120 hours (2 to 5 days) in advance if the air will be safe or deadly.

Currently, the "super-forecasters" (global AI models like Aurora) are like a weatherman who has seen the whole world but never visited your specific town.

  • The Issue: They are great at general patterns but terrible at local details. In East Asia, where the terrain is complex and pollution is intense, these global models often miss the mark.
  • The Consequence: They either say "It's going to be a disaster!" when it's actually fine (causing panic and loss of trust), or they say "It's fine!" when a toxic cloud is actually rolling in (putting people's health at risk).

The Solution: FAKER-Air

The researchers built a new system called FAKER-Air. Think of it as training a local expert who knows your neighborhood better than anyone else, using a special two-step training method.

Step 1: The "Local Map" (The Dataset)

Before teaching the AI how to predict, they had to give it the right map.

  • The Old Map: Global models used data that was 5 days old and averaged out over huge areas. It was like trying to navigate a city using a map of the whole continent.
  • The New Map (CMAQ-OBS): The team created a brand new, high-definition map specifically for East Asia. They combined real-time ground sensors (like having a weather station on every street corner) with a physics-based simulation (a super-accurate computer model of how wind and pollution move).
  • The Result: This new map is 60% more accurate than the old global maps. It's like upgrading from a blurry satellite photo to a live 4K drone feed of your city.

Step 2: The "Two-Stage Training" (The Coach)

Now, they had to teach the AI how to use this map without making the same mistakes as the global models. They used a two-stage coaching method:

Stage 1: The "Drill Sergeant" (Supervised Fine-Tuning)

  • The Problem: If you teach a student to solve a math problem step-by-step, but only show them the correct answer at every step, they will fail when they have to solve it alone. In AI, this is called "Exposure Bias." If the AI makes a tiny mistake at hour 1, it gets worse and worse by hour 100.
  • The Fix: The researchers made the AI practice rolling out its own predictions. Instead of just looking at the correct answer, the AI had to predict hour 1, then use its own prediction to guess hour 2, and so on.
  • The Analogy: It's like teaching a driver not just to look at the road, but to steer the car, then look at where they steered, and then steer again. This stops the AI from falling apart after a few hours.
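
To make the difference concrete, here is a minimal sketch in plain Python. It is not the paper's training code: the one-number "model", the 10% decay rate, and the function names are all made up for illustration. It only shows why errors stay small when the model is always handed the correct previous hour, but snowball when it has to build on its own guesses.

```python
from typing import Callable, List

Step = Callable[[float], float]  # hypothetical one-hour-ahead model

def teacher_forced_errors(step: Step, truth: List[float]) -> List[float]:
    """Error at each hour when the model always starts from the TRUE previous value."""
    return [abs(step(truth[t]) - truth[t + 1]) for t in range(len(truth) - 1)]

def rollout_errors(step: Step, truth: List[float]) -> List[float]:
    """Error at each hour when the model feeds on its OWN previous prediction."""
    errors = []
    state = truth[0]  # only the starting point comes from real data
    for t in range(len(truth) - 1):
        state = step(state)  # the next input is the model's own output
        errors.append(abs(state - truth[t + 1]))
    return errors

# Toy "truth": pollution decays by 10% per hour (made-up numbers).
truth = [100.0]
for _ in range(10):
    truth.append(truth[-1] * 0.9)

def toy_model(x: float) -> float:
    """A slightly wrong model: it assumes 8% decay instead of 10%."""
    return 0.92 * x

tf_errors = teacher_forced_errors(toy_model, truth)
ro_errors = rollout_errors(toy_model, truth)

# A rollout-style loss averages the errors along the model's own trajectory,
# so late-hour drift is penalized during training instead of being hidden.
rollout_loss = sum(ro_errors) / len(ro_errors)
```

In this toy setup both error lists start out identical at hour 1, but by hour 10 the rollout error is many times larger than the teacher-forced one. Training against the rollout errors is what exposes that drift to the model.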

Stage 2: The "Smart Coach" (Group-Relative Policy Optimization - GRPO)

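  • The Problem: Even with good practice, the AI was still too "nervous." It kept predicting "Disaster!" even when the air was clean. Why? In the real world, different mistakes carry different costs: a missed disaster harms people's health, and a false alarm erodes trust. Standard training math only measures how far each predicted number is from the truth, so it treats all errors the same and never tells the AI which mistakes matter.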
  • The Fix: They introduced a new way of grading called GRPO.
    • How it works: Instead of asking the AI to give one answer, they ask it to generate five different possible futures for the same day.
    • The Comparison: The training procedure then checks each of these five futures against what actually happened and compares them with one another. "Hey, Scenario A said it would be clean, and it was. Scenario B said it would be toxic, but it wasn't. Scenario B was a false alarm!"
    • The Reward: The AI gets a "gold star" for the scenarios that matched reality and a "time-out" for the ones that caused false alarms.
    • The Analogy: Imagine a coach watching a player practice penalty kicks. Instead of just saying "Good shot," the coach says, "You tried 5 shots. Three went wide (false alarms), one hit the post (missed event), and one went in (correct). Let's focus on hitting the target without wasting energy on shots that clearly miss."
    • The Result: The AI learns to be confident but cautious. It stops crying wolf (false alarms) but still screams when a real tiger is coming (severe pollution).
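
The grading idea above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's reward design: the reward values (`hit`, `correct_clear`, `false_alarm`, `miss`) are hypothetical, and the scoring step uses the commonly described group-relative form, where each candidate's reward is compared with the average of its own group.

```python
from statistics import mean, pstdev
from typing import List

def event_reward(predicted_event: bool, observed_event: bool,
                 hit: float = 1.0, correct_clear: float = 0.5,
                 false_alarm: float = -1.0, miss: float = -2.0) -> float:
    """Score one forecast. All four reward values are hypothetical."""
    if predicted_event and observed_event:
        return hit            # warned, and pollution really was severe
    if not predicted_event and not observed_event:
        return correct_clear  # said "clean", and it was
    if predicted_event and not observed_event:
        return false_alarm    # cried wolf
    return miss               # stayed silent while a real event happened

def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Compare each candidate with its own group: (reward - group mean) / group std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Five candidate futures for a day that turned out CLEAN (observed_event=False).
predicted = [False, True, False, True, False]
rewards = [event_reward(p, observed_event=False) for p in predicted]
advantages = group_relative_advantages(rewards)
```

Here the three "clean" candidates end up with positive advantages (the gold stars) and the two false alarms with negative ones (the time-outs). Because the scores are relative to the group, no separate judge model is needed to decide what counts as a good forecast.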

The "Curriculum" (Learning to Walk Before Running)

One final trick they used was Curriculum Rollout.

  • The Idea: You wouldn't ask a baby to run a marathon on day one.
  • The Method: They started by teaching the AI to predict just 6 hours ahead. Once it got good at that, they extended it to 12 hours, then 24, all the way up to 120 hours.
  • Why: This prevents the AI from getting overwhelmed by the complexity of a 5-day forecast before it has mastered the basics.
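
A curriculum like this can be sketched as a simple schedule. The summary above only gives the first steps (6, 12, 24) and the end point (120), so the doubling rule for the steps in between, the fixed number of epochs per stage, and both function names are assumptions made for illustration.

```python
from typing import List

def curriculum_horizons(start_hours: int = 6, max_hours: int = 120) -> List[int]:
    """Forecast horizons to train on, in order. Doubling is an assumed rule."""
    horizons = []
    h = start_hours
    while h < max_hours:
        horizons.append(h)
        h *= 2
    horizons.append(max_hours)  # always finish at the full 5-day horizon
    return horizons

def horizon_for_epoch(epoch: int, horizons: List[int], epochs_per_stage: int = 2) -> int:
    """Pick the rollout length for a given epoch (fixed stage length is assumed)."""
    stage = min(epoch // epochs_per_stage, len(horizons) - 1)
    return horizons[stage]

horizons = curriculum_horizons()
schedule = [horizon_for_epoch(e, horizons) for e in range(12)]
```

A real setup might instead move to the next horizon only once validation error stops improving, which is closer to "once it got good at that" than a fixed epoch count.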

The Final Scorecard

When they tested FAKER-Air against the global champion (Aurora):

  • False Alarms: Dropped by 47%. (The AI cries wolf far less often, which makes its warnings easier to trust.)
  • Accuracy: Improved significantly, especially for long-term forecasts (2 to 5 days out).
  • Reliability: It successfully predicted complex pollution events that the global models completely missed, including pollution traveling across borders.
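
For readers curious how a false-alarm score is typically computed, here is a small sketch. It uses one common definition, false alarms divided by all issued warnings. Whether the paper uses exactly this definition is an assumption, and the threshold of 35 and all sample values are made up for illustration.

```python
from typing import Dict, List

def contingency_counts(predicted: List[float], observed: List[float],
                       threshold: float) -> Dict[str, int]:
    """Count hits, misses, false alarms, and correct negatives at a threshold."""
    counts = {"hits": 0, "misses": 0, "false_alarms": 0, "correct_negatives": 0}
    for p, o in zip(predicted, observed):
        warned, happened = p >= threshold, o >= threshold
        if warned and happened:
            counts["hits"] += 1
        elif warned and not happened:
            counts["false_alarms"] += 1
        elif happened:
            counts["misses"] += 1
        else:
            counts["correct_negatives"] += 1
    return counts

def false_alarm_ratio(counts: Dict[str, int]) -> float:
    """Share of issued warnings that turned out to be wrong."""
    warnings = counts["hits"] + counts["false_alarms"]
    return counts["false_alarms"] / warnings if warnings else 0.0

# Made-up pollution readings for six days; the threshold is hypothetical.
observed = [10.0, 40.0, 20.0, 50.0, 15.0, 30.0]
predicted = [12.0, 45.0, 38.0, 30.0, 36.0, 20.0]
counts = contingency_counts(predicted, observed, threshold=35.0)
far = false_alarm_ratio(counts)
```

In this toy example the forecast issued three warnings and two of them were wrong, so the ratio is 2/3. A lower value means fewer wasted warnings.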

In a Nutshell

The world's best global AI models are like general practitioners who know a little about everything but aren't great at your specific local problem. FAKER-Air is a specialist doctor who:

  1. Has a local map (CMAQ-OBS dataset) specific to East Asia.
  2. Practices building on its own predictions (Temporal Accumulation) so small mistakes don't snowball.
  3. Learns from comparing multiple guesses (GRPO) to understand that "crying wolf" is bad, but "missing a tiger" is worse.

This creates a system that is ready for real-world use, helping governments issue timely, trustworthy warnings to protect public health.
