Efficient Policy Learning with Hybrid Evaluation-Based Genetic Programming for Uncertain Agile Earth Observation Satellite Scheduling

This paper proposes a Hybrid Evaluation-based Genetic Programming (HE-GP) framework for the Uncertain Agile Earth Observation Satellite Scheduling Problem. HE-GP dynamically switches between exact and approximate evaluation modes within an Online Scheduling Algorithm, which significantly reduces computational cost while maintaining superior scheduling performance compared to existing methods.

Junhua Xue, Yuning Chen

Published 2026-03-10

Imagine you are the captain of a high-tech, super-fast camera drone (an Agile Earth Observation Satellite) orbiting the Earth. Your job is to take pictures of specific locations (like farms, cities, or disaster zones) to earn "profit points."

However, there are three big problems making your job a nightmare:

  1. The Weather is Unpredictable: Sometimes clouds block the view, or the camera quality drops, meaning you might not get the full profit you expected.
  2. The Battery and Memory are Limited: You can only take so many photos before you run out of space or power.
  3. You Have to Move Fast: To take a picture of one spot, you have to twist and turn your drone. Turning takes time and energy, and you can't just snap a photo instantly.

Your goal is to decide which photos to take and when to turn the drone to maximize your total profit, all while dealing with these uncertainties.

The Old Way: The "Perfect" but Slow Planner

Traditionally, scientists tried to solve this by writing a computer program that checks every single possibility perfectly before making a move.

  • The Problem: This is like trying to plan your entire vacation by checking the weather, traffic, and hotel prices for every single second of the day, for every possible route. It's so slow that by the time you finish planning, the vacation is over. In the satellite world, this "perfect" calculation takes too much computer power, and the satellite's brain isn't strong enough to handle it.

The New Idea: The "Evolving Coach" (Genetic Programming)

Instead of a rigid planner, the researchers used a method called Genetic Programming Hyper-Heuristic (GPHH). Think of this as a team of coaches trying to invent the best rulebook for the drone.

  • They start with a bunch of random rulebooks (e.g., "Always take the closest photo" or "Take the photo with the highest potential profit").
  • They test these rulebooks. The ones that earn more points survive.
  • They mix the best rulebooks together (like breeding) to create new, smarter rulebooks.
  • Over time, the team evolves a "Super Coach": a rulebook that makes good decisions across a wide range of situations.
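
The test, keep, and mix cycle above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the paper's algorithm: real genetic programming evolves expression trees, whereas this sketch swaps in a pair of weights per rulebook to stay short. The task list, time budget, and the names `simulate` and `evolve` are all hypothetical.

```python
import random

# Hypothetical toy tasks as (profit, turn_time); not data from the paper.
TASKS = [(8, 3), (5, 1), (9, 4), (4, 1), (7, 2), (6, 2)]
TIME_BUDGET = 8


def simulate(rule):
    """Walk tasks in priority order, taking each one that still fits the time budget."""
    w_profit, w_time = rule
    ordered = sorted(TASKS, key=lambda t: w_profit * t[0] - w_time * t[1], reverse=True)
    time_left, total = TIME_BUDGET, 0
    for profit, turn_time in ordered:
        if turn_time <= time_left:
            time_left -= turn_time
            total += profit
    return total


def evolve(pop_size=20, generations=15, seed=0):
    rng = random.Random(seed)
    # Start with random rulebooks (here: random weight pairs).
    population = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(pop_size)]
    history = []  # best score seen in each generation
    for _ in range(generations):
        ranked = sorted(population, key=simulate, reverse=True)
        history.append(simulate(ranked[0]))
        survivors = ranked[: pop_size // 2]  # the better half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = (a[0], b[1])  # crossover: one weight from each parent
            child = tuple(w + rng.gauss(0, 0.1) for w in child)  # small mutation
            children.append(child)
        population = survivors + children
    return max(population, key=simulate), history


best_rule, history = evolve()
print("best score per generation:", history)
```

Because the best rulebooks always survive into the next generation, the best score in this sketch can never get worse from one generation to the next.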

The Catch: To test a rulebook, the computer has to simulate the whole day's flight. Doing this perfectly for every single rulebook takes forever.

The Breakthrough: The "Hybrid Evaluation" (HE-GP)

This is where the paper's main innovation comes in. The researchers realized they didn't need to be perfectly accurate every single time they tested a rulebook. They created a Hybrid Evaluation system, which is like having a Smart Coach with two modes:

  1. The "Rough Draft" Mode (Approximate):

    • When to use it: When the team is just starting out and trying out wild, crazy ideas.
    • How it works: The coach does a quick, "good enough" check. "Hey, that photo looks promising, let's keep it for now." It skips the heavy math to save time.
    • Analogy: It's like skimming a menu to see what looks tasty, rather than reading every ingredient list.
  2. The "Final Exam" Mode (Exact):

    • When to use it: When the team has found some really good rulebooks and needs to pick the absolute winner.
    • How it works: The coach does the full, detailed, math-heavy check to make sure every photo the rulebook schedules is actually feasible.
    • Analogy: It's like reading the fine print on the contract before signing the deal.
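
To make the two modes concrete, here is a minimal sketch that scores the same photo sequence both ways. Everything in it is an assumption for illustration: the `Task` fields, the `transition_time` model (a fixed settle time plus slewing at 2 degrees per second), and the choice to approximate by replacing the turn calculation with a constant. The summary does not say what the paper's approximate mode actually skips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    profit: float
    window_start: float  # earliest time the photo can begin
    window_end: float    # latest time the photo must be finished
    duration: float
    angle: float         # pointing angle in degrees
    memory: float


def transition_time(angle_from, angle_to):
    """Hypothetical turn model: 5 s settle time plus slewing at 2 degrees per second."""
    return 5.0 + abs(angle_from - angle_to) / 2.0


def evaluate(order, memory_cap, exact):
    """Total profit of the tasks accepted when visiting them in the given order."""
    clock, angle, used, profit = 0.0, 0.0, 0.0, 0.0
    for task in order:
        # Exact mode computes the real turn; approximate mode assumes a cheap constant.
        gap = transition_time(angle, task.angle) if exact else 5.0
        begin = max(clock + gap, task.window_start)
        fits_window = begin + task.duration <= task.window_end
        fits_memory = used + task.memory <= memory_cap
        if fits_window and fits_memory:
            clock, angle = begin + task.duration, task.angle
            used += task.memory
            profit += task.profit
    return profit


plan = [
    Task(10, 0, 20, 5, 10, 2),
    Task(8, 0, 20, 5, 40, 2),
    Task(6, 0, 40, 5, 0, 2),
]
print(evaluate(plan, memory_cap=5, exact=True))   # 16.0
print(evaluate(plan, memory_cap=5, exact=False))  # 18.0
```

On this toy plan the quick check is too optimistic: it believes the second photo fits because it ignores the long turn, so it reports 18 points where the detailed check reports 16. That gap is the price paid for speed, and it is why a detailed check is still needed before picking a winner.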

The Magic Switch:
The system decides when to switch by watching how the team of rulebooks is doing.

  • If the team is diverse and exploring new ideas, it uses the Rough Draft mode to speed things up.
  • If the team is stuck or getting very similar results, it switches to the Final Exam mode to make sure they aren't missing anything important.
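
A rough sketch of such a switch, assuming diversity is measured as the share of distinct scores in the current population. Both that measure and the 0.5 threshold are hypothetical choices made for this example; the summary does not give the paper's actual criterion.

```python
def choose_mode(scores, threshold=0.5):
    """Pick an evaluation mode from how varied the population's scores are."""
    if not scores:
        raise ValueError("scores must not be empty")
    # Diversity here = fraction of distinct scores (an assumed, simplistic measure).
    distinct = len({round(s, 6) for s in scores})
    diversity = distinct / len(scores)
    # Diverse team: keep exploring cheaply. Similar team: check carefully.
    return "approximate" if diversity >= threshold else "exact"


print(choose_mode([12.0, 15.5, 9.0, 20.0]))   # approximate
print(choose_mode([18.0, 18.0, 18.0, 18.0]))  # exact
```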

Why This Matters

The researchers tested this new "Hybrid Coach" against old methods and found:

  • It's Faster: It cut the training time by about 18%. That's a huge deal when you are dealing with complex space math.
  • It's Smarter: Because it didn't get bogged down in slow calculations, it could explore more ideas and find better solutions than the "perfect but slow" methods.
  • It's Understandable: Unlike some modern AI that acts like a "black box" (you can't tell why it made a decision), this system evolves simple math formulas. You can actually read the rulebook and say, "Ah, I see! It prioritizes photos with high profit and low memory usage."
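
To show what a readable rulebook might look like, here is a made-up example in the spirit of "high profit, low memory usage." The formula and the candidate photos are invented for illustration and are not rules or data from the paper.

```python
def priority(profit, memory):
    """Hypothetical evolved rule: favor high profit per unit of memory."""
    return profit / memory


# Invented candidates as (profit, memory needed).
candidates = {"farm": (6.0, 3.0), "city": (8.0, 2.0), "flood_zone": (9.0, 6.0)}
ranked = sorted(candidates, key=lambda name: priority(*candidates[name]), reverse=True)
print(ranked)  # ['city', 'farm', 'flood_zone']
```

Anyone can read this rule and see why the city photo wins: it earns 4 points per unit of memory, against 2 for the farm and 1.5 for the flood zone.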

The Bottom Line

This paper is about teaching a satellite how to make quick, smart decisions in a chaotic, uncertain world. By using a "smart switch" between quick guesses and detailed checks, the researchers created a system that learns faster and performs better, helping our satellites take better photos of Earth without needing a supercomputer the size of a building to do the math.