Adaptive Personalized Federated Reinforcement Learning for RIS-Assisted Aerial Relays in SAGINs with Fluid Antennas

This paper proposes an adaptive personalized federated reinforcement learning algorithm that optimizes UAV trajectories and RIS phase control to maximize downlink rates in dynamic, heterogeneous space-air-ground integrated networks (SAGINs) combining LEO satellites, RIS-assisted UAV relays, and fluid antenna systems.

Yuxuan Yang, Bin Lyu, Abbas Jamalipour

Published 2026-03-06

Imagine a future where your phone never loses signal, whether you are on a mountain peak, in a crowded city, or out at sea. This is the goal of SAGINs (Space-Air-Ground Integrated Networks). Think of it as a three-layered internet:

  1. Space: Satellites zooming overhead like a swarm of bees.
  2. Air: Drones (UAVs) acting as flying cell towers.
  3. Ground: Your devices and the people using them.

The problem? The world is messy. Clouds block signals, buildings reflect them weirdly, and every group of people (a "hotspot") has different needs. Some people have super-advanced antennas that can "wiggle" to find the best signal (called Fluid Antennas), while others have standard ones.

This paper proposes a smart way to manage this chaos using AI, flying drones, and mirrors. Here is the breakdown in simple terms:

1. The Cast of Characters

  • The Satellites (The Bosses): They are far away and moving fast. They can't talk to everyone directly because clouds or buildings get in the way. They act as the "Global Brain."
  • The Drones (The Messengers): These are flying cell towers hovering over specific neighborhoods. They carry a special RIS (Reconfigurable Intelligent Surface).
    • Analogy: Think of the RIS as a smart mirror on the drone. It can catch the satellite's signal and bounce it perfectly toward the ground, avoiding obstacles.
  • The Users (The Crowd): Some have "Fluid Antennas" (smart devices that can shift their internal parts to catch the best signal), while others have regular antennas. They are all in different neighborhoods with different crowd sizes and layouts.
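
To make the "smart mirror" and "wiggling antenna" ideas concrete, here is a minimal Python sketch. It is not the paper's channel model: it assumes a single user, made-up random channels, and hypothetical helper names (`aligned_ris_phases`, `received_amplitude`, `best_port`). Each mirror element rotates its reflection so that all bounced paths arrive in step, and the fluid antenna simply picks the port with the strongest gain.

```python
import cmath
import math
import random

def aligned_ris_phases(h_sat_to_ris, h_ris_to_user):
    """Pick each element's phase shift so every reflected path adds in phase."""
    return [-cmath.phase(a * b) for a, b in zip(h_sat_to_ris, h_ris_to_user)]

def received_amplitude(h_sat_to_ris, h_ris_to_user, phases):
    """Magnitude of the sum of all reflected paths for the given phase shifts."""
    total = sum(a * b * cmath.exp(1j * p)
                for a, b, p in zip(h_sat_to_ris, h_ris_to_user, phases))
    return abs(total)

def best_port(port_gains):
    """Fluid antenna: return the index of the port with the strongest gain."""
    return max(range(len(port_gains)), key=lambda i: port_gains[i])

rng = random.Random(0)
n_elements = 16  # hypothetical number of mirror elements

def random_channel(n):
    """Made-up channel: random strength and random phase per element."""
    return [cmath.rect(rng.uniform(0.5, 1.0), rng.uniform(-math.pi, math.pi))
            for _ in range(n)]

h1 = random_channel(n_elements)  # satellite -> mirror
h2 = random_channel(n_elements)  # mirror -> user
random_phases = [rng.uniform(-math.pi, math.pi) for _ in range(n_elements)]

amp_random = received_amplitude(h1, h2, random_phases)
amp_aligned = received_amplitude(h1, h2, aligned_ris_phases(h1, h2))
chosen_port = best_port([0.2, 0.9, 0.4])  # made-up gains for three ports
```

With aligned phases, the combined amplitude equals the sum of the individual path strengths, which is the most any phase setting can achieve. Random phases generally fall short of that.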

2. The Big Problem: "One Size Does Not Fit All"

In the past, AI tried to teach all drones the exact same rules. But that's like trying to teach a surfer in Hawaii and a skier in Alaska the exact same moves using one manual. It doesn't work because the environments are too different.

  • If the AI is too generic, it fails in specific neighborhoods.
  • If every drone learns alone, they are slow and inefficient because they don't share what they learn.

3. The Solution: "Personalized Federated Learning"

The authors created a new AI system called FedPG-AP. Let's break down the name with an analogy:

  • Federated Learning (The Group Study): Imagine a group of students (drones) preparing for a test. Instead of handing all their raw work to a central teacher (which takes too long and leaks privacy), they keep it private and send only their study notes (the AI model) to a central server. The server mixes all the notes to create a "Master Study Guide" and sends it back.
  • Personalized (The Customization): This is the magic part. The "Master Study Guide" is too general. So, the system allows each student to keep the parts of the guide that work for their specific subject and swap out the parts that don't.
    • Analogy: If Drone A is over a dense city with tall buildings, it keeps the "City Navigation" chapters from the Master Guide but replaces the "Open Field" chapters with its own local experience. If Drone B is over a beach, it does the opposite.
  • Adaptive (The Chameleon): The system is smart enough to know when to customize. If a drone is struggling, it leans more on its own experience. If it's doing well, it leans more on the group's wisdom. It constantly adjusts the balance.
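
The three ideas above can be sketched in a few lines of Python. This is a simplified illustration, not the paper's FedPG-AP update: models are plain lists of numbers, the blend uses a single weight `alpha` for the whole model rather than swapping individual "chapters," and `adaptive_alpha` is a hypothetical rule that raises the local weight when a drone's recent reward falls below a baseline.

```python
import math

def federated_average(local_models):
    """Server step: average the drones' parameter vectors element by element."""
    n = len(local_models)
    return [sum(values) / n for values in zip(*local_models)]

def adaptive_alpha(recent_reward, baseline_reward, sharpness=1.0):
    """Hypothetical rule: the further a drone falls below the baseline reward,
    the closer alpha gets to 1 (lean on local experience)."""
    gap = baseline_reward - recent_reward
    return 1.0 / (1.0 + math.exp(-sharpness * gap))

def personalize(local_model, global_model, alpha):
    """Blend the models: alpha=1 keeps the local one, alpha=0 adopts the global one."""
    return [alpha * l + (1.0 - alpha) * g
            for l, g in zip(local_model, global_model)]

# Toy example: two drones with two parameters each (made-up numbers).
city_drone = [1.0, 4.0]
beach_drone = [3.0, 0.0]
master_guide = federated_average([city_drone, beach_drone])

# A struggling drone (reward below baseline) gets a larger local weight.
alpha_struggling = adaptive_alpha(recent_reward=0.0, baseline_reward=2.0)
alpha_thriving = adaptive_alpha(recent_reward=2.0, baseline_reward=0.0)

city_struggling = personalize(city_drone, master_guide, alpha_struggling)
city_thriving = personalize(city_drone, master_guide, alpha_thriving)
```

Here the struggling city drone stays close to its own parameters, while the thriving one moves most of the way toward the Master Study Guide.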

4. How It Works in Real Life

  1. The Setup: A satellite beams a signal to a drone. The drone uses its "smart mirror" (RIS) to bounce the signal down to users.
  2. The Decision: The drone has to decide: Where should I fly? How should I angle my mirror? Which port should the Fluid Antenna user pick?
  3. The Learning:
    • The drone tries a move. If the internet speed goes up, it gets a "reward."
    • It updates its own "brain" (local model).
    • It sends its brain updates to the satellite (the server).
    • The satellite mixes everyone's brains to make a better "Global Brain."
    • The satellite sends the Global Brain back, but the drone personalizes it immediately to fit its specific neighborhood before trying again.
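
The learning loop above can be sketched as a toy Python program. Everything here is an assumption made for illustration: each drone chooses among three discrete moves, the reward is 1.0 for a neighborhood's best move and 0.0 otherwise, the local learner is a basic REINFORCE policy-gradient update, and the personalization weight `alpha` is fixed rather than adaptive. The summary does not spell out the paper's exact update rules, so treat this as a sketch of the general pattern only.

```python
import math
import random

def softmax(logits):
    """Turn raw preferences into action probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def make_reward(best_action):
    """Toy neighborhood: reward 1.0 for its best action, 0.0 otherwise."""
    return lambda action: 1.0 if action == best_action else 0.0

def local_update(logits, reward_fn, rng, lr=0.1, episodes=20):
    """Try moves and apply REINFORCE updates to the drone's local model."""
    logits = list(logits)
    baseline = 0.0  # running average reward, used to reduce update noise
    for _ in range(episodes):
        probs = softmax(logits)
        action = rng.choices(range(len(logits)), weights=probs)[0]
        advantage = reward_fn(action) - baseline
        baseline += 0.1 * advantage
        for a in range(len(logits)):
            grad_log_prob = (1.0 if a == action else 0.0) - probs[a]
            logits[a] += lr * advantage * grad_log_prob
    return logits

def train(reward_fns, n_actions, rounds=30, alpha=0.7, seed=0):
    """Alternate local learning, server averaging, and personalization."""
    rng = random.Random(seed)
    models = [[0.0] * n_actions for _ in reward_fns]
    for _ in range(rounds):
        # Each drone learns locally from its own rewards.
        models = [local_update(m, fn, rng) for m, fn in zip(models, reward_fns)]
        # The server averages the uploaded models into a global one.
        global_model = [sum(vals) / len(models) for vals in zip(*models)]
        # Each drone blends the global model with its own before the next round.
        models = [[alpha * l + (1.0 - alpha) * g for l, g in zip(m, global_model)]
                  for m in models]
    return models

reward_fns = [make_reward(0), make_reward(2)]  # two different neighborhoods
final_models = train(reward_fns, n_actions=3)
preferred = [max(range(3), key=lambda a: m[a]) for m in final_models]
```

In this toy setup each drone should end up preferring its own neighborhood's best move while still borrowing from the shared average. Setting `alpha=0.0` would force both drones onto the same averaged model, which cannot favor both neighborhoods at once.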

5. The Results

The authors tested this in computer simulations (like a video game).

  • Without Personalization: The drones were confused and unstable. Sometimes they flew into bad spots.
  • With Fixed Personalization: They were stable but slow to learn new tricks.
  • With Their New "Adaptive Personalized" System: The drones learned the fastest, stayed the most stable, and gave the highest internet speeds to everyone, even when the environment was chaotic.

The Takeaway

This paper is about teaching a fleet of flying drones to be smart and cooperative, yet individually adaptable. By using a "Group Study" method where everyone shares what they know but keeps their own "specialty," the drones delivered faster and more stable connections in the authors' simulations, even in messy, fast-changing environments. It's the difference between a rigid robot army and a team of flexible, intelligent teammates.