Cooperative Deep Reinforcement Learning for Fair RIS Allocation

This paper proposes a fairness-aware collaborative multi-agent reinforcement learning framework that utilizes a simultaneous ascending auction mechanism to dynamically allocate shared reconfigurable intelligent surfaces (RISs) among base stations, effectively balancing overall network throughput with improved service rates for the worst-performing users.

Martin Mark Zan, Stefan Schwarz

Published 2026-03-27

Imagine a bustling city where two major pizza delivery hubs (Base Stations) are trying to feed hungry customers (Users). One hub is in a wealthy, low-density neighborhood with few customers, while the other is in a crowded, high-density district with three times as many people. Naturally, the crowded hub is overwhelmed, and its customers are waiting a long time for their food.

To fix this, the city installs Reconfigurable Intelligent Surfaces (RIS). Think of these RISs as smart, magical mirrors placed along the street. These mirrors can catch a signal from a pizza hub and bounce it perfectly to a customer's window, even if the direct line of sight is blocked by buildings.

However, there's a problem: there are only 10 mirrors, and they sit right on the border between the two neighborhoods. Both pizza hubs want them because they make deliveries faster. If the crowded hub doesn't get enough mirrors, its customers starve. If the quiet hub gets too many, mirrors are wasted where they are not needed.

The Solution: A Smart Auction with a "Fairness Coach"

The authors of this paper propose a system to solve this using two main ingredients: Auctions and AI Learning.

1. The Auction (The Marketplace)

Instead of a central boss deciding who gets which mirror, the hubs participate in a simultaneous auction.

  • The price of a mirror goes up in small steps.
  • Both hubs bid on the mirrors they want.
  • If only one hub bids on a mirror, it wins that mirror. If both bid, the price keeps rising until it is too high for one of them and that hub drops out.
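
The bidding rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name ascending_auction, the per-mirror valuations, and the price step are all hypothetical, and each hub is assumed to keep bidding on a mirror only while its private value exceeds the current price.

```python
def ascending_auction(valuations, step=1.0):
    """Toy simultaneous ascending auction (illustrative assumption only).

    valuations: dict mapping hub name -> list of private values, one per mirror.
    step: positive price increment applied to every contested mirror.
    """
    hubs = list(valuations)
    n_mirrors = len(valuations[hubs[0]])
    prices = [0.0] * n_mirrors
    winners = [None] * n_mirrors
    open_mirrors = set(range(n_mirrors))

    # All mirrors are auctioned in parallel, round by round.
    while open_mirrors:
        for m in list(open_mirrors):
            # A hub bids only while the mirror is worth more than its price.
            bidders = [h for h in hubs if valuations[h][m] > prices[m]]
            if len(bidders) >= 2:
                prices[m] += step  # contested: raise the price
            else:
                # One bidder left wins; an exact tie leaves the mirror unassigned.
                winners[m] = bidders[0] if bidders else None
                open_mirrors.remove(m)
    return winners, prices


# Made-up valuations for three mirrors and two hubs.
winners, prices = ascending_auction({"A": [5, 1, 3], "B": [2, 4, 3.5]})
print(winners)  # ['A', 'B', 'B']
print(prices)   # [2.0, 1.0, 3.0]
```

Note that the winner pays roughly the price at which the other hub gave up, not its own full valuation, which is the usual behavior of an ascending auction.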

2. The Problem with Standard Auctions

In a normal auction, the hub with more money or a better strategy might win everything. In our pizza analogy, the quiet hub might win all the mirrors just because it's "richer" in the moment, leaving the crowded hub with nothing. This is efficient (every mirror is used) but unfair (some customers go hungry).

3. The AI "Fairness Coach" (Cooperative Deep Reinforcement Learning)

This is where the paper's magic happens. The authors teach the two pizza hubs to act like smart, cooperative agents using Artificial Intelligence (Deep Reinforcement Learning).

Here is how the AI learns to be fair:

  • The "Fairness Weight": Before every round of bidding, a central computer looks at how well each hub is doing. If Hub A (the crowded one) is struggling, the computer gives it a "Fairness Boost" (a special weight).
  • The Strategy Change: The AI learns that if it is the struggling hub, it should bid more aggressively because the system wants it to win. If it is the already-successful hub, the AI learns to be more conservative, realizing that "winning" isn't as critical as helping the other guy.
  • No Talking Required: The hubs don't need to call each other on the phone to coordinate. They just look at the "Fairness Weight" provided by the auctioneer and adjust their bidding strategy automatically.
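
To make the "Fairness Weight" idea concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the summary does not give the paper's weight formula, so the rule below (weight proportional to the inverse of a hub's worst-user rate, raised to the power gamma) and the names fairness_weights and weighted_bid_limit are hypothetical. In the paper the bidding behavior is learned with deep reinforcement learning; this sketch swaps in a fixed, hand-written rule only to show the direction of the effect.

```python
def fairness_weights(min_rates, gamma):
    """Map each hub's worst-user rate to a normalized fairness weight.

    Hypothetical rule: weight is proportional to (1 / rate) ** gamma.
    gamma = 0 ignores fairness; larger gamma favors the struggling hub.
    Rates are assumed to be strictly positive.
    """
    raw = {hub: (1.0 / rate) ** gamma for hub, rate in min_rates.items()}
    total = sum(raw.values())
    return {hub: value / total for hub, value in raw.items()}


def weighted_bid_limit(base_value, weight, n_hubs):
    """Scale a hub's willingness to pay; weight = 1 / n_hubs leaves it unchanged."""
    return base_value * weight * n_hubs


# Made-up worst-user rates in arbitrary units.
min_rates = {"crowded": 1.0, "quiet": 3.0}
weights = fairness_weights(min_rates, gamma=1.0)
for hub, w in weights.items():
    # The struggling hub ends up with the larger weight and the higher bid limit.
    print(hub, round(w, 2), weighted_bid_limit(10.0, w, len(weights)))
```

With this rule the crowded hub is willing to bid above its raw valuation and the quiet hub below it, which mirrors the "bid more aggressively" and "be more conservative" behavior described above without any direct communication between hubs.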

The Result: A Balanced City

The authors ran simulations to see what happens when the "Fairness Knob" (a parameter called γ) is turned up.

  • Without the Fairness Knob: The hubs fight for mirrors based purely on who can get the most total speed. The crowded hub might still be slow.
  • With the Fairness Knob: The AI learns to shift mirrors toward the struggling hub.
    • The Good News: The customers in the crowded neighborhood get their pizza much faster (their "minimum rate" improves by 34%).
    • The Trade-off: The total speed of the whole city drops very slightly (less than 7%).

The Big Picture Metaphor

Think of the network as a team of runners in a relay race.

  • Old Way: Everyone runs as fast as they can individually. The fast runners finish early and wait, while the slow runners struggle to keep up. The total time is good, but the slow runners are miserable.
  • New Way (This Paper): The team has a coach (the AI) who tells the fast runners, "Slow down a bit and pass the baton to the slow runner so they can catch up." The fast runner doesn't lose much time, but the slow runner finishes the race much faster. The team's overall time is almost the same, but no one is left behind.

Why This Matters

As we move toward 6G (the next generation of mobile networks), we will have many more devices and "smart mirrors" (RIS). This paper shows that we can use AI and auctions to automatically balance the network. We can ensure that people in bad signal areas get a fair share of the technology, without ruining the internet speed for everyone else. It's a way to make the future internet both fast and fair.
