Less is More: Robust Zero-Communication 3D Pursuit-Evasion via Representational Parsimony

Imagine a high-stakes game of "Tag" played in a massive, three-dimensional maze made of giant floating blocks (like a voxel game). You have four drones (the pursuers) trying to catch one fast, agile drone (the evader).

The problem? The maze is full of obstacles, the drones can't turn instantly, and—most importantly—they cannot talk to each other. In the real world, radio signals get delayed, blocked, or corrupted by noise. If the drones rely on hearing each other to coordinate, a split-second delay could cause them to crash into each other or miss the target entirely.

This paper asks a counter-intuitive question: What if the drones are too smart? What if giving them too much information about their teammates actually makes them worse at catching the target when communication is bad?

Here is the breakdown of their solution, "Less is More," using simple analogies.

1. The Problem: The "Over-Connected" Team

In many AI systems, robots are taught to constantly share their thoughts: "I see the target!" "I'm turning left!" "Watch out!"

The authors argue that in a noisy, delayed environment, this is like trying to run a relay race while everyone is shouting instructions over a walkie-talkie with a bad signal.

The Issue: If Drone A hears a message from Drone B that is 0.5 seconds old, Drone A might turn left based on that old info, only to realize Drone B has already turned right. This "stale information" causes confusion and crashes.
The Old Way: Give the AI more data, an 83-dimensional observation that includes where every teammate is.
The Result: The AI gets overwhelmed by outdated, noisy data and starts making mistakes.

2. The Solution: "Representational Parsimony" (The Blindfolded Team)

The authors decided to try the opposite: Give the drones less information.

They created a "Parsimonious" (simple) version where the drones are effectively blindfolded regarding their teammates. They only see:

Where they are.
Where the target is (if visible).
The general shape of the maze (a map).
They do NOT see where their teammates are.

The Analogy: Imagine a group of hikers trying to find a lost friend in a dense forest.

The "Rich" Team: Everyone is constantly shouting, "I'm here!" "I'm over there!" But the wind (noise) distorts the voices. They get confused, run in circles, and trip over each other.
The "Parsimonious" Team: They agree to a simple rule: "Everyone move toward the center of the forest, but keep your eyes on the ground and the path." They don't talk. They just react to the terrain. Surprisingly, they move more smoothly and catch the target faster because they aren't distracted by confusing, delayed voices.

3. The Secret Sauce: "Contribution-Gated Credit Assignment" (The Fair Coach)

If the drones can't talk, how do they know they are working together? How do they know who gets the "credit" for the catch?

The authors invented a system called CGCA. Think of this as a fair coach who watches the game from a high tower.

The Rule: The coach only gives points if you are actually helping right now.
How it works: If a drone is far away (more than 60 meters) or just hovering around doing nothing, the coach says, "You get zero points for this catch." But if a drone is close, moving fast toward the target, and actively squeezing the target into a corner, the coach gives them a big reward.
The Result: This forces the drones to naturally coordinate without talking. They realize, "If I stay far away, I get no points. I need to get close and help." They self-organize into a perfect formation (like a net) just by chasing the reward.

4. The Results: Why "Less" Won

The team tested this in a brutal simulation (Stage 5) with 4 drones vs. 1 fast evader in a cluttered 3D city.

The "Rich" Team (Full Info): Caught the target 72% of the time but crashed into things 25% of the time. They were confused by the noise.
The "Parsimonious" Team (Simple Info + CGCA): Caught the target 75% of the time and crashed only 22% of the time.
The Stress Test: When they added delays, noise, or made the evader faster, the "Rich" team fell apart. The "Parsimonious" team just slowed down gracefully but kept working.

The Big Takeaway

The paper argues for a simple but powerful design principle, backed by its experiments: In a chaotic, noisy world, simplicity is robust.

By stripping away the complex, fragile connections between robots (the "team-coupled" data) and relying on simple local rules and a fair reward system, the robots became better at working together. They didn't need to talk to coordinate; they just needed to know the rules and the map.

In short: Sometimes, to catch a fast target in a messy world, you don't need a super-connected team. You need a team that knows when to stop listening and start looking at the ground.

Here is a detailed technical summary of the paper "Less is More: Robust Zero-Communication 3D Pursuit-Evasion via Representational Parsimony."

1. Problem Statement

The paper addresses asymmetric 3D pursuit-evasion in cluttered voxel environments involving four pursuers and one evader. The core challenge lies in operating under communication constraints (zero-communication), partial observability, sensing noise, and communication latency.

Context: In high-speed aerial scenarios, traditional Multi-Agent Reinforcement Learning (MARL) often relies on rich inter-agent coupling (e.g., sharing teammate states, centralized critics, or explicit messaging).
The Issue: These dependencies become fragility sources when communication is delayed or noisy. Stale peer beliefs propagated through dense coupling channels can destabilize cooperative interception, leading to error cascades.
Research Question: Can representational parsimony (reducing the complexity of agent observations) improve coordination robustness in communication-denied 3D pursuit, rather than relying on richer coupling?

2. Methodology

The authors build upon an inherited path-guided decentralized scaffold (from prior work [1]) which uses 3D A* for global guidance and a decentralized PPO/IPPO actor-critic for local control. The paper introduces two novel components to enhance robustness without communication:

A. Representational Parsimony (Observation Masking)

The authors hypothesize that removing team-coupled information reduces sensitivity to delay and noise.

Original State (83-D): Includes LiDAR, target state, self-state, IMU, teammate states (24-D), guidance, and encirclement topology cues.
Parsimonious State (50-D): A binary masking operator removes the 33 dimensions related to explicit teammate coupling (teammate states, tactical slots, and encirclement cues).
Result: Agents rely solely on local geometry and shared topological guidance, forcing the policy to learn robust behaviors based on local observations rather than potentially stale global estimates.
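The masking step can be illustrated with a minimal sketch. The summary does not give the index layout of the 83-D vector, so the contiguous block at indices 50 to 82 below is purely an assumption for illustration, as are the names `build_mask` and `apply_mask`. Whether the original operator drops the masked entries or zeroes them is also not stated; this sketch drops them, which matches the reported 50-D output size.

```python
OBS_DIM = 83

# ASSUMPTION: the 33 team-coupled dimensions (teammate states, tactical
# slots, encirclement cues) sit at indices 50..82. The real layout is
# not given in the summary.
TEAM_COUPLED = range(50, 83)


def build_mask(obs_dim=OBS_DIM, removed=TEAM_COUPLED):
    """Return a binary mask: 1 keeps a dimension, 0 removes it."""
    removed = set(removed)
    return [0 if i in removed else 1 for i in range(obs_dim)]


def apply_mask(obs, mask):
    """Keep only the observation entries whose mask bit is 1."""
    return [x for x, keep in zip(obs, mask) if keep]


mask = build_mask()
full_obs = [float(i) for i in range(OBS_DIM)]  # placeholder observation
lite_obs = apply_mask(full_obs, mask)
print(len(full_obs), len(lite_obs))  # 83 50
```

Because the mask is fixed, the same operator can be applied at training and evaluation time, so the policy never learns to depend on the removed channels.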

B. Contribution-Gated Credit Assignment (CGCA)

To sustain cooperation without explicit communication, the authors introduce a locality-aware reward structure that prevents "free-rider" equilibria.

Directional Gating: Rewards for closing distance are active only within specific ranges (40m–80m), decaying as distance increases, to focus on meaningful local interactions.
Capture-Share Gating: The credit for a successful capture is weighted by the closing speed and participation ratio. If fewer than half of the pursuers are actively closing in on the target, the collective capture bonus is down-scaled.
Mechanism: This incentivizes each agent to participate actively in the interception based on its own local kinematics (the closing rate $\dot{d}_i$ of pursuer $i$) rather than waiting for others, even without direct messaging.
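The two gates described above can be sketched roughly as follows. This is not the authors' reward function: the linear decay between 40 m and 80 m, the bonus value, the down-scaling factor, and the function names are all assumptions chosen only to make the idea concrete. The summary confirms only the 40m–80m range, the use of closing speed, and the "fewer than half" participation rule.

```python
def distance_gate(d, near=40.0, far=80.0):
    """ASSUMED gate shape: full weight up to `near`, linear decay to 0 at `far`."""
    if d <= near:
        return 1.0
    if d >= far:
        return 0.0
    return (far - d) / (far - near)


def closing_reward(d, d_dot):
    """Directional gating: reward closing speed only when within range."""
    closing_speed = max(0.0, -d_dot)  # d_dot < 0 means the gap is shrinking
    return distance_gate(d) * closing_speed


def capture_shares(distances, d_dots, bonus=10.0, min_ratio=0.5, downscale=0.5):
    """Capture-share gating: split the capture bonus by gated closing speed.

    `bonus` and `downscale` are hypothetical constants, not values from the paper.
    """
    n = len(distances)
    contrib = [closing_reward(d, dd) for d, dd in zip(distances, d_dots)]
    active = sum(1 for c in contrib if c > 0.0)
    # Down-scale the team bonus when fewer than half the pursuers are closing in.
    team_bonus = bonus if active / n >= min_ratio else bonus * downscale
    total = sum(contrib)
    if total == 0.0:
        return [0.0] * n
    return [team_bonus * c / total for c in contrib]


# Two of four pursuers are near and closing: the full bonus is split between them.
print(capture_shares([10, 20, 90, 100], [-5, -5, -5, 0]))  # [5.0, 5.0, 0.0, 0.0]
# Only one pursuer contributes: the bonus is halved before it is shared.
print(capture_shares([10, 90, 90, 90], [-4, -4, -4, -4]))  # [5.0, 0.0, 0.0, 0.0]
```

In this sketch a distant or idle pursuer receives nothing, and a lone pursuer earns less than it would as part of an active group, which is the free-rider suppression the text describes.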

3. Key Contributions

Design Principle of Representational Parsimony: The paper demonstrates that in communication-constrained 3D pursuit, reducing explicit cross-agent observation coupling (from 83-D to 50-D) improves robustness against delay and noise, contrary to the intuition that "more information is always better."
Contribution-Gated Credit Assignment (CGCA): A novel, lightweight credit assignment mechanism that enables effective zero-communication cooperation by using local distance and kinematic signals to shape incentives and suppress free-riding.
Comprehensive Robustness Evidence: Extensive benchmarking against centralized (MAPPO) and other decentralized baselines, including stress tests on speed, yaw limits, noise, and delay, plus zero-shot transfer to procedurally generated urban canyons.

4. Experimental Results

The evaluation was conducted in Stage-5 (4 vs. 1, 60m visibility, 8m capture radius) in a 52×52×18 voxel grid.

Main Benchmark Performance:
- OURS-LITE (50-D + CGCA): Achieved 0.753 success rate and 0.223 collision rate.
- FULL OBS (83-D, no CGCA): Achieved 0.721 success rate and 0.253 collision rate.
- CTDE MAPPO (Centralized Critic): Collapsed to 0.006 success rate, showing extreme fragility under the visibility gating and clutter conditions.
- Conclusion: The parsimonious approach outperformed the full-observation baseline and drastically outperformed centralized methods.
Ablation Study:
- Removing CGCA from the 50-D setup (Local-No-Gate) caused a significant drop in success (0.753 $\to$ 0.569) and a spike in collisions, indicating that CGCA is a necessary mechanism rather than just a regularizer.
Stress Tests:
- Speed/Yaw/Delay/Noise: OURS-LITE demonstrated graceful degradation under increasing evader speeds, tighter yaw limits, and observation noise. In contrast, FULL OBS and Euclidean baselines suffered sharp performance drops or fell into inefficient loops.
- Zero-Shot Transfer: The model generalized to procedurally generated "Urban Canyon" maps with varying obstacle densities (up to 0.24), maintaining ~61% success at the highest density without retraining.
Qualitative Analysis:
- The agents developed a three-phase tactical behavior: (1) Initial search guided by 3D A*, (2) Altitude stratification (using vertical anisotropy to block escape), and (3) Topological containment using obstacles as virtual teammates. This occurred without explicit message passing.
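To put the benchmark and ablation numbers reported above side by side, the short script below computes the absolute gaps between configurations. It uses only the rates quoted in this summary; the dictionary layout and the helper name `gap` are illustrative.

```python
# Success and collision rates as reported in the summary above.
results = {
    "OURS-LITE": {"success": 0.753, "collision": 0.223},
    "FULL OBS": {"success": 0.721, "collision": 0.253},
    "Local-No-Gate": {"success": 0.569},
    "CTDE MAPPO": {"success": 0.006},
}


def gap(metric, a, b):
    """Absolute difference (a minus b) in a metric, rounded to 3 decimals."""
    return round(results[a][metric] - results[b][metric], 3)


success_vs_full = gap("success", "OURS-LITE", "FULL OBS")
collision_vs_full = gap("collision", "OURS-LITE", "FULL OBS")
success_vs_no_gate = gap("success", "OURS-LITE", "Local-No-Gate")

print(success_vs_full, collision_vs_full, success_vs_no_gate)  # 0.032 -0.03 0.184
```

The gap to the full-observation baseline is about three percentage points on both metrics, while removing CGCA costs roughly eighteen points of success, so most of the measured benefit depends on the credit-assignment mechanism rather than on masking alone.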

5. Significance and Conclusion

The paper challenges the prevailing MARL paradigm that assumes richer inter-agent coupling leads to better coordination. It establishes that in communication-constrained, high-dynamic 3D environments:

Stale coupling is harmful: Dense sharing of teammate states amplifies errors when latency or noise is present.
Sparsity is robust: A sparse, local observation space paired with a smart, locality-aware credit assignment (CGCA) yields more reliable and robust coordination.
Practical Implication: For real-world multi-robot systems where communication is unreliable or non-existent, designers should prioritize representational sparsity and local incentive structures over complex centralized critics or explicit communication protocols.

The work provides a practical blueprint for deploying robust multi-agent pursuit systems in cluttered, real-world 3D environments where communication cannot be guaranteed.

