Emergence of Internal State-Modulated Swarming in Multi-Agent Patch Foraging System

This paper demonstrates through evolutionary simulation that self-propelled foragers with partial observability can learn a shared recurrent neural network policy that not only enables adaptive foraging but also leads to the emergence of risk-sensitive swarming behavior, where agents aggregate more strongly when their internal resource levels are low.

Siddharth Chaturvedi, Ahmed EL-Gazzar, Marcel van Gerven

Published 2026-04-09

The Big Idea: How Hungry Robots Learn to Huddle

Imagine a vast, empty field scattered with patches of delicious, glowing berries. Now imagine hundreds of little robots (let's call them "Foragers") dropped into this field. Their only goal is to eat as many berries as possible to survive. They can't talk to each other, they can't see the whole map, and they can't see anything hidden behind another object. They only have a few "flashlights" (rays) they can shine in front of them to see what's nearby.

The researchers wanted to answer a simple question: If these robots are just trying to eat, will they naturally start sticking together in a group (swarming), and does their hunger level change how tightly they huddle?

The Setup: The "Blind" Foragers

Think of these foragers like moths in a dark room.

  • The Sensors: Each robot has a few flashlights. If a flashlight hits a berry patch, it sees the berries. If it hits another robot, it sees a robot. But anything standing behind another robot, or sitting in the gap between two flashlight beams, is "invisible." This is called partial observability.
  • The Brain: Each robot has a tiny, simple brain (a recurrent neural network, meaning it carries a little memory from one moment to the next) that decides how fast to move and which way to turn based on what its flashlights see.
  • The Hunger Meter: Inside each robot is a "battery" representing its energy or food stores. If the battery is low, the robot is hungry. If it's high, the robot is full.
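To make the "flashlight" idea concrete, here is a minimal sketch of ray sensing with occlusion. It is an illustration under assumptions, not the paper's sensor model: the function name `cast_rays`, the circular objects, the field of view, and the range are all hypothetical choices made for this example.

```python
import math

def cast_rays(pos, heading, objects, n_rays=5, fov=math.pi / 2, max_range=10.0):
    """Return one (kind, distance) reading per ray; ("none", max_range) on a miss.

    objects: list of (kind, x, y, radius) circles. All names here are
    illustrative assumptions, not taken from the paper.
    """
    readings = []
    for i in range(n_rays):
        # Spread rays evenly across the field of view, centred on the heading.
        frac = i / (n_rays - 1) if n_rays > 1 else 0.5
        angle = heading - fov / 2 + frac * fov
        dx, dy = math.cos(angle), math.sin(angle)
        best = ("none", max_range)
        for kind, ox, oy, radius in objects:
            # Ray-circle test: project the circle centre onto the ray.
            cx, cy = ox - pos[0], oy - pos[1]
            proj = cx * dx + cy * dy              # distance along the ray to closest approach
            perp_sq = cx * cx + cy * cy - proj * proj
            if proj <= 0 or perp_sq > radius * radius:
                continue                          # behind the agent, or the ray misses
            t = proj - math.sqrt(radius * radius - perp_sq)
            if 0 <= t < best[1]:
                best = (kind, t)                  # keep only the nearest hit on this ray
        readings.append(best)
    return readings

# A patch directly behind a robot: the middle ray reports only the robot.
hidden_patch = [("robot", 3.0, 0.0, 0.5), ("patch", 6.0, 0.0, 0.5)]
print(cast_rays((0.0, 0.0), 0.0, hidden_patch, n_rays=3))

# A small robot sitting between two rays: no ray reports it at all.
in_the_gap = [("robot", 3.0, 1.2, 0.3)]
print(cast_rays((0.0, 0.0), 0.0, in_the_gap, n_rays=3))
```

Both cases are the two kinds of "invisibility" described above: one object is blocked by a nearer one, the other simply falls between beams.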

The Experiment: Evolution in a Video Game

The researchers didn't program the robots to swarm. Instead, they used an evolutionary method that works much like selective breeding (think of breeding the fastest horses).

  1. They created 300 robots with slightly different "brains."
  2. They let them loose in the berry field.
  3. The robots that ate the most berries got to "reproduce" (their brain settings were copied and slightly tweaked for the next generation).
  4. The robots that starved were discarded.
  5. They repeated this for thousands of generations.
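The five steps above can be sketched as a short evolutionary loop. This is a simplified stand-in, assuming truncation selection with Gaussian mutation; the paper's actual algorithm and settings are not given in this summary. Only the population size of 300 comes from the text. `evolve`, `toy_fitness`, `n_keep`, and `sigma` are hypothetical names and values chosen for illustration.

```python
import random

def evolve(fitness, n_params=4, pop_size=300, n_keep=30, sigma=0.1,
           generations=50, seed=0):
    """Truncation selection with Gaussian mutation (an assumed, simplified scheme)."""
    rng = random.Random(seed)
    # Step 1: a population of "brains", each just a list of numbers here.
    pop = [[rng.gauss(0, 1) for _ in range(n_params)] for _ in range(pop_size)]
    for _ in range(generations):
        # Steps 2-4: score everyone, keep the best n_keep, discard the rest.
        parents = sorted(pop, key=fitness, reverse=True)[:n_keep]
        # Step 3: survivors are copied and slightly tweaked to refill the population.
        children = [[w + rng.gauss(0, sigma) for w in rng.choice(parents)]
                    for _ in range(pop_size - n_keep)]
        pop = parents + children
    return max(pop, key=fitness)

def toy_fitness(params):
    # Hypothetical stand-in for "berries eaten": best when every parameter is 1.0.
    return -sum((w - 1.0) ** 2 for w in params)

best = evolve(toy_fitness)
print(best, toy_fitness(best))
```

In the real experiment the fitness would come from running the foragers in the berry field; the toy function only shows the select-copy-tweak cycle.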

The Results: The Magic of Huddling

1. The "Sherlock Holmes" Effect

At first, the robots just wandered aimlessly. But after thousands of generations, they became experts. They learned to find the berry patches and stay there.
Here is the cool part: They started swarming.
Even though they weren't programmed to hold hands, they started clustering together. Why?

  • The Logic: If I see another robot standing still near a spot, it's a huge clue that there are berries there! Even if I can't see the berries myself (maybe they are blocked from view or fall between my flashlight beams), seeing a friend suggests, "Hey, food is over there."
  • The Metaphor: It's like walking into a dark room and seeing a group of people staring intently at a corner. You don't need to see the object they are looking at; you just know, "There must be something interesting over there," so you walk toward them.

2. The Hunger Factor (Risk-Sensitive Foraging)

The researchers then tested what happens when they remove the berries entirely. The robots still swarmed! But the tightness of the swarm changed based on how hungry they were.

  • The Full Robot: If a robot is full (high battery), it acts like a rich, cautious person. It doesn't need to crowd. It can afford to wander alone because it has a safety net. It keeps its distance from others.
  • The Hungry Robot: If a robot is starving (low battery), it acts like a desperate person. It needs to find food now. It clings tightly to the group. It thinks, "If everyone else is here, there must be food, and I can't afford to be left behind."
  • The Finding: The hungrier the robot, the tighter the swarm. The fuller the robot, the more space it keeps.
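One simple way to put a number on "tightness" is the average distance from each robot to its nearest neighbor. This is an illustrative measure only; the summary does not say which metric the paper uses, and the function name and sample positions below are assumptions.

```python
import math

def mean_nearest_neighbor_distance(positions):
    """Average distance from each agent to its closest neighbour (smaller = tighter)."""
    if len(positions) < 2:
        raise ValueError("need at least two agents")
    total = 0.0
    for i, (xi, yi) in enumerate(positions):
        # Distance to the closest other agent.
        total += min(math.hypot(xi - xj, yi - yj)
                     for j, (xj, yj) in enumerate(positions) if j != i)
    return total / len(positions)

# Made-up positions: a tight huddle versus a spread-out group.
tight = [(0, 0), (1, 0), (0, 1), (1, 1)]
loose = [(0, 0), (4, 0), (0, 4), (4, 4)]
print(mean_nearest_neighbor_distance(tight))  # 1.0
print(mean_nearest_neighbor_distance(loose))  # 4.0
```

Under this measure, the finding above would show up as a smaller value for hungry groups than for full ones.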

3. The "Urgency" Switch

To check that this wasn't just a coincidence, the researchers performed "brain surgery" on the simulation. They looked inside the robots' brains and found specific "neurons" (tiny switches) that tracked how hungry the robot was.

  • The Experiment: They took a robot that was actually full, but they forced its brain to think it was starving.
  • The Result: The robot immediately started rushing toward the other robots, acting desperate.
  • The Metaphor: It's like a phone with a full battery that is shown a fake "low battery" warning. Its owner suddenly starts running around looking for an outlet, even though nothing is actually running low. This showed that the "hunger signal" in the brain directly controls the urge to swarm.
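The intervention can be sketched with a toy recurrent cell in which one hidden unit is overwritten by hand. Everything here is hypothetical: the two-unit cell, its hand-picked weights, and the `clamp` argument are assumptions made for illustration, not the evolved network or the authors' procedure.

```python
import math

def step(hidden, obs, clamp=None):
    """One update of a toy 2-unit recurrent cell with hand-picked weights.

    obs = (energy, neighbour_bearing). clamp = {unit_index: value} overwrites
    hidden units after the update, imitating the "brain surgery" described above.
    """
    energy, bearing = obs
    new_hidden = [
        math.tanh(0.5 * hidden[0] + 2.0 * (0.5 - energy)),  # unit 0: "hunger", high when energy is low
        math.tanh(0.5 * hidden[1] + bearing),               # unit 1: where the neighbours are
    ]
    if clamp:
        for idx, value in clamp.items():
            new_hidden[idx] = value                         # force the unit to a chosen value
    # Turn command: steer toward neighbours, scaled by the hunger unit.
    turn = max(0.0, new_hidden[0]) * new_hidden[1]
    return new_hidden, turn

h0 = [0.0, 0.0]
full_obs = (1.0, 0.8)  # full battery, neighbours off to one side

_, turn_full = step(h0, full_obs)
_, turn_fooled = step(h0, full_obs, clamp={0: 1.0})  # full, but "thinks" it is starving

print(turn_full)    # 0.0: a full agent ignores its neighbours
print(turn_fooled)  # positive: the clamped agent turns toward them
```

The observations are identical in both calls; only the forced hunger unit differs, which is the logic of the experiment.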

Why Does This Matter?

This paper is a bridge between biology and robotics.

  • In Nature: It explains why animals (like fish or birds) might huddle together not just to protect themselves from predators, but because being near others is a smart way to find food when you can't see it all.
  • In Robotics: It suggests that you don't need to program complex rules for a swarm of drones to work together. If you give them simple goals to "eat" and "survive," they can figure out how to cluster and share cues based on their own internal needs.

The Takeaway

The paper shows that swarming isn't just about following orders; it's about sharing information. When you are hungry, you stick close to the group because their presence is a signal that food is near. When you are full, you can afford to be independent. The researchers showed in simulation that this complex social behavior can emerge from simple, local rules without any central boss telling everyone what to do.
