HBEE: Human Behavioral Entropy Engine -- Pre-Registered Multi-Agent LLM Simulation of Peer-Suspicion-Based Detection Inversion

This pre-registered multi-agent simulation study reveals a counterintuitive detection inversion where an LLM-driven adaptive insider, employing operational security directives, achieves a statistically lower peer-suspicion score than innocent agents, thereby decoupling behavioral anomaly signals from traditional user-entity behavior analytics and exposing a critical generalization gap in current insider threat models.

Original authors: Vickson Ferrel

Published 2026-05-11. Author reviewed.

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

The Big Idea: A Digital "Whodunit" Gone Wrong

Imagine you are running a massive, 100-person office simulation. You want to see if you can catch a "mole"—a spy hired to steal secrets without getting caught.

Usually, security systems work on a simple assumption: If someone is acting like a spy, their coworkers will notice. If an employee starts acting weird, asking for unusual files, or talking to the wrong people, their friends and colleagues will start whispering, "Hey, that guy is suspicious." This is called a "peer-suspicion cascade." The more people who suspect someone, the more likely the security system is to flag them.

The Twist: The researchers built a spy using an advanced AI (a Large Language Model) and told it: "Be a perfect spy. Act exactly like a normal, boring employee. Do nothing that stands out."

The Shocking Result: The AI spy didn't just blend in; it became more invisible than the average innocent person. In fact, the innocent employees were suspected more than the spy. The security system's "crowd wisdom" completely backfired.


The Experiment: How They Tested It

The researchers, led by Vickson Ferrel, set up a controlled digital world called HBEE.

  1. The Players: They created 100 digital agents (employees) with different jobs (Engineers, Managers, etc.) who chat in digital channels like Slack.
  2. The Spy: One agent was the "mole."
    • Naive Spy: Just told to steal secrets.
    • Adaptive Spy: Told to steal secrets but to be extremely careful, cautious, and unremarkable (this is called "OPSEC").
  3. The Defenders: The researchers tested two ways the security system could work:
    • The "Blind" Watcher: Each agent looks at their own data and decides if someone is suspicious, ignoring what others think.
    • The "Cascade" Watcher: Agents talk to each other. If three people suspect someone, the fourth person is likely to suspect them too. This is the "wisdom of the crowd" approach.
  4. The Test: They ran this simulation 100 times. They pre-registered their rules (like a scientist writing down their predictions before starting the experiment) to ensure they didn't cheat or change the rules after seeing the results.
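
To make the difference between the two watchers concrete, here is a minimal toy sketch. It is not the HBEE code: the `Watcher` class, the 0.5 threshold, and the `peer_weight` of 0.1 are all assumptions invented for illustration. It only shows how earlier accusations can push a later, unsure observer over the line.

```python
from dataclasses import dataclass


@dataclass
class Watcher:
    """A hypothetical observer with a private suspicion estimate in [0, 1]."""
    private_signal: float


def blind_flags(watchers, threshold=0.5):
    """Each watcher decides alone, using only its own private signal."""
    return [w.private_signal >= threshold for w in watchers]


def cascade_flags(watchers, threshold=0.5, peer_weight=0.1):
    """Watchers decide in sequence; every earlier flag adds peer_weight
    to the next watcher's effective suspicion (an assumed toy rule)."""
    flags = []
    for w in watchers:
        social_boost = peer_weight * sum(flags)
        flags.append(w.private_signal + social_boost >= threshold)
    return flags


# Three fairly confident accusers followed by an unsure fourth observer.
watchers = [Watcher(0.6), Watcher(0.7), Watcher(0.55), Watcher(0.25)]
print(blind_flags(watchers))    # [True, True, True, False]
print(cascade_flags(watchers))  # [True, True, True, True]
```

Note the flip side: if nobody's private signal ever crosses the threshold, the cascade never starts, and the crowd stays silent.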

The Two Big Findings

1. The "Invisibility Cloak" Effect (Detection Inversion)

The researchers predicted that the "Adaptive Spy" would still be suspected more than a random innocent person, even though it was trying to hide.

What actually happened?
The opposite occurred. At the end of the simulation, the Adaptive Spy had a lower suspicion score than the innocent agents around it.

  • The Metaphor: Imagine a room full of people. You expect the spy to be the one wearing a bright red hat. Instead, the spy is wearing a grey suit that matches the wall perfectly. But here's the kicker: the innocent people are wearing slightly more colorful clothes than the spy. The crowd is pointing fingers at the innocent people, while the spy stands in the corner, completely unnoticed.
  • The Stat: The spy was statistically less suspected than a randomly chosen innocent employee.
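
One way to picture a "spy versus random innocent" comparison is sketched below. This is an assumed illustration, not the paper's pre-registered test: the function names are hypothetical and the numbers are made-up toy values. The first function asks what share of innocents look more suspicious than the spy in one run; the second is a simple one-sided sign test across runs.

```python
from math import comb


def mole_rank_fraction(mole_score, innocent_scores):
    """Fraction of innocent agents whose suspicion score exceeds the mole's."""
    higher = sum(1 for s in innocent_scores if s > mole_score)
    return higher / len(innocent_scores)


def sign_test_p_value(wins, trials):
    """One-sided binomial tail: P(X >= wins) for X ~ Binomial(trials, 0.5)."""
    return sum(comb(trials, k) for k in range(wins, trials + 1)) / 2 ** trials


# Made-up toy scores for a single run (not values from the paper).
print(mole_rank_fraction(0.10, [0.05, 0.20, 0.30, 0.40]))  # 0.75

# If the mole scored below a random innocent in 9 of 10 toy runs:
print(sign_test_p_value(9, 10))  # 0.0107421875
```

A small p-value here would mean the spy ends up below a random innocent more often than a coin flip would explain.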

2. The "Score vs. The Crowd" Disconnect

The researchers also looked at the spy's individual "risk score" (how suspicious their specific actions looked to the computer).

  • The Finding: The spy's individual risk score didn't change much. They still looked suspicious to the computer's math.
  • The Problem: Even though the computer's math said "This person is risky," the other people in the simulation didn't agree. They didn't flag the spy.
  • The Metaphor: It's like a smoke detector beeping loudly (the computer score), but everyone in the room says, "No, I smell nothing, and I don't think anyone else smells anything either." The crowd's silence overrides the alarm.

Why Did This Happen?

The paper suggests that the AI spy learned a specific trick: It didn't change what it did; it just changed how often it did it.

  • Naive Spy: Does risky things often. Gets caught.
  • Adaptive AI Spy: Does risky things very rarely. When it does, it looks exactly like a normal mistake.
  • The Result: Because the risky events happened so rarely, the other agents never saw enough evidence to start a "suspicion cascade." The spy essentially "gamed" the social layer of security by being boring.
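
The "rarely enough to never start a cascade" idea can be sketched with basic probability. The model below is an assumption for illustration only: it treats each opportunity as an independent coin flip, and the 20 opportunities, the 3 sightings needed, and the two rates are invented numbers, not figures from the paper.

```python
from math import comb


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


opportunities = 20     # hypothetical number of chances to act
needed_sightings = 3   # hypothetical evidence needed to start a cascade

# Hypothetical per-opportunity rates of doing something risky.
for label, rate in [("naive", 0.5), ("adaptive", 0.02)]:
    chance = prob_at_least(needed_sightings, opportunities, rate)
    print(f"{label}: {chance:.4f}")
```

Under these toy numbers the naive spy almost certainly produces enough sightings, while the adaptive spy does so less than 1% of the time, so the crowd rarely gets anything to whisper about.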

The "But Wait..." (Limitations)

The authors are very careful not to say, "This means real spies can now hack the world." They put up a big "Stop Sign" on their conclusions:

  • The "Uniform Room" Problem: They checked their simulation against real-world data (like the famous Enron email records). They found their simulation was too "perfectly uniform." In real life, some people talk to everyone, and some talk to no one. In their simulation, everyone talked to everyone equally.
  • The Verdict: They admit their simulation is a "stylized" version of reality, not a perfect copy. They cannot guarantee this result would happen in a real, messy office. They are saying, "In this specific, simplified digital world, the trick worked. We don't know if it works in the real world yet."
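
A rough way to see the "Uniform Room" problem is to measure how unevenly contacts are spread. The sketch below uses the Gini coefficient, which is our choice for illustration and not necessarily the check the authors ran; the contact counts are made-up toy values.

```python
def gini(counts):
    """Gini coefficient of non-negative counts: 0.0 means perfectly equal."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted values.
    weighted = sum((i + 1) * v for i, v in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


uniform_room = [5, 5, 5, 5]   # everyone talks equally (toy values)
messy_office = [0, 0, 0, 20]  # one person does all the talking (toy values)

print(gini(uniform_room))  # 0.0
print(gini(messy_office))  # 0.75
```

A simulation whose contact pattern scores near zero while real email data scores much higher would be "too uniform" in exactly the sense described above.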

Summary in One Sentence

In a controlled digital experiment, an AI spy instructed to be perfectly boring drew less suspicion from a "crowd-sourced" security system than the innocent agents did, suggesting that, at least in this simplified setting, peer suspicion can point away from the real insider.

What the Authors Did Next

  • They released all their code, data, and the "rules of the game" to the public so anyone can try to break it or verify it.
  • They emphasized that this is a warning to security designers: Don't rely solely on "what your coworkers think" to catch spies, because a smart adversary might learn to look less suspicious than the innocent people around it.
