BioLLMAgent: A Hybrid Framework with Enhanced Structural Interpretability for Simulating Human Decision-Making in Computational Psychiatry

BioLLMAgent is a novel hybrid framework that pairs a validated reinforcement learning engine with a large language model shell. The result is a structurally interpretable "computational sandbox" that can accurately simulate human decision-making patterns and test psychiatric intervention strategies across diverse clinical datasets.

Zuo Fei, Kezhi Wang, Xiaomin Chen, Yizhou Huang

Published 2026-03-06

Imagine you are trying to understand why people make bad decisions, like a gambler who keeps losing money or someone struggling with addiction. Scientists have been trying to build computer programs (agents) to simulate these people, but they've been stuck in a "choose your own adventure" dilemma with two very different options:

  1. The Math Whiz (Reinforcement Learning): This agent is like a super-smart accountant. It follows strict rules and math to learn from wins and losses. It's very clear how it thinks (you can see the numbers), but it's a bit robotic. It doesn't really "act" like a human; it just crunches data.
  2. The Method Actor (Large Language Model): This agent is like a brilliant actor who can improvise. It talks, reasons, and behaves exactly like a real person. It's incredibly realistic, but it's a "black box." You have no idea why it made a choice because its brain is a giant, messy cloud of billions of parameters.

The Problem: The Math Whiz is too boring to be realistic, and the Method Actor is too mysterious to be useful for science.

The Solution: BioLLMAgent
The authors of this paper created a hybrid superhero called BioLLMAgent. Think of it as a Cyborg Detective.

How It Works: The Cyborg Detective

Imagine a detective solving a case. This detective has two distinct parts working together:

  1. The Internal Brain (The Math Whiz):
    This is the detective's gut instinct, built on years of experience. It learns slowly from trial and error. If the detective picks a card and loses money, this part of the brain says, "Ouch, don't do that again." It's based on a proven mathematical model called ORL (Outcome-Representation Learning). This part ensures the detective is grounded in reality and that we can still see how it learns.

  2. The External Voice (The Method Actor):
    This is the detective's ability to listen to advice, read a manual, or remember a therapist's words. It's powered by a Large Language Model (LLM). If a therapist says, "Hey, the shiny red decks are traps," this part of the brain processes that instruction and says, "Okay, I should avoid the red decks." This adds the "human" flavor, the ability to understand context and follow complex instructions.

  3. The Fusion Mechanism (The Chief of Police):
    This is the most important part. The Chief sits between the Internal Brain and the External Voice. Every time the detective has to make a choice, the Chief asks both sides for their opinion.

    • The Internal Brain says: "I think Deck A is okay because I won last time."
    • The External Voice says: "But the therapist told me Deck A is a trap!"
    • The Chief weighs these two opinions (maybe 75% gut instinct, 25% advice) and makes the final decision.
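The Chief's weighing step above can be sketched in a few lines. This is a minimal illustration, not the paper's actual fusion rule: the function names, the softmax-plus-weighted-average scheme, and the 75/25 split are all assumptions made for the example, and a plain delta rule stands in for the ORL model's learning update.

```python
import math

def fuse_decision(q_values, advice_probs, w_internal=0.75):
    """Blend the Internal Brain's learned deck values with the
    External Voice's advice into one choice distribution."""
    # Internal Brain: softmax over learned values (the RL side).
    m = max(q_values)
    exp_q = [math.exp(q - m) for q in q_values]
    total = sum(exp_q)
    internal = [e / total for e in exp_q]
    # Chief of Police: weighted average of the two opinions.
    fused = [w_internal * p + (1 - w_internal) * a
             for p, a in zip(internal, advice_probs)]
    s = sum(fused)
    return [p / s for p in fused]

def update_value(q, reward, lr=0.1):
    # Internal Brain learning: nudge a deck's value toward the latest
    # outcome (a simple delta rule, standing in here for ORL).
    return q + lr * (reward - q)

# Internal Brain likes Deck A (index 0) after a recent win...
q = [2.0, 0.5, 0.5, 0.5]
# ...but the therapist's advice says to avoid Deck A entirely.
advice = [0.0, 1/3, 1/3, 1/3]

print(fuse_decision(q, advice))        # 75% gut: Deck A still wins out
print(fuse_decision(q, advice, 0.25))  # 25% gut: the advice takes over
```

Note how the same two opinions produce opposite choices depending on the weight: that single knob is what lets the framework model people who do or don't respond to guidance.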

What Did They Discover?

The researchers tested this Cyborg Detective using the Iowa Gambling Task, which is like a video game where you pick cards from four decks. Some decks give you lots of money at first but make you lose big later (the "bad" decks), while others give you small wins but keep you safe in the long run (the "good" decks).
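The "bad decks pay big now, lose bigger later" structure can be made concrete with simplified payoffs in the spirit of the classic task. These exact numbers are illustrative, not the paper's: each pick pays a fixed win, with some chance of a loss on top.

```python
# Simplified Iowa-Gambling-Task-style payoff scheme (illustrative numbers).
DECKS = {
    "A": {"win": 100, "loss": 250,  "loss_prob": 0.5},  # "bad": flashy wins, frequent losses
    "B": {"win": 100, "loss": 1250, "loss_prob": 0.1},  # "bad": flashy wins, rare huge loss
    "C": {"win": 50,  "loss": 50,   "loss_prob": 0.5},  # "good": modest wins, small losses
    "D": {"win": 50,  "loss": 250,  "loss_prob": 0.1},  # "good": modest wins, rare loss
}

def expected_value(deck):
    """Long-run money per pick: the fixed win minus the average loss."""
    return deck["win"] - deck["loss_prob"] * deck["loss"]

for name, deck in DECKS.items():
    print(name, expected_value(deck))
# A and B average -25 per pick; C and D average +25,
# despite their smaller up-front wins.
```

The trap is that an agent chasing immediate rewards sees A and B as twice as generous, which is exactly the pattern the "bad habit" simulation reproduces.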

Here is what they found:

  • It Acts Human: The BioLLMAgent behaved almost exactly like real humans, including those with addiction issues. It could mimic the "bad habits" of addicts who keep choosing the risky decks.
  • It's Still Scientific: Even though it uses a fancy AI, the "Internal Brain" part still works like a normal math model. Scientists can still look at the numbers and say, "Ah, this agent is bad at learning from losses," which helps them understand the psychology behind addiction.
  • It Can Be "Therapized": This is the coolest part. The researchers gave the "External Voice" a prompt that acted like a Cognitive Behavioral Therapy (CBT) session. They told the agent, "Remember, small steady wins are better than big risky ones."
    • Result: The agent immediately started making better choices! It showed that if you give the "External Voice" the right advice, it can override the "Internal Brain's" bad habits.
  • Community vs. Individual: They simulated a whole town of these agents. They found that giving everyone in the town a little bit of advice (Community Education) worked much better than trying to fix just the "worst" individuals. It's like how washing your hands helps stop a flu outbreak better than just treating the sickest person.
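The community-vs-individual finding comes down to simple arithmetic, which this toy comparison sketches. The numbers are invented for illustration and are not the paper's simulation: each agent gets a probability of choosing the safe decks, and interventions raise it.

```python
# Toy "town" of agents: each value is that agent's probability of
# choosing the good decks (invented baseline rates).
town = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def mean_rate(agents):
    """Average good-deck choice rate across the whole town."""
    return sum(agents) / len(agents)

# Community education: everyone improves a little (+0.1, capped at 1.0).
community = [min(1.0, p + 0.1) for p in town]

# Targeted therapy: only the single worst agent improves a lot (+0.5).
targeted = sorted(town)
targeted[0] = min(1.0, targeted[0] + 0.5)

print(mean_rate(town))       # baseline
print(mean_rate(community))  # many small gains add up
print(mean_rate(targeted))   # one big gain, diluted across the town
```

Eight small gains outweigh one large gain spread over the same population, which is the hand-washing-versus-treating-the-sickest intuition in miniature.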

Why Does This Matter?

Think of BioLLMAgent as a Virtual Sandbox for psychiatry.

Before this, if a doctor wanted to test a new therapy, they had to wait years to find real patients and run expensive trials. Now, they can build a "virtual patient" in the computer, give it a specific personality, and test different therapies instantly.

  • For Doctors: It helps them understand why a patient is stuck in a bad loop. Is it a learning problem (Internal Brain) or a lack of guidance (External Voice)?
  • For Society: It suggests that sometimes, the best way to help a community isn't to target the "problem people," but to educate everyone.

In short, BioLLMAgent combines the best of both worlds: the clarity of math and the realism of human conversation, giving scientists a powerful new tool to solve the mystery of the human mind.
