Assessing Cognitive Biases in LLMs for Judicial Decision Support: Virtuous Victim and Halo Effects

This study evaluates five large language models for judicial sentencing support. Compared with humans, the models show a stronger virtuous victim effect and no significant penalty for adjacent consent, while generally exhibiting weaker prestige-based halo effects, particularly around credentials. Their run-to-run variability, however, still limits immediate deployment in legal settings.

Sierra S. Liu


Imagine you are hiring a new judge for a courtroom. But instead of a human, you're considering a super-smart computer program (an AI) to help make the final decisions on who goes to jail and for how long.

The big question is: Is this computer judge fairer than a human judge, or does it have the same bad habits?

Humans are notoriously bad at being perfectly fair. We get tired, we get hungry, and we let things like a person's job title or their "victim status" cloud our judgment. This paper is like a report card testing five different AI models to see if they make these same mistakes.

Here is the breakdown of the study using some everyday analogies:

1. The "Good Victim" Trap (The Virtuous Victim Effect)

The Human Flaw: Imagine a person gets hurt. If they are a "perfect" victim (someone who never did anything wrong), we tend to think they are a saint. But if that same person had a complicated history with the person who hurt them (like they were friends or had a prior relationship), we suddenly start blaming the victim more, even if the harm was exactly the same. It's like thinking, "Well, they knew the guy, so maybe they asked for it."

The AI Test: The researchers asked the AI to judge two scenarios: one where a student's iPad was broken by a stranger, and another where a woman was assaulted by someone she had previously been intimate with but was no longer. (A rough sketch of this paired-scenario probe appears after the analogy below.)

The Result: The AI was too nice to victims. It rated victims as "more moral" than non-victims, even more strongly than humans do. However, unlike humans, the AI did not punish victims for having a prior relationship with the offender. It didn't fall for the "they knew him, so it's their fault" trap.

  • The Analogy: If a human judge is a grumpy parent who blames the kid for playing with the wrong friend, the AI is an over-protective robot who thinks all victims are angels, but refuses to play the "blame game" based on past relationships.
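
To make the "two scenarios" setup concrete, here is a minimal sketch of how such a paired-scenario probe could be run. Everything here is an assumption for illustration: the OpenAI client, the gpt-4o-mini model name, the vignette wording, and the rate() helper are stand-ins, not the paper's actual stimuli or pipeline.

```python
import statistics
from openai import OpenAI  # assumed client; any chat-completion API works

client = OpenAI()

# Two vignettes that differ only in victim status; the person being
# rated is otherwise identical. Wording is paraphrased for
# illustration, not the paper's exact stimuli.
VIGNETTES = {
    "victim": (
        "A stranger deliberately smashed a student's iPad. "
        "On a scale of 1-7, how moral a person is the student? "
        "Reply with a single number."
    ),
    "non_victim": (
        "A student owns an iPad. "
        "On a scale of 1-7, how moral a person is the student? "
        "Reply with a single number."
    ),
}

def rate(prompt: str, n: int = 20) -> list[float]:
    """Sample the model n times and parse the 1-7 morality rating."""
    scores = []
    for _ in range(n):
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": prompt}],
            temperature=1.0,      # sample, so run-to-run spread is visible
        ).choices[0].message.content
        # Assumes the model obeys the single-number instruction.
        scores.append(float(reply.strip().split()[0]))
    return scores

for label, prompt in VIGNETTES.items():
    s = rate(prompt)
    print(f"{label}: mean={statistics.mean(s):.2f}, sd={statistics.stdev(s):.2f}")
```

A virtuous victim effect would show up as a noticeably higher mean morality rating in the "victim" condition, and the repeated sampling also exposes run-to-run spread.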

2. The "Fancy Resume" Trap (The Halo Effect)

The Human Flaw: Humans love status. If a criminal is a famous doctor, we tend to give them a lighter sentence or think they are less guilty; if they are a janitor, we are harsher. Prestige distorts money too: we treat a huge, prestigious company (like Goldman Sachs) very differently from a small local shop when setting damages. It's like judging a book by its shiny cover.

The AI Test: The researchers gave the AI cases where the only difference was the defendant's job or company (a minimal sketch of this single-variable swap follows the list below).

  • Company Prestige: When the defendant was a prestigious company rather than an ordinary one, humans demanded about 3x more money in damages. The AI inflated the award too, but usually by less than humans did (about 1.5x to 2x).
  • Job Title: When the criminal was a "Doctor" vs. a "Receptionist," humans gave the doctor a much lighter sentence. The AI was mostly indifferent, though one model (DeepSeek) acted a bit like a human and gave the doctor a break.
  • Credentials: When an expert witness came from a fancy Ivy League school vs. a regular state school, humans were easily swayed. The AI was much less swayed by the fancy school name.
  • The Analogy: Humans are like fans who buy a ticket just because the band is famous. The AI is more like a music critic who listens to the actual song, though it still gets a little distracted by the band's fame.
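
To illustrate the "only difference" manipulation named above, here is a hedged sketch of the job-title swap. The case text, dollar amount, and job labels are invented for illustration; feeding each prompt through a sampling loop like the rate() helper above yields two sentence distributions to compare.

```python
from string import Template

# One case template; only the $job slot changes between conditions.
# Facts, crime, and question are held constant, so any gap in the
# recommended sentences is attributable to the job title alone.
CASE = Template(
    "The defendant, a $job, was convicted of embezzling $$50,000 "
    "from a community fund. Recommend a prison sentence in months. "
    "Reply with a single integer."
)

prompts = {job: CASE.substitute(job=job) for job in ("doctor", "receptionist")}

for job, prompt in prompts.items():
    # In a real run, send each prompt through repeated sampling and
    # compare distributions; a halo effect appears as systematically
    # shorter sentences in the "doctor" condition.
    print(f"--- {job} ---\n{prompt}\n")
```

This mirrors a standard controlled-variable design: because the prompts are identical except for one word, any systematic gap between conditions can be read as bias rather than as a response to different facts.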

3. The "Rollercoaster" Problem (Consistency)

This is the most worrying part. Even if the AI is on average fairer than humans, it is unpredictable (the sketch after this list shows one way to measure that spread).

  • In one run, the AI might say a company owes $20 million. In the next run, with the exact same facts, it might say $300 million.
  • Some models (like Claude) refused to answer certain questions because they were "too sensitive."
  • The Analogy: Imagine a human judge who is grumpy in the morning but happy in the afternoon. Now imagine an AI judge that is a slot machine. Sometimes it pays out a fair verdict; sometimes it pays out a crazy one. You can't build a justice system on a slot machine.
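
The slot-machine problem can be quantified. Below is a rough sketch, assuming you have already collected repeated verdicts for one identical prompt; the parse_dollars() helper and the placeholder award figures are illustrative, not from the paper.

```python
import re
import statistics

def parse_dollars(reply: str) -> float:
    """Pull the first dollar figure out of a free-text verdict."""
    m = re.search(r"\$?([\d,]+(?:\.\d+)?)\s*(million|billion)?", reply, re.I)
    if m is None:
        raise ValueError(f"no dollar amount found in: {reply!r}")
    value = float(m.group(1).replace(",", ""))
    scale = {"million": 1e6, "billion": 1e9}.get((m.group(2) or "").lower(), 1)
    return value * scale

# Placeholder: four sampled verdicts for the *same* prompt. A real run
# would collect 20+ samples via a loop like the one sketched earlier.
awards = [parse_dollars(r) for r in (
    "$20 million", "$300 million", "$45 million", "$120 million",
)]

# Coefficient of variation: spread relative to the average award.
cv = statistics.stdev(awards) / statistics.mean(awards)
print(f"coefficient of variation: {cv:.2f}")  # ~1.04 here
```

A CV near zero means the model hands down roughly the same verdict every time; a CV near or above 1, as with the $20 million vs. $300 million swings described above, is exactly the inconsistency that makes deployment premature.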

The Final Verdict

Are AI judges ready to replace humans?
Not yet.

  • The Good News: The AI is generally less obsessed with fancy job titles and degrees than humans are. It doesn't fall for the "blame the victim for their past" trap.
  • The Bad News: The AI is too "pro-victim" (it thinks victims are saints even when they aren't), and its decisions are all over the place. One minute it's fair, the next it's wild.

In short: The AI is a student who studied hard and knows the rules better than the teacher (the human) regarding status and resumes, but it's still a bit too emotional about victims and its math is inconsistent. We need to fix the "slot machine" problem before we let it sit on the bench.