Here is an explanation of the paper using simple language and creative analogies.
The Big Idea: Why "Equal" Doesn't Always Mean "Fair"
Imagine you are a judge deciding who gets a scholarship. You have two groups of applicants: Group A and Group B.
For a long time, computer scientists have tried to make AI "fair" by forcing it to treat both groups exactly the same, so that the model's outcomes come out equal for Group A and Group B. The best-known criterion in this family is called Statistical Parity: both groups should receive the favorable prediction at the same rate.
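As a toy sketch (every number and decision below is invented for illustration, not taken from the paper), Statistical Parity can be checked by comparing how often each group receives the favorable prediction:

```python
# Toy illustration of Statistical Parity: both groups should receive
# favorable predictions at (roughly) the same rate. All data is invented.

def positive_rate(predictions):
    """Fraction of individuals who received the favorable prediction."""
    return sum(predictions) / len(predictions)

# 1 = scholarship awarded, 0 = denied (hypothetical decisions)
group_a = [1, 1, 0, 1, 0, 1, 0, 1]   # 5 of 8 awarded
group_b = [1, 0, 0, 1, 0, 1, 0, 1]   # 4 of 8 awarded

gap = abs(positive_rate(group_a) - positive_rate(group_b))
print(f"Group A rate: {positive_rate(group_a):.3f}")
print(f"Group B rate: {positive_rate(group_b):.3f}")
print(f"Parity gap:   {gap:.3f}")
```

A gap of 0 would mean perfect statistical parity; in practice, systems are often asked to keep this gap below some small threshold.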
But this paper asks a tricky question: What if Group A and Group B are actually different to begin with?
The authors call this problem "Infra-marginality." It's a fancy way of saying: "The playing field isn't flat because the players started on different hills."
The Experiment: The Medical AI Test
To figure out how regular people (not just math experts) feel about this, the researchers ran a study with 85 people. They created a hypothetical scenario:
- The Setup: An AI is trying to predict who has cancer.
- The Groups: Two different racial groups (labeled Race A and Race B).
- The Twist: The researchers told the participants different things about the data the AI was trained on.
They asked participants to rate three different AI models:
- The "Super Model": It forces both groups to have the same high accuracy (even if it means guessing wrong more often for the group that was naturally easier to predict).
- The "Compromise Model": It forces both groups to have the same average accuracy.
- The "Realist Model": It lets Group A have high accuracy and Group B have lower accuracy, exactly as the data showed they naturally performed.
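The three options above can be summarized as per-group accuracy profiles. The numbers below are invented placeholders, not the study's actual values; they just make the trade-off concrete:

```python
# Hypothetical per-group accuracies for the three model types described above.
# All values are invented for illustration. In the "Realist" condition, Group A
# is naturally easier to predict (0.95) than Group B (0.75).

models = {
    "Super Model":      {"A": 0.90, "B": 0.90},  # equal high accuracy: A is degraded, B is boosted
    "Compromise Model": {"A": 0.85, "B": 0.85},  # equal at the shared average
    "Realist Model":    {"A": 0.95, "B": 0.75},  # each group keeps its natural accuracy
}

for name, acc in models.items():
    gap = abs(acc["A"] - acc["B"])
    avg = (acc["A"] + acc["B"]) / 2
    print(f"{name:18s} A={acc['A']:.2f}  B={acc['B']:.2f}  gap={gap:.2f}  avg={avg:.2f}")
```

With these placeholder numbers, the "Compromise Model" sits at the midpoint of the "Realist Model"'s two accuracies, while the "Super Model" closes the gap entirely, at the cost of guessing wrong more often for the easier-to-predict group.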
The Surprising Results
The researchers expected people to always want the "Super Model" (Option 1) because it sounds the most equal. But the results were much more nuanced.
1. When the Groups Look the Same (or we know nothing)
If the researchers didn't tell the participants that the groups were different, or if the groups performed equally well on their own, people loved the "Super Model."
- Analogy: If you have two runners who look identical and have the same shoes, you expect them to finish the race at the same time. If one finishes way ahead, you think the race was rigged.
2. When the Groups Are Naturally Different
Here is the big discovery: When participants knew that one group was naturally harder to predict (or had less data), they actually preferred the "Realist Model" (Option 3).
They thought it was unfair to force the AI to pretend the groups were the same if they weren't.
- Analogy: Imagine a basketball coach.
- Group A is a team of professional NBA players.
- Group B is a team of 10-year-olds.
- If the AI is forced to predict that both teams will score 100 points, it will be wrong about the kids (predicting 100 when they will score far less) and wrong about the pros (predicting 100 when they might score 120).
- The Participants' View: They said, "It's fair to predict the pros will score high and the kids will score low. That's just reality. If you force the AI to say 'both teams score 100,' you are lying about the kids' potential and the pros' skill."
3. The Role of "Data Availability"
The study also looked at why the groups were different.
- If Group A had more data (more training examples) and performed better, people thought, "Okay, that makes sense. They had more practice."
- But if Group A had less data and still performed better, people got suspicious. They thought, "Wait, if they had less practice but still won, maybe the AI is biased against Group B?"
The "Anchor" Effect
The paper found that people don't judge fairness in a vacuum. They use Anchoring.
- The Metaphor: Imagine you are buying a car.
- If a car usually costs $20,000, and you see one for $18,000, you think it's a great deal.
- If a car usually costs $10,000, and you see one for $12,000, you think it's a rip-off.
- In the Study: People used each group's "natural performance" as the anchor. If Group B naturally struggled, people thought it was fair for the AI to struggle with them too. If the AI suddenly predicted Group B perfectly, overriding that natural struggle, people judged it as actually unfair, because it ignored the reality of the situation.
Why Does This Matter?
Currently, many AI systems are built to force "Equality of Outcome" (making sure everyone gets the same score). This paper argues that this can backfire.
- The Risk: If you force an AI to ignore real differences between groups (like different disease rates or different data quality), you might end up making bad decisions. You might release dangerous criminals because you forced the AI to predict they are safe just to match a statistic. Or you might deny medical treatment to people who actually need it because the AI is trying to "balance the books."
- The Solution: We need AI that understands Context.
- If the difference is caused by bias (unfair data collection), we should fix it.
- If the difference is caused by reality (different base rates or task difficulty), we should respect it.
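One way to make the risk concrete: if two groups genuinely have different base rates, even a perfectly accurate predictor will flag them at different rates, so forcing the rates to be equal must introduce errors somewhere. A toy sketch with invented base rates:

```python
# Invented base rates: the fraction of each group that truly has the condition.
base_rate_a = 0.30   # hypothetical
base_rate_b = 0.10   # hypothetical
n = 1000             # people per group

# A perfect predictor flags exactly the true positives in each group,
# so its positive-prediction rates equal the base rates, and they differ.
perfect_rate_a = base_rate_a
perfect_rate_b = base_rate_b

# Forcing statistical parity means flagging both groups at the same rate,
# e.g. the overall average. Some predictions must now be wrong.
parity_rate = (base_rate_a + base_rate_b) / 2

# Group A: 300 truly positive, only 200 flagged -> at least 100 missed cases.
missed_a = round((base_rate_a - parity_rate) * n)
# Group B: 100 truly positive, but 200 flagged -> at least 100 false alarms.
false_alarms_b = round((parity_rate - base_rate_b) * n)

print(f"Perfect predictor rates: A={perfect_rate_a:.0%}, B={perfect_rate_b:.0%}")
print(f"Forced parity rate: {parity_rate:.0%}")
print(f"Missed cases in A: >= {missed_a}, false alarms in B: >= {false_alarms_b}")
```

In this sketch, forcing parity misses real cases in the higher-base-rate group and raises false alarms in the lower-base-rate group, which is exactly the kind of harm described above: whether that trade is acceptable depends on whether the base-rate difference is real or an artifact of biased data.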
The Takeaway
Fairness isn't just about making the numbers look equal. It's about understanding why the numbers are different.
If you treat two different groups exactly the same when they are fundamentally different, you aren't being fair; you are being blind. True fairness means acknowledging the reality of the situation and making decisions that respect those differences, rather than forcing a false equality that hurts everyone.