Imagine you are applying for a loan at a bank. You get rejected. You might feel angry, but your reaction depends heavily on why you were rejected.
- Scenario A (Distributive Unfairness): You are rejected because the bank has a rule: "We only lend to people with red hair." You have brown hair, so you are rejected. This is unfair because the outcome is biased against a specific group.
- Scenario B (Procedural Unfairness): You are rejected because the bank's algorithm looks at your credit score, your job history, and your age. However, the algorithm secretly weighs your zip code so heavily that it effectively discriminates against people from your neighborhood, even though "zip code" isn't an official rule. The process itself is rigged, even if the final "No" looks like a normal decision.
For years, researchers have been obsessed with Scenario A (making sure the outcomes are fair). This paper argues that we are ignoring Scenario B (making sure the decision-making process is fair). The authors call this Procedural Fairness.
Here is a simple breakdown of what this paper does, using everyday analogies.
1. The Problem: The "Black Box" Judge
Machine Learning (ML) models are like Black Box Judges. They take in information (your resume, your credit score) and spit out a decision (Hire/No Hire, Loan/No Loan).
- Old Way: We only checked the verdicts. "Did the judge reject too many women?" If so, we corrected the outcomes.
- The Gap: We didn't check how the judge thought. Did the judge ignore the woman's qualifications and focus entirely on her name? We couldn't see inside the judge's head.
The authors say: "It's not enough to just get a fair result. The way the computer thinks must be fair too."
2. The Solution: The "X-Ray Vision" (FAE)
To see inside the Black Box, the authors use a technique called Feature Attribution Explanation (FAE). Think of this as X-Ray Vision or a Magnifying Glass for the computer's brain.
When the computer makes a decision, this tool highlights exactly which pieces of information mattered most.
- Example: If the computer rejects a loan, the X-Ray might show: "I rejected this because of Credit Score (90% importance) and Age (10% importance)."
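To make the "X-Ray" concrete, here is a minimal sketch of one common way to compute feature attributions: occlusion, where each feature is reset to a baseline value and the change in the model's score is taken as that feature's importance. The toy `loan_model` and its weights are invented for illustration; real FAE methods (and the one used in the paper) can be more sophisticated, but the idea is the same.

```python
def loan_model(x):
    # Hypothetical black-box scorer: leans mostly on credit score,
    # a little on age (features are assumed pre-normalized).
    credit, age = x
    return 0.9 * credit + 0.1 * age

def attribute(model, x, baseline):
    """Occlusion attribution: a feature's importance is how much
    the score drops when that feature is reset to its baseline."""
    base_score = model(x)
    importances = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] = baseline[i]
        importances.append(base_score - model(perturbed))
    return importances

applicant = [0.2, 0.5]   # low credit score, middling age
baseline = [0.0, 0.0]
print(attribute(loan_model, applicant, baseline))
# Credit score gets the larger attribution, mirroring the
# "Credit Score matters most" readout described above.
```

The output is the per-feature "importance" list the article describes: it tells you which inputs actually drove this one decision.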
3. The New Metric: GPFFAE (The "Process Fairness Score")
The authors created a new score called GPFFAE. Here is how it works:
Imagine you have two groups of people: Group A (e.g., Men) and Group B (e.g., Women). You find two people who are almost identical in every way (same job, same salary, same credit score), but one is from Group A and one is from Group B.
- Fair Process: The computer looks at both of them and says, "Ah, I care about your Salary and Job." It uses the same logic for both.
- Unfair Process: The computer looks at Group A and says, "I care about your Salary." But for Group B, it says, "I care about your Name."
GPFFAE measures how closely these two "thought processes" match. If the computer uses different "rules of the road" for different groups, even if they are similar people, the GPFFAE score will be low, signaling Procedural Unfairness.
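The comparison above can be sketched in a few lines. This is an illustrative stand-in, not the paper's exact GPFFAE formula: it scores procedural fairness as the average cosine similarity between the attribution vectors of matched "near-twin" pairs from the two groups. The feature names and numbers are invented.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def process_fairness_score(attrs_a, attrs_b):
    """Average attribution similarity over matched pairs:
    near 1.0 = same logic for both groups, lower = diverging logic.
    (Simplified stand-in for GPFFAE; the paper's formula may differ.)"""
    sims = [cosine(u, v) for u, v in zip(attrs_a, attrs_b)]
    return sum(sims) / len(sims)

# Attributions over (salary, job, name) for one matched pair.
group_a        = [[0.70, 0.30, 0.00]]
group_b_fair   = [[0.68, 0.32, 0.00]]  # same logic, tiny noise
group_b_unfair = [[0.10, 0.10, 0.80]]  # model leans on "name" instead

print(process_fairness_score(group_a, group_b_fair))    # near 1.0
print(process_fairness_score(group_a, group_b_unfair))  # much lower
```

When the model applies the same logic to both twins, the score sits near 1.0; when it quietly switches to "name" for Group B, the score drops, flagging an unfair process even if both twins got the same verdict.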
4. The Discovery: Fair Process vs. Fair Outcome
The authors ran experiments and found something surprising: A computer can have a fair outcome but an unfair process, and vice versa.
- The Counter-Intuitive Finding: Sometimes, a computer might reject a lot of people from a minority group (unfair outcome), but it does so by looking at the exact same factors for everyone (fair process).
- Why it matters: In real life, people are more willing to accept a bad outcome if they believe the process was fair. If you know the judge used the same rules for everyone, you are less likely to feel discriminated against, even if you lost the case.
5. The Fix: Cleaning the "Rotten Apples"
Once they identified that a model was using an unfair process, they needed to fix it. They found the specific "ingredients" (features) causing the bias, like bad apples in a barrel.
They proposed two ways to fix it:
Method 1: The "Surgery" (Retraining)
- What they did: They took the "bad apples" (the unfair features) out of the data entirely and taught the computer to learn again from scratch without them.
- Result: The computer became very fair, but it had to relearn everything, which took time and slightly lowered its overall accuracy.
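The "surgery" step amounts to deleting the flagged columns before training again from scratch. A minimal sketch (the column names and the `train_from_scratch` call are hypothetical placeholders):

```python
def drop_features(rows, bad_idx):
    """Remove the flagged 'bad apple' columns from every row."""
    keep = [i for i in range(len(rows[0])) if i not in bad_idx]
    return [[row[i] for i in keep] for row in rows]

# Columns: credit, salary, zip_code (zip_code flagged as a biased proxy).
data = [
    [0.8, 0.6, 0.1],
    [0.4, 0.7, 0.9],
]
cleaned = drop_features(data, bad_idx={2})
print(cleaned)  # rows with zip_code removed

# model = train_from_scratch(cleaned, labels)  # hypothetical retraining step
```

The cost noted above follows directly: the new model never sees those features again, so it must relearn its entire decision logic from the reduced data.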
Method 2: The "Fine-Tuning" (Modification)
- What they did: Instead of retraining, they gently nudged the existing computer. They told it, "Hey, stop paying so much attention to that specific feature."
- Result: This was faster and kept the computer's original "personality" (decision logic) mostly intact, though it required careful balancing so the computer didn't get confused.
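One common way to implement this kind of "nudge" is to keep training the existing model while adding a penalty that shrinks the weight of the flagged feature, so the model stays mostly itself but pays less attention to that input. The sketch below does this for a tiny linear model with squared-error loss; it is an illustration of the general idea, not the paper's exact modification procedure, and all names and numbers are invented.

```python
def finetune(weights, data, labels, penalized_idx, lam=5.0,
             lr=0.1, steps=200):
    """Gradient descent from the EXISTING weights (no retraining
    from scratch), with an extra lam * w[j]**2 penalty that pulls
    the flagged feature's weight toward zero. `lam` is the
    balancing knob mentioned above: too high and accuracy suffers."""
    w = list(weights)
    for _ in range(steps):
        for j in range(len(w)):
            # Gradient of mean squared error w.r.t. w[j].
            grad = sum(
                2 * (sum(wi * xi for wi, xi in zip(w, x)) - y) * x[j]
                for x, y in zip(data, labels)
            ) / len(data)
            if j == penalized_idx:
                grad += 2 * lam * w[j]  # fairness penalty term
            w[j] -= lr * grad
    return w

# Feature 1 is the flagged proxy; start from a "biased" model
# that relies on it entirely.
data = [[1.0, 1.0], [1.0, -1.0]]
labels = [1.0, 0.0]
w = finetune([0.0, 1.0], data, labels, penalized_idx=1)
print(w)  # the penalized weight shrinks sharply toward zero
```

Because training starts from the existing weights rather than from scratch, the model's original "personality" survives; the penalty strength `lam` is exactly the careful balancing act the authors describe.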
The Big Takeaway
This paper is like a new rulebook for AI ethics. It says:
"Don't just check if the AI gives the right answer. Check how it got there. If the AI is using a secret, biased rulebook to make decisions, it's not fair, even if the final numbers look okay."
By using their new "X-Ray" tool, we can spot these hidden biases in the decision-making process and fix them, ensuring that AI treats everyone with the same logic, not just the same result.