Fairness under Graph Uncertainty: Achieving Interventional Fairness with Partially Known Causal Graphs over Clusters of Variables

This paper proposes a learning framework that achieves interventional fairness under limited causal knowledge by leveraging a graph over variable clusters and minimizing the worst-case distributional discrepancy using a computationally efficient barycenter kernel maximum mean discrepancy (MMD).

Yoichi Chikahara

Published 2026-03-02

Imagine you are a hiring manager at a company. You want to pick the best candidate for a job, but you also want to make sure you aren't discriminating against people based on their gender or race. This is the classic problem of Algorithmic Fairness.

Usually, computers make these decisions by looking at a person's resume (features) and predicting if they will be good at the job. But here's the catch: sometimes a resume contains hidden clues about a person's race or gender that lead to unfair bias, even if the computer doesn't "know" it's doing so.

To fix this, researchers use Causal Graphs. Think of a causal graph as a family tree of cause-and-effect. It maps out how one thing leads to another. For example:

  • Gender → Education → Job Score.
    If the computer sees "Education" and "Job Score," it might unfairly penalize someone because their gender influenced their education. To be truly fair, we need to "intervene" in this family tree—imagine magically changing a person's gender in a simulation to see if their job score would still be the same. If the score changes, the system is unfair.
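The "what-if" test above can be sketched as a toy simulation. Everything here is hypothetical: the structural equations, coefficient values, and variable names are made up for illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_score(gender, n=10_000):
    # Toy structural equations for the causal chain
    # gender -> education -> job score (coefficients are invented).
    education = 12 + 2.0 * gender + rng.normal(0, 1, n)
    score = 0.5 * education + rng.normal(0, 1, n)
    return score

# "Intervene": magically set gender to each value and re-run the world.
score_if_0 = simulate_score(gender=0)
score_if_1 = simulate_score(gender=1)

# If the score distribution shifts under the intervention, the
# predictor inherits the gender -> education -> score pathway.
gap = abs(score_if_0.mean() - score_if_1.mean())
print(f"interventional gap: {gap:.2f}")  # close to 1.0 (= 0.5 * 2.0 pathway effect)
```

A gap near zero would mean the score is interventionally fair with respect to gender in this toy world; here it is not, because gender reaches the score through education.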

The Problem: The Map is Missing

The big issue with current methods is that they require a perfect, detailed map of this family tree. They need to know exactly how every single variable connects to every other variable.

  • The Reality: In the real world, we rarely have this perfect map. We might know that "Education" and "Job Score" are related, but we don't know the exact path. Trying to draw the whole map from scratch is like trying to map every single street in a massive city just by looking at a few satellite photos. It's slow, expensive, and often leads to mistakes.

The Solution: The "Cluster" Shortcut

This paper proposes a clever workaround. Instead of trying to map every single street (variable), the authors suggest grouping related streets into neighborhoods (clusters).

  • The Old Way: Try to map every single house and street in the city. (Hard, slow, prone to errors).
  • The New Way: Group the city into neighborhoods. Map the roads between neighborhoods. (Much easier, faster, and more robust).

In this paper, the "neighborhoods" are groups of variables (like grouping "Education" and "Work Experience" into one "Background" cluster). The authors show that figuring out the connections between these clusters is much easier than figuring out connections between individual variables.
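The neighborhood idea can be made concrete: a cluster-level graph has far fewer nodes than a variable-level one, so listing the routes bias could travel becomes a tiny search. The cluster names and edges below are hypothetical, chosen purely for illustration.

```python
# Variables grouped into clusters; edges are drawn between clusters,
# not between individual variables (all names are made up).
clusters = {
    "Sensitive":  ["gender"],
    "Background": ["education", "work_experience"],
    "Outcome":    ["job_score"],
}
cluster_edges = {
    "Sensitive":  ["Background", "Outcome"],
    "Background": ["Outcome"],
    "Outcome":    [],
}

def paths(graph, src, dst, prefix=()):
    # Enumerate every directed path src -> dst in the cluster graph.
    prefix = prefix + (src,)
    if src == dst:
        return [prefix]
    return [p for nxt in graph[src] for p in paths(graph, nxt, dst, prefix)]

print(paths(cluster_edges, "Sensitive", "Outcome"))
# [('Sensitive', 'Background', 'Outcome'), ('Sensitive', 'Outcome')]
```

With three clusters there are only two routes to check, even though the underlying variable-level graph could contain many more.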

How It Works: The "Worst-Case" Safety Net

Since the map of the neighborhoods isn't perfect (we don't know exactly what's happening inside every neighborhood), the authors use a safety-first strategy.

  1. Guess the Paths: They look at their "cluster map" and list all the possible ways the bias could travel from the sensitive attribute (like gender) to the decision.
  2. The "What-If" Test: They simulate the hiring process for every possible path on their list.
  3. The Penalty: If the computer makes a decision that looks unfair in even one of these possible scenarios, the system gets a "penalty." It's like a teacher grading a student: if the student's answer is wrong in any possible interpretation of the question, they lose points.
  4. The Fix: The computer learns to adjust its decisions until it passes the fairness test for all possible scenarios simultaneously.
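The penalty in steps 3-4 boils down to charging the model for its single worst scenario. A minimal sketch with made-up unfairness scores and a hypothetical penalty weight `lam`:

```python
def training_loss(prediction_loss, unfairness_per_scenario, lam=10.0):
    # The model pays for the *worst* candidate scenario, so it only
    # scores well if it is fair under every possible causal structure.
    return prediction_loss + lam * max(unfairness_per_scenario)

# Hypothetical numbers: one unfairness score per possible bias path.
loss = training_loss(prediction_loss=0.30,
                     unfairness_per_scenario=[0.02, 0.15, 0.07])
print(f"{loss:.2f}")  # 1.80 = 0.30 + 10 * 0.15
```

Because the maximum dominates the penalty, lowering the average unfairness is not enough; training must drive down the worst case.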

The Secret Sauce: The "Barycenter"

To make this math run fast, the authors introduce a new way to measure "unfairness": the barycenter kernel MMD (maximum mean discrepancy).

  • The Analogy: Imagine you have 100 different groups of people (split by race, gender, and their combinations). Comparing every group against every other group means thousands of pairwise checks. Instead, you summarize each group by its "average person," and compute the "grand average" of all groups (the barycenter).
  • Then, you only measure how far each group's "average person" is from that barycenter: 100 comparisons instead of roughly 5,000 pairwise ones.
  • This is like comparing the average height of a basketball team and a gymnastics team to the league-wide average, rather than measuring every player against every other player. It's much faster and scales well even when there are many groups to check.
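Under the hood, the "average person" is a kernel mean embedding. Below is a minimal sketch of the idea using random Fourier features to approximate an RBF kernel; this approximation is my choice for illustration, not necessarily the paper's implementation. Each group gets one embedding vector, and the statistic is the total squared distance from each group to the barycenter, so the cost grows with the number of groups rather than the number of group pairs.

```python
import numpy as np

def random_features(x, W, b):
    # Random Fourier features: <phi(x), phi(y)> approximates an
    # RBF kernel k(x, y), so averaging phi gives a mean embedding.
    return np.sqrt(2.0 / W.shape[0]) * np.cos(x @ W.T + b)

def barycenter_mmd(groups, dim=2, D=500, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(D, dim))       # random frequencies
    b = rng.uniform(0, 2 * np.pi, D)    # random phases
    # One "average person" (mean embedding) per group.
    mus = np.stack([random_features(g, W, b).mean(axis=0) for g in groups])
    mu_bar = mus.mean(axis=0)           # the barycenter
    # Sum of squared distances from each group to the barycenter.
    return float(((mus - mu_bar) ** 2).sum())

rng = np.random.default_rng(1)
same = [rng.normal(0, 1, (200, 2)) for _ in range(4)]
shifted = same[:3] + [rng.normal(3, 1, (200, 2))]
print(barycenter_mmd(same) < barycenter_mmd(shifted))  # True
```

When all groups come from the same distribution the statistic stays near zero; shifting even one group pulls it away from the barycenter and the statistic grows, flagging the disparity.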

The Results

The authors tested this on synthetic data and real-world datasets (such as credit scoring and hiring).

  • Accuracy vs. Fairness: Usually, making a system fairer makes it less accurate (like a strict teacher who fails everyone to be "fair").
  • The Winner: Their method found the sweet spot. It was fairer than other methods that tried to guess the full map, and it was more accurate than methods that just ignored the sensitive data entirely.

In a Nutshell

This paper teaches us that when we don't have a perfect map of the world, we shouldn't try to draw one from scratch. Instead, we should group things together, map the big picture, and then use a safety net to ensure that no matter how the details turn out, our decisions remain fair. It's a smarter, faster, and more practical way to build AI that doesn't discriminate.
