Original authors: Yingqi Zhao, Vasilis Efthymiou, Jyrki Nummenmaa, Kostas Stefanidis

Published 2026-05-18 (author reviewed)

Original paper licensed under CC BY 4.0 (http://creativecommons.org/licenses/by/4.0/). This is an AI-generated explanation of the paper below. It is not written by the authors. For technical accuracy, refer to the original paper.

Imagine you have a very smart but sometimes biased assistant (a Large Language Model) who is great at writing stories and answering questions. However, this assistant sometimes makes things up or leans too heavily toward one side of an argument. To fix this, you give the assistant a library of books (Retrieval-Augmented Generation, or RAG) to read before answering. The idea is that the books will provide the facts, and the assistant will just summarize them.

But here's the catch: The librarian who picks the books is also biased. If the librarian only hands the assistant books from one political party or only about men, the assistant will write answers that are biased, even if the assistant itself is trying to be fair.

This paper proposes a new way to play the role of the "Librarian" so that the assistant gives fair answers. Here is how the authors do it, broken down into three simple steps:

1. The "Controlled Mix" (Stage 1)

Imagine you have two piles of books: one pile has "Left-leaning" views, and the other has "Right-leaning" views (or one pile is about men, the other about women).

The Old Way: You just grab the top 5 books that seem most relevant. If the top 5 happen to be all from the "Left" pile, your answer will be biased.
The New Way: The authors introduce a "mixing machine" (a reranker). Before handing the books to the assistant, this machine deliberately shuffles them. If you ask for 5 books, you can set how likely each one is to come from the Left pile or the Right pile, so the stack might hold 3 from the Left and 2 from the Right, or vice versa. This gives you control over the mix of opinions in the stack, without needing to rewrite the books themselves.

2. The "Seat at the Table" (Stage 2)

The researchers discovered something interesting: It matters where the books are placed in the stack.
Think of the stack of books as a row of people sitting at a long table. The assistant (the AI) pays more attention to the people sitting at the head of the table than the people at the very end.

They ran experiments to see how much influence each "seat" (position 1, position 2, etc.) has on the final answer.
They found a simple, roughly straight-line relationship. For a model that favors early seats, putting a "Right-leaning" book in seat #1 pulls the answer strongly to the right, while putting it in seat #5 pulls the answer much less. How much each seat counts differs from one AI model to another.
They built a mathematical model (a "bias propagation map") that estimates how much the final answer will be swayed based on which books are in which seats.

3. The "Fairness Optimizer" (Stage 3)

Now that they know how to mix the books and how much each seat matters, they created a smart calculator (called FARO) to solve the ultimate puzzle.

The Goal: Pick the best 5 books that are most relevant to the question AND ensure the final answer isn't biased.
The Problem: If you try to check every possible combination of books for every question, it takes forever (like trying to solve a giant Sudoku puzzle for every single question).
The Solution (FARO): The authors invented a shortcut. Instead of solving one giant, impossible puzzle, they broke it down into many small, easy puzzles (one for each question). They use a clever math trick to turn the "fairness" requirement into a simple adjustment.
The Result: The system quickly finds a well-balanced mix of books. It might sacrifice a tiny bit of relevance (skipping the single most relevant book) so that the final answer stays close to balanced between the two groups.

The Bottom Line

The paper shows that by carefully controlling which documents are retrieved and where they are placed in the list, you can stop the AI from being biased without needing to retrain the AI itself.

What they proved: Their method works on different types of AI models and for different topics (like politics and gender).
The Trade-off: You can choose how strict you want to be. You can say, "I want the answer to be 100% fair," or "I want it to be mostly fair but keep the relevance high." Their tool lets you slide between these options easily.
The Limit: If the AI itself is extremely biased (like a person who refuses to listen to the other side no matter what), the tool can only do so much. But for most cases, it successfully balances the scales.

In short, they built a "Fair Librarian" that knows exactly how to arrange the books on the shelf so the AI reads a balanced story.

Technical Summary: Fairness-Aware Retrieval Optimization for Retrieval-Augmented Generation

1. Problem Statement

Retrieval-Augmented Generation (RAG) enhances Large Language Models (LLMs) by incorporating external knowledge, yet the retrieval process itself can introduce or amplify bias that propagates to the final generated output. While existing research addresses bias in LLMs (via prompt engineering or fine-tuning) and fairness in ranking systems (via exposure constraints), these domains remain largely disconnected. A critical gap exists in understanding how bias propagates from retrieved documents to generated outputs, particularly in top-k RAG settings where multiple documents jointly influence generation.

Current approaches often rely on black-box embedding models or fine-tuning, which are costly and difficult to control precisely. Furthermore, prior work on bias propagation has largely been limited to top-1 settings, assuming a linear relationship between a single document's bias and the output. This assumption fails to capture the complex, position-dependent interactions inherent in top-k retrieval, where documents at different ranks exert varying levels of influence on the LLM's generation. The core challenge is to design a retrieval strategy that balances relevance with fairness (statistical parity in generated outputs) without compromising the quality of the retrieved context.

2. Methodology

The authors propose a unified, three-stage framework for fairness-aware retrieval optimization in top-k RAG systems.

Stage 1: Controlled Bias Injection via Reranking

Instead of modifying the underlying retriever or fine-tuning embedding models, the framework employs a reranker-based mechanism to control the bias of retrieved documents.

Mechanism: The knowledge base is partitioned into group-specific subsets (e.g., liberal vs. conservative, male vs. female). For a given query, candidate documents are retrieved from these subsets.
Control: A probabilistic reranker selects and orders documents based on a parameter $m$, which dictates the probability of choosing a document from a specific group. This allows for precise manipulation of the embedding bias ($E_b$) at each position $p$ in the top-$k$ list, denoted as $E_b^p$, without altering the base retrieval model.
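The reranking step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `probabilistic_rerank`, the two pre-sorted candidate lists, and the fallback when one group runs out are all assumptions made for the example. Only the idea of drawing each slot from one group with probability $m$ comes from the summary.

```python
import random

def probabilistic_rerank(group_a, group_b, k, m, rng=None):
    """Build a top-k list, drawing each slot from group_a with probability m.

    group_a and group_b are candidate lists already sorted by relevance
    (best first). If the chosen group is empty, the other group is used
    (an assumption made for this sketch).
    """
    rng = rng or random.Random()
    a, b = list(group_a), list(group_b)  # copy so the inputs are not mutated
    ranked = []
    while len(ranked) < k and (a or b):
        take_a = rng.random() < m
        if take_a and not a:
            take_a = False
        elif not take_a and not b:
            take_a = True
        source = a if take_a else b
        ranked.append(source.pop(0))  # most relevant remaining document
    return ranked

# Hypothetical document identifiers for two groups.
left = ["L1", "L2", "L3"]
right = ["R1", "R2", "R3"]
mixed = probabilistic_rerank(left, right, k=5, m=0.5, rng=random.Random(0))
```

Setting `m` to 1.0 or 0.0 fills the list from a single group first, while values in between produce a mix whose composition varies from draw to draw.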

Stage 2: Position-Aware Bias Propagation Modeling

The framework models how bias propagates from the retrieved context to the final output.

Linear Approximation: Building on the observation that bias propagation is approximately linear in top-1 settings, the authors extend this to top-$k$ by assuming additivity and conditional independence. The system-level output bias ($R_b$) is modeled as a weighted sum of position-wise embedding biases:
$R_b = \sum_{p=1}^{k} w_p \cdot E_b^p + L_b + \epsilon$
Here $w_p$ is the position-dependent weight (the sensitivity of the LLM to bias at rank $p$), $L_b$ is the intrinsic bias of the generator, and $\epsilon$ is a residual term.
Estimation: The weights $w_p$ are estimated via controlled perturbations. By systematically varying the bias values at different positions and measuring the resulting output bias, a linear regression model is fitted to capture the specific attention patterns of different LLMs.
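The estimation step can be illustrated with an ordinary least-squares fit on synthetic data. This is a sketch under stated assumptions: the "true" weights, the generator bias of 0.05, the noise level, and the helper names are invented for the example, and the summary does not say which regression routine the authors use. The fit is written in plain Python (normal equations plus Gaussian elimination) to stay dependency-free.

```python
import random

def solve_linear_system(a, y):
    """Solve a x = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    aug = [row[:] + [y[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x

def fit_position_weights(position_biases, output_biases):
    """Least-squares fit of R_b = sum_p w_p * E_b^p + L_b.

    Returns (weights, intercept); the intercept plays the role of L_b.
    """
    rows = [list(e) + [1.0] for e in position_biases]  # intercept column
    d = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    xty = [sum(r[i] * y for r, y in zip(rows, output_biases)) for i in range(d)]
    coef = solve_linear_system(xtx, xty)
    return coef[:-1], coef[-1]

# Synthetic "controlled perturbation" data (all values are made up).
rng = random.Random(0)
true_w, true_lb = [0.5, 0.3, 0.1], 0.05
E = [[rng.uniform(-1, 1) for _ in true_w] for _ in range(200)]
R = [sum(w * e for w, e in zip(true_w, row)) + true_lb + rng.gauss(0, 0.01)
     for row in E]
w_hat, lb_hat = fit_position_weights(E, R)
```

With enough varied perturbations, `w_hat` should land close to the weights used to generate the data, and `lb_hat` close to the simulated generator bias.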

Stage 3: Fairness-Aware Retrieval Optimization (FARO)

The final stage formulates retrieval as an optimization problem to balance relevance and fairness.

Objective: Maximize total relevance while ensuring the system-level bias $|R_b|$ remains within a predefined tolerance $\tau$.
Challenge: A direct formulation leads to a combinatorial problem that is computationally expensive and couples all questions, preventing parallelization.
Solution (FARO): FARO handles the fairness requirement through a quadratic fairness penalty combined with a dual hyperplane approximation:
- They reformulate the hard fairness constraint into a soft objective using a quadratic penalty term ( $-\lambda R_b^2$ ).
- Using the Fenchel–Legendre dual representation, the quadratic term is approximated by a family of linear surrogates parameterized by $\theta$ (or $\mu$).
- This transformation decomposes the global, coupled optimization problem into independent per-question subproblems. Each subproblem is a standard linear assignment problem solvable efficiently via the Hungarian algorithm.
- By enumerating a set of $\mu$ values, the framework generates a Pareto frontier of solutions, allowing practitioners to select the optimal trade-off between relevance and fairness.
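The decomposition above can be sketched in a few lines. A standard conjugate identity for a quadratic is $-\lambda R_b^2 = \min_{\mu} \left( \frac{\mu^2}{4\lambda} - \mu R_b \right)$, so for a fixed $\mu$ the penalty becomes linear in $R_b$ and each question can be solved on its own. The paper's exact parameterization may differ. Everything concrete below is an assumption for illustration: per-document relevance scores, document biases of +1 or -1, the position weights, and the function names. Exhaustive search over small candidate sets stands in for the Hungarian algorithm, which would be the practical choice at scale.

```python
from itertools import permutations

def best_ranking(relevance, doc_bias, weights, mu):
    """Pick and order k documents maximizing sum_p (rel[d_p] - mu * w_p * b[d_p]).

    Brute force replaces the Hungarian algorithm here; it is only
    practical for small candidate sets.
    """
    k = len(weights)

    def score(order):
        return sum(relevance[d] - mu * weights[p] * doc_bias[d]
                   for p, d in enumerate(order))

    return max(permutations(range(len(relevance)), k), key=score)

def sweep(questions, weights, generator_bias, mus):
    """For each mu, solve every question independently and report totals."""
    frontier = []
    for mu in mus:
        total_rel, total_bias = 0.0, 0.0
        for relevance, doc_bias in questions:
            order = best_ranking(relevance, doc_bias, weights, mu)
            total_rel += sum(relevance[d] for d in order)
            total_bias += generator_bias + sum(
                weights[p] * doc_bias[d] for p, d in enumerate(order))
        # (mu, total relevance, mean predicted output bias)
        frontier.append((mu, total_rel, total_bias / len(questions)))
    return frontier

# Two hypothetical questions: (relevance per candidate, bias per candidate).
questions = [
    ([0.90, 0.80, 0.70, 0.60, 0.50], [1, 1, 1, -1, -1]),
    ([0.90, 0.85, 0.60, 0.55, 0.50], [-1, -1, 1, 1, -1]),
]
weights = [0.5, 0.3, 0.1]  # assumed position weights w_p
frontier = sweep(questions, weights, generator_bias=0.0, mus=[0.0, 0.5, 2.0])
```

A positive `mu` pushes the predicted bias downward and a negative `mu` pushes it upward, so a practical grid would likely cover both signs. Each entry of `frontier` is one candidate point on the relevance versus fairness trade-off.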

3. Key Contributions

Controlled Bias Injection Pipeline: A reranker-based approach that enables precise manipulation of embedding bias in retrieved documents without modifying the underlying retriever or requiring expensive fine-tuning.
Position-Aware Bias Propagation Model: A linear model that captures how documents at different retrieval positions jointly influence generation bias in top-k RAG systems, extending previous top-1 analyses.
Scalable Optimization Framework (FARO): A novel formulation that transforms a globally coupled fairness optimization problem into independent subproblems, enabling efficient computation and flexible exploration of the relevance–fairness trade-off.
Comprehensive Evaluation: Extensive experiments across multiple models (Llama, Gemma, Mistral, Qwen) and bias types (political, gender) validating the linear propagation model and the effectiveness of the optimization framework.

4. Experimental Results

The framework was evaluated on political and gender bias datasets using four open-source LLMs.

Bias Propagation Validation: Experiments confirmed a strong linear relationship between position-wise embedding bias and output bias across different models and $k$ values (top-2, top-3, top-5). The learned weights ($w_p$) revealed model-specific attention patterns (e.g., some models prioritize early positions, while others distribute attention more evenly).
Optimization Performance:
- Effectiveness: The FARO framework effectively mitigated generation bias, bringing output bias scores close to zero while maintaining competitive relevance.
- Scalability: Compared to a Linear Programming (LP) baseline, FARO demonstrated superior scalability, particularly as the number of documents and questions increased. While LP performance degraded with larger $k$, FARO maintained efficiency by decomposing the problem.
- Flexibility: FARO could generate multiple candidate solutions along the relevance–fairness frontier, allowing for dynamic adjustment to changing fairness constraints without re-running the entire optimization.
Limitations Observed: The effectiveness of bias mitigation was found to be dependent on the intrinsic bias of the underlying LLM. Models with strong inherent biases (e.g., Qwen) showed limited improvement, as retrieval alone could not fully correct the systematic offset. Additionally, in gender bias settings with skewed knowledge bases, the trade-off between fairness and relevance was more pronounced due to a lack of candidate documents for the underrepresented group.
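The flexibility point above, choosing among precomputed candidates when the tolerance changes, can be sketched as a simple filter. The candidate values below are invented for illustration and are not results from the paper; the function name and the dictionary layout are likewise assumptions.

```python
def pick_solution(frontier, tau):
    """Return the most relevant candidate with |bias| <= tau, or None."""
    feasible = [c for c in frontier if abs(c["bias"]) <= tau]
    if not feasible:
        return None  # no precomputed candidate meets this tolerance
    return max(feasible, key=lambda c: c["relevance"])

# Hypothetical precomputed candidates, one per mu value.
frontier = [
    {"mu": 0.0, "relevance": 4.75, "bias": 0.40},
    {"mu": 0.5, "relevance": 4.60, "bias": 0.12},
    {"mu": 1.0, "relevance": 4.30, "bias": 0.02},
]
loose = pick_solution(frontier, tau=0.15)
strict = pick_solution(frontier, tau=0.05)
```

Tightening `tau` moves the choice toward lower-relevance, lower-bias candidates without solving anything again. A `None` result corresponds to the limitation noted above: when the generator's own offset is large, no retrieved mix may bring the bias inside the tolerance.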

5. Significance and Claims

The paper claims to provide a principled and scalable approach for fairness-aware retrieval in RAG systems. Its significance lies in:

Decoupling Bias Control from Retrieval: Offering a lightweight post-processing mechanism that does not require retraining retrieval models.
Bridging the Gap: Connecting the fields of LLM bias and fairness-aware ranking by explicitly modeling how ranking decisions affect downstream text generation.
Practical Applicability: Providing a tractable solution (FARO) that balances the theoretical rigor of optimization with the computational constraints of real-world RAG applications.

The authors conclude that while their linear model and binary fairness definition are simplifications, they offer a robust foundation for controlling bias in multi-document RAG pipelines. They acknowledge that future work is needed to address non-linear interactions, multi-group fairness, and adaptive strategies for varying question distributions.

Fairness-Aware Retrieval Optimization for Retrieval-Augmented Generation