From Privacy to Trust in the Agentic Era: A Taxonomy of Challenges in Trustworthy Federated Learning Through the Lens of Trust Report 2.0

This paper proposes a requirement-driven taxonomy and coordination blueprint for Trustworthy Federated Learning in the agentic era, introducing the "Trust Report 2.0" as a privacy-preserving artifact to operationalize trust as a dynamic, system-level condition rather than a static privacy guarantee.

Nuria Rodríguez-Barroso, Mario García-Márquez, M. Victoria Luzón, Francisco Herrera

Published 2026-03-05

The Big Idea: From "Secret Keeping" to "Building Trust"

Imagine a group of doctors in different hospitals who want to build a super-smart AI to help diagnose cancer. They have a problem: they can't share their patients' private records with each other because of privacy laws.

Federated Learning (FL) is the solution they invented. Instead of sending patient data to a central server, they send the AI to the hospitals. The AI learns locally and sends back only the "lessons" (model updates, i.e., math), never the patient records themselves.

The Old Problem: For years, everyone thought, "If we keep the data private, we are safe. If the data is safe, the system is trustworthy."
The New Reality: The authors say, "Wait a minute. Just because the data is private doesn't mean the AI is behaving well."

Imagine a group of chefs cooking a giant stew together without mixing their ingredients in one pot. They just send back the taste of their spoonfuls.

  • Privacy ensures no one steals the secret recipes.
  • Trustworthiness ensures no one is secretly adding poison to their spoon, no one is lying about how good their soup tastes, and no one is changing the recipe while the stew is cooking without telling the head chef.

This paper argues that in the new era of Agentic AI (AI that can make its own decisions, like a smart robot butler), we need to move beyond just "hiding the data" and start "proving the system is good."


The Core Metaphor: The "Learning Plane" vs. The "Control Plane"

The authors introduce a crucial distinction to understand modern AI systems. Imagine a car:

  1. The Learning Plane (The Engine): This is where the AI actually learns. It's the engine turning over, processing data, and getting better at driving.
  2. The Control Plane (The Steering Wheel & Dashboard): This is where decisions are made. Who is driving? When do we stop? What route are we taking? Do we trust the GPS?

The Paper's Insight: In the past, we only worried about the engine (is it running smoothly?). But now, with "Agentic AI," the car can decide to change its own destination or speed up on its own. If the Control Plane is broken (e.g., the AI decides to drive off a cliff because it misunderstood a sign), it doesn't matter how good the engine is. The system is untrustworthy.
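The split can be pictured as two loops working together: the learning plane computes updates, while the control plane decides whether each round is allowed to proceed. Here is a minimal toy sketch of that idea; the function names and the magnitude-based policy are illustrative inventions, not anything specified in the paper:

```python
def learning_step(model: float, update: float) -> float:
    """Learning plane: apply one aggregated update (toy 1-D 'model')."""
    return model + update

def control_plane_allows(update: float, max_magnitude: float = 1.0) -> bool:
    """Control plane: a hypothetical policy that rejects suspicious updates."""
    return abs(update) <= max_magnitude

model = 0.0
for update in [0.1, 0.2, 5.0, 0.1]:  # 5.0 plays the "poisoned spoonful"
    if control_plane_allows(update):
        model = learning_step(model, update)
    else:
        print(f"control plane blocked update {update}")
# The oversized update never reached the model; only 0.1 + 0.2 + 0.1 did.
```

The point of the separation is that the engine (the update math) never gets to overrule the steering wheel (the policy): even a perfectly efficient learning step is discarded if the control plane says no.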

The "Trust Report 2.0": The Flight Recorder

To fix this, the authors propose a new tool called the Trust Report 2.0.

Think of this like a Flight Recorder (Black Box) for the AI, but instead of just recording crashes, it records decisions.

  • Old Way: "Here is the final model. It is 95% accurate. Trust us."
  • New Way (Trust Report 2.0): "Here is the log.
    • Decision: We decided to stop training because the data looked weird.
    • Reason: The AI noticed a pattern that didn't make sense (Drift).
    • Who approved it: The human doctor.
    • Privacy Check: We didn't look at any patient names.
    • Result: We are safe to continue."

This report is lightweight (it doesn't reveal secrets) but auditable (you can check the math to see if they are telling the truth).
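A log entry like the one above could be captured as a small structured record. This is a minimal sketch under an invented schema (the paper does not publish field names); the hash is one simple way to make each entry tamper-evident while keeping it free of raw patient data:

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass
class TrustReportEntry:
    """One round of a hypothetical Trust Report 2.0 log."""
    round_id: int
    decision: str        # e.g. "halt_training"
    reason: str          # e.g. "distribution drift detected"
    approved_by: str     # the human in the loop
    privacy_check: bool  # no raw records were inspected
    outcome: str         # e.g. "safe_to_continue"

    def digest(self) -> str:
        # A stable hash lets an auditor verify the entry was not altered
        # after the fact, without the entry containing any secrets.
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

entry = TrustReportEntry(
    round_id=12,
    decision="halt_training",
    reason="drift detected in incoming updates",
    approved_by="dr_smith",
    privacy_check=True,
    outcome="safe_to_continue",
)
print(entry.digest()[:8])  # short fingerprint for the audit trail
```

Note what the record does and does not hold: decisions, reasons, and sign-offs, but no gradients and no patient identifiers, which is exactly the "lightweight but auditable" balance the authors are after.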

The 7 Pillars of Trust (The "Trust Checklist")

The paper organizes the challenges into 7 categories, based on European guidelines for ethical AI. Here is how they translate to our "Cooking Stew" analogy:

  1. Human Agency (The Chef's Oversight): Can a human step in and say "Stop!" if the AI is doing something crazy? In a distributed system, it's hard to know who is in charge.
  2. Robustness (The Poison Test): What if a bad actor tries to poison the stew? The system needs to be tough enough to ignore the poison.
  3. Privacy (The Locked Recipe Box): We know this one. But it's not just about locking the box; it's about making sure the AI doesn't accidentally whisper the recipe while it's cooking.
  4. Transparency (The Open Kitchen): Can we see why the AI made a decision? If it's a "black box," we can't trust it in a hospital.
  5. Fairness (The Equal Spoon): Does the AI treat everyone equally? If the AI only learns from big hospitals, will it work for small rural clinics?
  6. Societal Well-being (The Carbon Footprint): Is this AI too hungry? Does it use too much electricity to cook the stew?
  7. Accountability (The Name Tag): If the AI makes a mistake and hurts a patient, who is responsible? The hospital? The software maker? The AI itself? We need to know who to blame.
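The checklist framing suggests a simple aggregation rule: a training round is trustworthy only if every pillar's check passes, since a single poisoned spoonful spoils the stew. A toy sketch (the pillar names follow the list above; the pass/fail checks are placeholders for whatever audits a real deployment would run):

```python
PILLARS = [
    "human_agency", "robustness", "privacy", "transparency",
    "fairness", "societal_wellbeing", "accountability",
]

def round_is_trustworthy(checks: dict) -> bool:
    """All seven pillars must pass: trust is conjunctive, not an average."""
    return all(checks.get(pillar, False) for pillar in PILLARS)

checks = {pillar: True for pillar in PILLARS}
print(round_is_trustworthy(checks))   # every pillar passes
checks["fairness"] = False            # one failing pillar...
print(round_is_trustworthy(checks))   # ...sinks the whole round
```

The design choice worth noticing is `all(...)` rather than a score: you cannot trade a fairness failure against excellent privacy, which matches the paper's view of trust as a system-level condition rather than a single metric.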

The Stress Test: Cancer Research (Oncology)

The authors test their ideas on Cancer Research. This is the ultimate "stress test" because:

  • High Stakes: A mistake can kill someone.
  • Strict Rules: Privacy laws are super tight.
  • Changing Data: Cancer treatments change, and patient data changes over time.

They show that in this high-risk environment, you can't just say "We have privacy." You need the Trust Report to prove that the AI is being monitored, that humans are in the loop, and that if the AI starts acting weird, it gets shut down safely.

The Takeaway: Trust is a Habit, Not a Label

The main message of the paper is this: Trust is not a badge you put on a finished product.

  • Old View: "We built a secure AI. Here is the certificate. It is trustworthy."
  • New View: "Trust is a continuous habit. We check the AI every day. We log every decision. We have humans watching the steering wheel. We prove our trustworthiness every single round of training."

In the age of smart, autonomous AI, we don't just need to hide the data; we need to build a system where trust is proven, step-by-step, through clear rules and honest reporting.
