Imagine a group of doctors from different hospitals wanting to build a super-smart AI to diagnose diseases. They all have patient data, but due to privacy laws and ethical concerns, they cannot share the actual patient records with each other or a central server.
Federated Learning is the solution: instead of moving the data, they move the learning. Each doctor trains the AI on their own local computers and sends only the "lessons learned" (the model updates) to a central server, which combines them into one master AI.
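To make "combining the lessons" concrete, here is a minimal sketch of the simplest combination rule, plain weighted federated averaging. This is the unprotected baseline idea, not the paper's encrypted protocol; the function name and weights are illustrative.

```python
import numpy as np

def federated_average(updates, weights=None):
    """Combine local model updates into one global update.

    A minimal sketch of plain federated averaging (FedAvg-style);
    illustration only, not VFEFL's encrypted aggregation.
    """
    updates = [np.asarray(u, dtype=float) for u in updates]
    if weights is None:
        weights = [1.0] * len(updates)
    total = sum(weights)
    # Weighted average: a hospital with more data can count for more.
    return sum(w * u for w, u in zip(weights, updates)) / total

# Three "doctors" each send a local update (their "lessons learned").
lessons = [[1.0, 2.0], [3.0, 2.0], [2.0, 2.0]]
global_update = federated_average(lessons)
print(global_update)  # [2. 2.]
```

In the plain version above, the server sees every individual update, which is exactly the privacy leak the paper sets out to close.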
However, this system has two big problems:
- Privacy Leaks: Even though they aren't sending patient records, the "lessons learned" can sometimes be reverse-engineered to reveal private details.
- The "Bad Apple" Problem: A malicious doctor (or a hacker pretending to be one) could send fake, corrupted lessons to ruin the master AI, making it useless or dangerous.
This paper introduces VFEFL, a new system that acts like a high-tech, unbreakable secure vault to solve both problems without needing a "super-trusted" third party.
Here is how it works, broken down with simple analogies:
1. The Problem with Current Systems
Imagine the doctors are sending their lessons in plain text (like a postcard).
- The Privacy Risk: Anyone who intercepts the postcard can read the secrets.
- The Trust Issue: To stop bad apples, current systems often require two separate servers that promise never to talk to each other (a "non-colluding" assumption). It's like hiring two different security guards and hoping they never conspire to steal the keys. This is hard to set up and expensive.
2. The VFEFL Solution: The "Magic Envelope"
The authors propose a system based on Verifiable Functional Encryption. Think of this as a Magic Envelope with three superpowers:
Power 1: The Locked Box (Privacy)
When a doctor sends their lesson, it goes into a Magic Envelope. The central server can use the lesson to update the master AI, but it cannot open the envelope to see the actual lesson or the patient data inside. It's like a bank teller who can add money to your account without ever seeing your ID or knowing your balance.

Power 2: The Self-Checking Seal (Verifiability)
Usually, if you lock a box, you have to trust the person who locked it. But what if they put a brick inside instead of gold?
VFEFL adds a Self-Checking Seal. Before the server accepts the envelope, it runs a mathematical test (a "Zero-Knowledge Proof"). This is like a seal that proves, "I promise this envelope contains a valid lesson, and I didn't swap it for a brick," without the server needing to open the envelope to check. If the seal is broken or fake, the server rejects it immediately.

Power 3: The "No Middleman" Rule (Self-Contained)
Most secure systems need two guards who don't talk to each other. VFEFL is different. It uses a clever mathematical trick (called Cross-Ciphertext Decentralized Verifiable Functional Encryption) where the doctors themselves help generate the keys. They don't need a "Super Trusted Third Party." The system works with just one server and the group of doctors. It's like a group of friends building a safe together where no single person holds the master key.
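The "Self-Checking Seal" of Power 2 has the flavor of a classical zero-knowledge protocol: prove you know something without revealing it. The toy below is a Schnorr-style identification round, chosen as an illustrative stand-in; the paper's actual proofs are about well-formed ciphertexts, not discrete logarithms, and the prime here is toy-sized, not secure.

```python
import secrets

p = 2**61 - 1   # a Mersenne prime; toy-sized for illustration, NOT secure
g = 3           # public base

x = secrets.randbelow(p - 1)   # the prover's secret (never sent)
y = pow(g, x, p)               # public value everyone can see

# --- one round of a Schnorr-style proof ---
r = secrets.randbelow(p - 1)
t = pow(g, r, p)               # prover's commitment ("the sealed envelope")
c = secrets.randbelow(p - 1)   # verifier's random challenge
s = (r + c * x) % (p - 1)      # prover's response; reveals nothing about x on its own

# The verifier checks the seal without ever learning the secret x:
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("proof accepted")
```

The check passes because g^s = g^r · (g^x)^c mod p, which the verifier can compute from public values alone; a prover who doesn't know x cannot answer a random challenge correctly.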
3. The "Smart Filter" (Robust Aggregation)
Even with the Magic Envelopes, a bad actor might try to send a valid-looking envelope that contains a lesson designed to crash the AI (like a "poison pill").
The paper introduces a new Aggregation Rule (a way of mixing the lessons):
- The Baseline: The server has a small, clean dataset (like a "Gold Standard" textbook) and creates a "Baseline Model" (the ideal lesson).
- The Compass: When a new lesson arrives, the system checks: "Does this lesson point in the same direction as the Gold Standard?"
- The Magnitude Check: It also checks: "Is this lesson too huge?" (Bad actors often try to overwhelm the system by sending massive, distorted updates).
- The Result: If a lesson points the wrong way or is too huge, the system shrinks it or ignores it. It's like a filter that only lets through water that flows in the right direction and isn't a tsunami.
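The compass-and-magnitude filter above can be sketched in a few lines. This is a hypothetical reading of the rule (a cosine-direction test against the baseline, plus norm clipping); the paper's actual aggregation rule may differ in its details, and the function name is illustrative.

```python
import numpy as np

def filter_and_aggregate(updates, baseline, clip=None):
    """Toy robust aggregation: drop updates pointing away from the
    baseline ("Gold Standard") direction and shrink oversized ones.

    Illustrative sketch only, not the paper's exact rule.
    """
    baseline = np.asarray(baseline, dtype=float)
    if clip is None:
        clip = np.linalg.norm(baseline)  # cap lengths at the baseline's
    kept = []
    for u in updates:
        u = np.asarray(u, dtype=float)
        # The "Compass": does the lesson point the same way as the baseline?
        cos = u @ baseline / (np.linalg.norm(u) * np.linalg.norm(baseline) + 1e-12)
        if cos <= 0:
            continue                     # wrong direction: ignore it
        # The "Magnitude Check": is the lesson too huge?
        norm = np.linalg.norm(u)
        if norm > clip:
            u = u * (clip / norm)        # shrink it to the cap
        kept.append(u)
    return np.mean(kept, axis=0)

baseline = [1.0, 0.0]
updates = [[0.9, 0.1],     # honest lesson: kept as-is
           [-5.0, 0.0],    # poisoned: points the wrong way, dropped
           [100.0, 0.0]]   # poisoned: a "tsunami", clipped to length 1
result = filter_and_aggregate(updates, baseline)
print(result)  # ≈ [0.95 0.05]
```

The opposite-direction update is discarded entirely, while the oversized one survives only after being shrunk, so no single bad actor can drag the master AI far off course.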
4. The Results: What Happened?
The authors tested this system with real data (like handwritten digits and fashion items).
- Privacy: The server learned nothing about any individual doctor's data beyond the combined result it was supposed to compute.
- Security: Even when 20% of the doctors were "bad actors" trying to ruin the AI, the system successfully filtered them out. The final AI remained accurate.
- Efficiency: The overhead stayed practical. While the math is complex, the system runs fast enough to be used in the real world.
Summary
VFEFL is like a secure, self-policing classroom where students (clients) submit their homework (models) in locked, self-verifying boxes. The teacher (server) can grade the class and improve the curriculum without ever seeing the individual homework, and the system automatically kicks out anyone trying to cheat or sabotage the class, all without needing a principal to watch over the teacher.
It solves the privacy vs. security dilemma by using advanced math to create a system that is private, robust, and doesn't rely on trusting anyone else.