LEA: Label Enumeration Attack in Vertical Federated Learning

This paper introduces LEA, a novel and practical Label Enumeration Attack on Vertical Federated Learning. LEA bypasses the need for auxiliary data by combining clustering with first-round loss-gradient similarity to efficiently enumerate label mappings, overcomes the computational bottleneck through a Binary-LEA optimization that reduces complexity from n! to n^3, and remains effective against common defense mechanisms.

Wenhao Jiang, Shaojing Fu, Yuchuan Luo, Lin Liu

Published 2026-03-05

Imagine a group of friends trying to solve a mystery together, but they are all in different rooms and can't share their full notebooks. This is Vertical Federated Learning (VFL).

  • The Setup: Everyone has different clues about the same set of suspects (samples).
    • Friend A (The Active Party): Has the "Answer Key" (the labels, like "Guilty" or "Innocent"), but no other clues.
    • Friend B (The Passive Party): Has a bunch of clues (features like height, shoe size, alibi), but no answer key.
  • The Goal: They want to build a super-smart detective AI together without Friend A ever showing the Answer Key to Friend B, and without Friend B showing their raw clues to Friend A.

The Problem: The Sneaky Spy

The paper introduces a new trick called LEA (Label Enumeration Attack). It's like a spy (Friend B) trying to figure out the Answer Key just by watching how the group solves the puzzle, without ever seeing the key itself.

Previous spy tricks had a big flaw: the spy needed a copy of the Answer Key for a few suspects beforehand to make the guessing work. If the spy had zero prior knowledge, they were stuck.

LEA changes the game. The spy doesn't need any prior knowledge. They just need to be smart about how they group the suspects.

How the Attack Works (The "Guessing Game" Analogy)

Imagine the suspects are people at a party, and the Answer Key is their favorite ice cream flavor (Chocolate, Vanilla, or Strawberry). The spy (Passive Party) sees everyone's outfits but doesn't know their flavors.

Step 1: The Grouping (Clustering)
The spy looks at the outfits and says, "Okay, these 10 people are wearing all black, these 10 are in bright neon, and these 10 are in suits." They group them into three piles.

  • The Spy's Intuition: "People who dress similarly probably like the same ice cream."
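The grouping step above is ordinary unsupervised clustering. As a minimal sketch (a toy 1-D k-means on made-up "outfit" values; the real attack would cluster the passive party's actual feature vectors, and every name here is illustrative):

```python
# Step 1 sketch: group unlabeled samples by feature similarity.
# Toy pure-Python 1-D k-means; starting centroids are assumed given.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            groups[idx].append(p)
        # Move each centroid to the mean of its group.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]
    return groups, centroids

# Toy data: three well-separated "styles" (black, neon, suits).
points = [0.1, 0.2, 0.15, 5.0, 5.1, 4.9, 9.8, 10.0, 10.2]
groups, centroids = kmeans_1d(points, centroids=[0.0, 5.0, 10.0])
print([len(g) for g in groups])  # → [3, 3, 3]
```

The attack only needs the piles to be pure (each pile mostly one hidden label), not the labels themselves.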

Step 2: The "What If" Simulation (Enumeration)
Now, the spy plays a massive game of "What If."

  • Scenario A: Maybe the Black pile likes Chocolate, Neon likes Vanilla, Suits like Strawberry.
  • Scenario B: Maybe Black likes Vanilla, Neon likes Strawberry, Suits like Chocolate.
  • ...and so on.

If there are 10 flavors, there are millions of ways to assign flavors to these piles. The spy creates a million "fake detectives" (simulated models), each one assuming a different flavor assignment.
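Each "scenario" is one assignment of flavors to piles, i.e. a permutation, which is why the search space is n! for n classes. A quick sketch of the counting (names are illustrative):

```python
import itertools, math

# Step 2 sketch: a label hypothesis assigns one flavor to each pile,
# so the search space is all permutations — n! hypotheses for n classes.
flavors = ["Chocolate", "Vanilla", "Strawberry"]
piles = ["Black", "Neon", "Suits"]

hypotheses = [dict(zip(piles, perm))
              for perm in itertools.permutations(flavors)]
print(len(hypotheses))        # → 6 scenarios (3!) for 3 piles
print(math.factorial(10))     # → 3628800 scenarios once there are 10 flavors
```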

Step 3: The "First Step" Test (The Secret Sauce)
Here is the clever part. The spy doesn't wait for the fake detectives to finish their whole training (which would take forever). They just watch them take one single step of learning.

  • The spy watches the real group take one step.
  • The spy watches all the fake groups take one step.
  • The Magic: The fake detective whose "one step" looks most similar to the real group's step is the winner!

Why? Because if the spy guessed the ice cream flavors correctly, that fake detective will react to the clues in the exact same way the real detective does. If the guess is wrong, the reaction will be totally different.
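The "one step" test can be sketched end to end. This is a hedged toy version, not the paper's actual model: a 1-D linear predictor with squared loss stands in for the VFL network, and the spy picks the permutation whose first gradient step is most similar (by cosine similarity) to the observed real one.

```python
import itertools, math

# Step 3 sketch: score each label hypothesis by how closely its
# first-step gradient matches the real training's first step.
def first_gradient(xs, ys, w=0.0, b=0.0):
    # Gradient of mean squared error 0.5*(w*x + b - y)^2 at the start.
    n = len(xs)
    dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
    return (dw, db)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

xs = [0.1, 0.2, 5.0, 5.1, 9.8, 10.0]      # features, in Step 1's pile order
clusters = [[0, 1], [2, 3], [4, 5]]       # the piles from clustering
flavors = [0.0, 1.0, 2.0]                 # numeric label codes
true_ys = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]  # the active party's secret

real_grad = first_gradient(xs, true_ys)   # what the spy observes

# The winning hypothesis is the one whose simulated first step aligns best.
best = max(itertools.permutations(flavors),
           key=lambda perm: cosine(real_grad, first_gradient(
               xs, [perm[c] for c in range(3) for _ in clusters[c]])))
print(best)  # → (0.0, 1.0, 2.0): the correct flavor-to-pile mapping
```

The correct assignment reproduces the real gradient exactly (cosine similarity 1.0), while every wrong permutation shifts it in a measurably different direction.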

Step 4: The Win
The spy picks the winning fake detective. Now, the spy has a model that knows the ice cream flavors. They can look at any new person and say, "That person is wearing a suit, so they must like Strawberry!" The spy has stolen the secret labels without ever asking for them.

The "Binary" Shortcut (Binary-LEA)

If there are 10 ice cream flavors, checking every single combination is like trying to find a needle in a haystack the size of a city. It takes too long.

The authors invented Binary-LEA. Instead of guessing all 10 flavors at once, the spy plays a simpler game:

  • "Is this person Chocolate OR Vanilla?" (Ignore the rest).
  • "Is this person Strawberry OR Mint?"
  • They break the big, impossible puzzle into many small, easy puzzles. This makes the attack 100,000 times faster and much more practical.
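The arithmetic behind the shortcut: full enumeration scales as n!, while one-vs-one questions give only n(n-1)/2 binary puzzles, matching the paper's reported n! to n^3 reduction. A quick illustrative count:

```python
import math

# Binary-LEA sketch: compare the size of the joint search space with
# the pairwise decomposition. Counts are illustrative arithmetic.
n = 10
joint_hypotheses = math.factorial(n)   # all flavor-to-pile assignments
binary_questions = n * (n - 1) // 2    # one-vs-one "A or B?" puzzles
cubic_cost = n ** 3                    # the paper's stated overall cost

print(joint_hypotheses)                # → 3628800
print(binary_questions)                # → 45
print(joint_hypotheses / cubic_cost)   # → 3628.8, and the gap explodes as n grows
```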

Can You Stop It? (The Defenses)

The paper tested common security shields:

  1. Adding Noise (Static): Imagine the group adds random static to their conversation. The spy found that even with static, they could still hear the "rhythm" of the correct guess.
  2. Compressing Data (Summarizing): Imagine the group only sends the most important words. The spy found that even with fewer words, the pattern was still clear enough to guess the answer.
  3. The "Code Book" (Label Mapping): The authors suggested a new defense: The Answer Key holder changes the names of the flavors (e.g., "Guilty" becomes "X", "Innocent" becomes "Y") before sharing.
    • Does it work? Yes, but only if the spy has no outside help. If the spy has even a tiny bit of outside info (like knowing one person is definitely "Guilty"), they can crack the code book and win again.

The Big Takeaway

This paper shows that in Vertical Federated Learning, just having your own data isn't enough to keep your secrets safe. Even if you don't have the labels, if you can group your data smartly and watch how the model learns, you can reverse-engineer the secrets.

It's like realizing that even if you don't have the answer key, you can figure out the answers just by watching how the teacher grades the test, provided you know how the students are sitting in the room.
