Fourier Analysis on the Boolean Hypercube via Hoeffding Functional Decomposition

This paper proposes an ANOVA-based generalization of Fourier analysis on the Boolean hypercube to arbitrary probability measures. It provides an explicit decomposition basis and a sparse least-squares estimation procedure that mitigates the curse of dimensionality, enabling effective feature attribution in explainable-AI tasks involving non-uniform data distributions such as one-hot encoded features.

Baptiste Ferrere, Nicolas Bousquet, Fabrice Gamboa, Jean-Michel Loubes, Joseph Muré

Published 2026-03-03

Imagine you are trying to understand a complex recipe for a delicious cake. You have a list of ingredients (flour, sugar, eggs, vanilla, etc.), but the cake doesn't just taste like the sum of its parts. The interaction between the eggs and the sugar matters just as much as the flour itself.

In the world of Machine Learning, models are like these recipes. They take many inputs (features) and produce an output (a prediction). Fourier Analysis is a mathematical tool that tries to break this recipe down into its simplest ingredients to see which ones matter most.

However, there's a catch. Traditional Fourier Analysis assumes that every ingredient in your pantry is equally likely to be used. It assumes a "uniform" world. But in real life, ingredients are correlated. If you use eggs, you probably use sugar. If you have a "one-hot" feature (like a color that can only be Red, Blue, or Green, but never two at once), the math gets messy because the standard tools assume these things are independent.
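The one-hot problem can be seen in two lines of arithmetic. A minimal sketch, with made-up category probabilities: because two one-hot indicators can never both be 1 at the same time, their covariance is always negative, so they can never be treated as independent.

```python
# Hypothetical probabilities for a one-hot "color" feature (illustration only).
p = {"red": 0.5, "blue": 0.3, "green": 0.2}

# Two indicators can never both equal 1, so E[x_red * x_blue] = 0, and
# Cov(x_red, x_blue) = E[x_red * x_blue] - E[x_red] * E[x_blue] = -p_red * p_blue.
cov = 0 - p["red"] * p["blue"]
print(cov)  # negative: one-hot dummies are always negatively correlated
```

This hard linear constraint (exactly one indicator is 1) is precisely the kind of dependence the uniform-measure assumption cannot handle.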

This paper introduces a new, smarter way to break down these recipes, even when the ingredients are tightly linked. Here is the breakdown using simple analogies:

1. The Old Way: The "Perfectly Balanced" Scale

Traditional Fourier analysis is like weighing ingredients on a scale that assumes every single combination of ingredients is equally probable.

  • The Problem: In the real world, data isn't balanced. If you are analyzing customer data, people who buy diapers are much more likely to buy beer than people who don't. The "uniform" scale gives you a distorted view because it doesn't know about these real-world connections.
  • The Result: Your explanation of why the model made a decision is slightly off because it's ignoring the hidden relationships between variables.
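For readers who want the math behind the "perfectly balanced scale": classical Boolean Fourier analysis writes any function on {-1, 1}^n as a sum of parity terms, one per subset of features, with each coefficient computed as an average under the uniform measure. A minimal brute-force sketch (function names are mine, not the paper's):

```python
from itertools import chain, combinations, product
from math import prod

def fourier_coefficient(f, n, S):
    """hat_f(S) = E[f(x) * prod_{i in S} x_i], averaged over the UNIFORM
    measure on the hypercube {-1, 1}^n (every corner equally likely)."""
    pts = list(product([-1, 1], repeat=n))
    return sum(f(x) * prod(x[i] for i in S) for x in pts) / len(pts)

# A pure two-way interaction: f depends on features 0 and 1 only jointly.
f = lambda x: x[0] * x[1]
subsets = chain.from_iterable(combinations(range(2), k) for k in range(3))
coeffs = {S: fourier_coefficient(f, 2, S) for S in subsets}
print(coeffs)  # all weight sits on the pair (0, 1); singletons get 0
```

The uniform average in the expectation is exactly the "balanced scale" assumption the paper replaces.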

2. The New Way: The "Custom-Tailored" Scale

The authors propose a method called Hoeffding Functional Decomposition (HFD). Think of this as a scale that you can "calibrate" to the specific shape of your data.

  • The Innovation: They created a new set of "mathematical ingredients" (a basis) that automatically adjusts to how your data is distributed.
  • The Metaphor: Imagine you are trying to describe a crowd of people.
    • Old Method: You assume everyone is standing in a perfect grid, equally spaced.
    • New Method: You realize the crowd is clumped together in groups (friends talking, families standing close). Your new method creates a map that accounts for these clumps. It tells you, "Ah, this group of people is influencing the outcome together," rather than treating them as isolated individuals.
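The "calibration" idea can be illustrated in the simplest possible setting. Below is a minimal sketch assuming a single Bernoulli(p) feature: the basis function is recentred and rescaled so it averages to zero and has unit variance under the actual, non-uniform distribution rather than under a 50/50 coin flip. The paper's HFD construction generalizes this to many, possibly dependent, features; the helper name is mine.

```python
import math

def centered_basis(p):
    """For a Bernoulli(p) variable x in {0, 1}, return phi with
    E[phi(x)] = 0 and E[phi(x)^2] = 1 under that non-uniform measure."""
    sd = math.sqrt(p * (1 - p))
    return lambda x: (x - p) / sd

# Check the calibration for a skewed feature, p = 0.8.
p = 0.8
phi = centered_basis(p)
mean = p * phi(1) + (1 - p) * phi(0)           # E[phi]   -> 0
var = p * phi(1) ** 2 + (1 - p) * phi(0) ** 2  # E[phi^2] -> 1
print(round(mean, 10), round(var, 10))
```

Products of such calibrated one-dimensional functions yield a basis that is orthonormal with respect to the data's own distribution, which is the property the uniform parity basis loses on non-uniform data.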

3. Solving the "Too Many Ingredients" Problem

The biggest headache in analyzing these models is the Curse of Dimensionality. If you have 20 ingredients, there are over a million (2^20 = 1,048,576) possible combinations of ingredients you could mix together. Checking them all is impossible.

  • The Solution: The authors realized that in most real-world scenarios, you don't need to check every combination. Usually, the main ingredients (main effects) and the most obvious pairs (like flour + sugar) do 95% of the work.
  • The Trick: They use a technique called Regularization (specifically Elastic Net). Think of this as a "smart filter" that automatically turns off the noise. It says, "We only care about the main ingredients and the top 2-3 pairings." This turns an impossible math problem into a simple, fast calculation that a laptop can solve in seconds.
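Two quick numbers make the point, plus the core mechanism of the "smart filter". With d = 20 binary features, the full decomposition has 2^20 coefficients, while truncating to main effects and pairs leaves only a few hundred; the L1 part of Elastic Net then zeroes out weak coefficients via soft-thresholding. A sketch with made-up coefficient values (not from the paper):

```python
from math import comb

d = 20
full = 2 ** d                    # one coefficient per subset of features
truncated = 1 + d + comb(d, 2)   # constant + main effects + pairwise terms
print(full, truncated)           # 1048576 vs 211: now tractable

def soft_threshold(z, t):
    """The L1 shrinkage step of Elastic Net: coefficients smaller than t
    in magnitude become exactly 0; the rest shrink toward 0 by t."""
    return max(abs(z) - t, 0.0) * (1 if z > 0 else -1)

coeffs = {"flour": 2.3, "sugar": 1.9, "flour*sugar": 0.8, "vanilla*salt": 0.02}
kept = {k: soft_threshold(v, 0.1) for k, v in coeffs.items()}
print(kept)  # the tiny "vanilla*salt" interaction is filtered to exactly 0.0
```

Sparsity is what turns the exhaustive search into a linear regression a laptop can solve quickly.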

4. Why This Matters for AI (Explainable AI)

We often use tools like SHAP to explain why an AI made a decision (e.g., "The loan was denied because of your income").

  • The Issue: SHAP works great when data is independent, but it can get confused when data is correlated (like the "one-hot" encoding problem mentioned in the paper).
  • The Breakthrough: The authors show that their new method produces results almost identical to SHAP's in simple cases, but remains accurate even when the data is messy and correlated.
  • The Benefit: It gives us a more honest explanation. If a model relies on a specific combination of correlated features, this method spots it correctly, whereas older methods might spread the credit (or blame) unevenly.
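To make the comparison concrete, here is a toy exact Shapley computation (my own brute-force sketch, not the SHAP library or the paper's code). For a pure interaction f(x) = x0 * x1 with independent zero-mean features, the interaction's credit is split evenly between the two features; this is the well-behaved case, and the paper's contribution is keeping attributions honest when the features are correlated.

```python
from itertools import combinations
from math import factorial

def shapley(v, n):
    """Exact Shapley values for a value function v over subsets of {0..n-1}."""
    phi = [0.0] * n
    for i in range(n):
        for k in range(n):
            for S in combinations([j for j in range(n) if j != i], k):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

# Value function for f(x) = x0 * x1 evaluated at x = (1, 1), with each
# unfixed feature replaced by its mean 0 (independent-features setting).
def v(S):
    x0 = 1 if 0 in S else 0.0
    x1 = 1 if 1 in S else 0.0
    return x0 * x1

print(shapley(v, 2))  # credit for the interaction is split evenly: [0.5, 0.5]
```

When the features are correlated, the "replace unfixed features by their mean" step above is no longer valid, which is exactly where the HFD-based attribution diverges from SHAP.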

Summary in a Nutshell

  • The Problem: Standard math tools for explaining AI assume data is random and independent, which is rarely true in real life.
  • The Fix: The authors built a new mathematical "lens" (based on Hoeffding Decomposition) that adapts to the actual shape of your data.
  • The Magic: They turned a super-complex, slow calculation into a fast, sparse linear regression by assuming that only the most important interactions matter.
  • The Result: We can now explain complex AI models more accurately, even when the data has tricky, real-world dependencies (like one-hot encoded categories or biological correlations).

In short, they took a rigid, one-size-fits-all ruler and replaced it with a flexible, stretchy tape measure that fits the data perfectly, allowing us to finally understand exactly how our AI models are thinking.
