Extended Empirical Validation of the Explainability Solution Space

This technical report extends the empirical validation of the Explainability Solution Space (ESS) framework, demonstrating its domain-independent applicability and its systematic adaptability to diverse governance roles and stakeholder configurations. The evaluation spans two domains: an employee attrition system and an urban resource allocation system.

Antoni Mestre, Manoli Albert, Miriam Gil, Vicente Pelechano

Published 2026-03-10

Here is an explanation of the technical report, translated into everyday language using analogies to make the concepts clear.

🏦 The Big Picture: The "Black Box" Bank Problem

Imagine a massive, high-speed bank that processes millions of credit card transactions every day. To stop fraudsters, the bank uses a super-smart computer brain (an AI) that decides in a split second: "Keep this card safe" or "Block this card."

The problem is that this computer brain is a "Black Box." It makes the right decision 97% of the time, but no one knows why. If it blocks a card, the customer gets angry, the bank gets sued, and regulators ask, "How can you prove you didn't discriminate?"

This report is about building a "Glass Box" around that computer brain. It tests a new method called the ESS (Explainability Solution Space) to figure out the best way to explain the AI's decisions to three different groups of people, all while keeping the system fast enough to work in real-time.


🎯 The Three Audiences (The Stakeholders)

The report recognizes that one explanation doesn't fit all. It's like trying to explain a car crash to three different people:

  1. The Regulators (The Auditors): They need a forensic lab report. They don't care if it's pretty; they need a tamper-proof, mathematical proof that the decision was fair and followed the law.
    • Analogy: They want the "black box" to be a safe deposit box with a clear audit trail.
  2. The Customer Service Agents (The Users): They need a simple story to tell the angry customer. They can't say "The Shapley value of feature X was 0.4." They need to say, "We blocked it because you spent $500 in a country you've never visited."
    • Analogy: They want a plain-English translation of the decision.
  3. The Data Scientists (The Developers): They need debugging tools. If the AI starts making weird mistakes, they need to see the code and the data to fix it.
    • Analogy: They want the engineer's blueprint to see where the gears are grinding.

🧪 The Experiment: Testing Five "Flashlight" Tools

The authors tested five different "flashlights" (AI explanation tools) to see which one shines the brightest for each group. Think of these as different ways to shine a light into the dark Black Box:

  1. SHAP (The Precise Measurer): Like a laser scanner. It breaks down exactly how much each factor (price, location, time) contributed to the decision. It's mathematically perfect but a bit technical.
  2. LIME (The Local Approximator): Like a sketch artist. It draws a rough, simple picture of what the AI is thinking right now for this specific transaction.
  3. Counterfactuals (The "What If" Machine): Like a video game "Undo" button. It tells you: "Your card was blocked. But if you had spent $10 less, it would have worked." This is super helpful for customers.
  4. Rule Extraction (The Rulebook): Like a flowchart. It turns the complex AI into a simple list of "If this, then that" rules. Great for auditors, but hard to make in real-time.
  5. Prototypes (The Lookalike Finder): Like a mugshot book. It says, "We blocked you because this transaction looks exactly like 50 other known frauds we saw last week."

⚡ The Twist: The 200-Millisecond Speed Limit

Here is the catch: The bank processes 4.2 million transactions a day. The AI has 200 milliseconds (0.2 seconds) to decide and explain the decision. If it takes too long, the customer's card gets stuck at the checkout line.
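A quick back-of-the-envelope check shows why the budget bites (the numbers are the report's scenario figures; the 10x remark is simple arithmetic, not a measured result):

```python
# 4.2 million transactions/day with a 200 ms end-to-end budget each.
tx_per_day = 4_200_000
avg_rate = tx_per_day / 86_400   # average transactions per second
budget_ms = 200

print(f"average load: {avg_rate:.0f} tx/s")
print(f"per-transaction budget: {budget_ms} ms")
# At ~49 tx/s average, a worker that spends the full 200 ms on one
# transaction serves only 5 tx/s, so roughly 10 parallel workers are
# needed just to match the average load, before any traffic peaks.
```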

  • The Problem: The "Rulebook" (Rule Extraction) is great for auditors but takes too long to generate. The "What If" (Counterfactuals) is great for customers but is computationally heavy.
  • The Solution: You can't use just one tool. You need a Hybrid Strategy.
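The hybrid idea can be sketched as a tiny dispatcher that picks a flashlight based on who is asking and how long they can wait. All names below (`Context`, `EXPLAINER_FOR`, `pick_explainer`) are hypothetical illustrations, not the report's actual API:

```python
from enum import Enum

class Context(Enum):
    REALTIME = "realtime"   # every transaction, hard latency budget
    DISPUTE = "dispute"     # customer calls to contest a block
    AUDIT = "audit"         # periodic offline compliance review

# Hypothetical registry: each context maps to an explainer and the
# latency it can tolerate (ms; None means an offline batch job).
EXPLAINER_FOR = {
    Context.REALTIME: ("shap", 50),
    Context.DISPUTE: ("counterfactual", 100),
    Context.AUDIT: ("rule_extraction", None),
}

def pick_explainer(context: Context) -> str:
    """Return the explainer name for a given stakeholder context."""
    name, budget_ms = EXPLAINER_FOR[context]
    return name

print(pick_explainer(Context.REALTIME))   # shap
print(pick_explainer(Context.DISPUTE))    # counterfactual
```

The point of the sketch is the shape of the solution: the routing decision, not the explainer, is what makes the strategy "hybrid".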

🏆 The Winning Strategy: The "Three-Tier" System

The report concludes that the best way to run this bank is to use a tiered approach, like a hospital triage system:

Tier 1: The "Always-On" Guard (SHAP)

  • Who it's for: The Regulators and the Developers.
  • What it does: For every single transaction, the system runs the SHAP tool. It's fast (under 50ms) and gives a mathematically perfect log.
  • Why: It satisfies the law and helps engineers debug the system without slowing anything down.
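SHAP's core idea, fairly splitting the score among the features, can be computed exactly for a toy model small enough to enumerate. The two-feature scoring function below is invented for illustration; real systems use fast approximations rather than this brute-force formula:

```python
from itertools import combinations
import math

# Toy model: v(S) is the fraud score when only the features in S
# are "present" (the rest sit at a baseline value).
def v(S):
    base = 0.1
    effects = {"amount": 0.4, "location": 0.3}
    interaction = 0.1 if {"amount", "location"} <= S else 0.0
    return base + sum(effects[f] for f in S) + interaction

features = ["amount", "location"]

def shapley(f):
    """Exact Shapley value of feature f: its marginal contribution,
    averaged over all possible orders of adding the features."""
    others = [g for g in features if g != f]
    n = len(features)
    total = 0.0
    for k in range(len(others) + 1):
        for S in combinations(others, k):
            S = set(S)
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            total += weight * (v(S | {f}) - v(S))
    return total

phi = {f: shapley(f) for f in features}
print(phi)
# Efficiency property: baseline + contributions == full score.
print(v(set(features)), v(set()) + sum(phi.values()))
```

That last line is the "mathematically perfect log" in the analogy: the per-feature contributions always add up exactly to the model's actual output, which is what makes the record auditable.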

Tier 2: The "Emergency" Response (Counterfactuals)

  • Who it's for: The Customer Service Agents and Angry Customers.
  • What it does: Only when a card is blocked and the customer calls to complain does the system run the "What If" tool.
  • Why: It takes a bit longer (100ms), but it gives the agent a perfect, simple sentence to tell the customer: "We blocked you because the amount was too high for your usual spending pattern." This solves the customer's problem.
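The "What If" tool can be sketched as a tiny search: walk the disputed amount downward until the (made-up, stand-in) scoring function would have let the transaction through. The threshold model here is illustrative, not the report's system:

```python
THRESHOLD = 0.5

def fraud_score(amount, usual_spend):
    # Toy rule: the score rises as the amount outgrows the
    # customer's usual spending pattern.
    ratio = amount / max(usual_spend, 1.0)
    return min(1.0, 0.2 + 0.1 * ratio)

def counterfactual_amount(amount, usual_spend, step=5.0):
    """Walk the amount down in $5 steps until no longer blocked."""
    a = amount
    while a > 0 and fraud_score(a, usual_spend) >= THRESHOLD:
        a -= step
    return a

amount, usual = 500.0, 100.0
print("blocked:", fraud_score(amount, usual) >= THRESHOLD)   # True
ok_amount = counterfactual_amount(amount, usual)
print(f"If you had spent ${ok_amount:.0f} or less, it would have gone through.")
```

The output sentence is precisely the kind of plain-English line the agent can read to the customer, which is why counterfactuals shine in this tier despite their extra cost.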

Tier 3: The "Weekly" Audit (Rule Extraction)

  • Who it's for: The Regulators (for big-picture checks).
  • What it does: Once a week, when the bank is quiet, the system runs the Rulebook tool offline.
  • Why: It's too slow for real-time, but it creates a giant, easy-to-read manual that proves the AI isn't biased.
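Offline rule extraction can be sketched as walking a surrogate decision tree and printing every root-to-leaf path as an "if ... then ..." rule an auditor can read. The tiny tree below is hand-built for illustration; in practice the surrogate would be fitted to mimic the black-box model:

```python
# Hand-built surrogate tree (illustrative feature names and thresholds).
tree = {
    "feature": "amount", "threshold": 300,
    "left": {"leaf": "approve"},
    "right": {
        "feature": "foreign_country", "threshold": 0.5,
        "left": {"leaf": "approve"},
        "right": {"leaf": "block"},
    },
}

def extract_rules(node, conditions=()):
    """Collect every root-to-leaf path as a human-readable rule."""
    if "leaf" in node:
        cond = " AND ".join(conditions) or "always"
        return [f"IF {cond} THEN {node['leaf']}"]
    f, t = node["feature"], node["threshold"]
    return (extract_rules(node["left"], conditions + (f"{f} <= {t}",))
            + extract_rules(node["right"], conditions + (f"{f} > {t}",)))

for rule in extract_rules(tree):
    print(rule)
```

The resulting rulebook is exactly the "giant, easy-to-read manual": slow to build, but every decision path is spelled out for the auditors.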

💡 The Big Takeaway

The report shows that there is no "one size fits all" explanation.

If you try to use one tool for everyone, you either break the law (too slow), confuse the customer (too technical), or annoy the engineers (not detailed enough).

The ESS (Explainability Solution Space) is like a smart menu that helps banks choose the right tool for the right job. By mixing SHAP (for speed and law), Counterfactuals (for human empathy), and Rule Extraction (for big-picture safety), the bank can be fast, fair, and compliant all at once.

In short: Don't just explain the AI; explain it differently to the judge, the customer, and the mechanic. That's the secret to trustworthy AI.