Dictionary Based Pattern Entropy for Causal Direction Discovery

This paper introduces Dictionary Based Pattern Entropy (DPE), a novel framework that combines Algorithmic and Shannon Information Theories to infer causal directions and identify driving subpatterns in symbolic sequences. DPE quantifies how compact, rule-based patterns in a cause systematically reduce uncertainty in an effect, and demonstrates robust performance across diverse synthetic and real-world datasets.

Harikrishnan N B, Shubham Bhilare, Aditi Kathpalia, Nithin Nagaraj

Published 2026-03-06

Imagine you are a detective trying to solve a mystery: Who is influencing whom?

You have two friends, let's call them Alex and Jamie. You watch them for a day. Every time Alex sneezes, Jamie jumps. Every time Jamie claps, Alex smiles. But who is causing the reaction? Is Alex sneezing because Jamie clapped? Or is Jamie jumping because Alex sneezed?

In the world of data science, this is called Causal Discovery. Usually, computers need to know the "rules of the game" (like physics equations) or have massive amounts of data to figure this out. But what if the data is just a string of symbols (like 0s and 1s) and you don't know the rules? That's where this paper comes in.

The authors propose a new method called Dictionary Based Pattern Entropy (DPE). Here is how it works, explained simply:

1. The Core Idea: "The Rulebook"

Imagine Alex and Jamie are playing a secret code game.

  • The Old Way: Most methods try to guess the whole mathematical formula connecting them. It's like trying to solve a complex equation without knowing the variables.
  • The DPE Way: This method says, "Let's just look for repeating patterns."

It assumes that if Alex is truly the boss (the cause), there will be specific, compact "chunks" of Alex's behavior that always trigger a specific reaction in Jamie. These chunks are like secret handshakes.

2. Step-by-Step: How the Detective Works

Step A: Build the Dictionary (The "Cheat Sheet")

The computer watches the two streams of data (Alex and Jamie).

  • It looks at every time Jamie changes their behavior (e.g., goes from calm to jumping).

  • It looks backwards at what Alex was doing just before that change.

  • It writes down those specific "Alex patterns" in a Dictionary.

  • Analogy: Imagine you are a chef. Every time the oven beeps (the change), you look at what you did 5 minutes ago. You write down: "When I put the dough in, the oven beeps." You build a dictionary of "Dough In" → "Beep."
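The dictionary-building step can be sketched in a few lines of Python. This is an illustrative reconstruction, not the paper's exact algorithm: the function name, the window length, the change detector (`y[t] != y[t-1]`), and the toy sequences are all assumptions made for this sketch.

```python
from collections import Counter

def build_pattern_dictionary(x, y, window=2):
    """Step A sketch: whenever the effect stream Y changes, record the
    window of cause-stream symbols X that immediately preceded it."""
    dictionary = Counter()
    for t in range(window, len(y)):
        if y[t] != y[t - 1]:                   # Y changed its behaviour
            pattern = tuple(x[t - window:t])   # what X did just before
            dictionary[pattern] += 1
    return dictionary

# Toy streams: Y flips one step after X shows the chunk (1, 1).
x = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]
print(build_pattern_dictionary(x, y))   # Counter({(1, 1): 3, (1, 0): 2})
```

In this toy example the chunk `(1, 1)` precedes every switch-on of Y, so it dominates the dictionary; the `(1, 0)` entries come from Y switching back off afterwards.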

Step B: The "Flip" Test (The "Did it work?" Check)

Now, the computer takes every pattern in its dictionary and asks: "Does this pattern always cause a change?"

  • If the pattern "Dough In" appears 10 times, and the oven beeps 10 times, that's a perfect rule. (High certainty).
  • If the pattern appears 10 times, but the oven only beeps 3 times, that's a weak rule. (High uncertainty).

The method calculates a score called Response Determinism.

  • Score of 1.0: "Whenever I see this pattern, the change always happens." (Very strong cause).
  • Score of 0.5: "Sometimes it happens, sometimes it doesn't." (Weak cause).
  • Score of 0.0: "This pattern has nothing to do with the change."
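A minimal sketch of this score, reusing the toy chef-style streams from the dictionary step. The name `response_determinism` and the change detector are illustrative assumptions, not the paper's notation:

```python
def response_determinism(x, y, pattern, window=2):
    """Fraction of occurrences of `pattern` in the cause stream X that
    are immediately followed by a change in the effect stream Y.
    1.0 = perfect rule, 0.0 = the pattern never precedes a change."""
    occurrences = changes = 0
    for t in range(window, len(y)):
        if tuple(x[t - window:t]) == pattern:
            occurrences += 1
            changes += (y[t] != y[t - 1])
    return changes / occurrences if occurrences else 0.0

# Toy streams: Y flips one step after X shows the chunk (1, 1).
x = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0]
y = [0, 0, 0, 1, 0, 0, 1, 0, 0, 1]
print(response_determinism(x, y, (1, 1)))   # 1.0 -> strong rule
print(response_determinism(x, y, (0, 1)))   # 0.0 -> irrelevant pattern
```

The chunk `(1, 1)` is always followed by a change in Y (the "oven beeps every time" case), while `(0, 1)` never is.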

Step C: The Entropy Score (The "Confusion Meter")

This is the magic part. The method calculates Entropy, which is basically a measure of confusion or surprise.

  • Low Entropy (Low Confusion): The pattern is a reliable rule. "If A, then B." The future is predictable.
  • High Entropy (High Confusion): The pattern is random. "If A, maybe B, maybe C." The future is a mystery.

The Verdict:
The computer compares the two directions:

  1. Alex → Jamie: How much confusion is there when we try to predict Jamie based on Alex's patterns?
  2. Jamie → Alex: How much confusion is there when we try to predict Alex based on Jamie's patterns?

The Winner: The direction with the lowest confusion (lowest entropy) is the true cause. Why? Because the true cause usually has a clear, rule-based structure. The effect is often messy and noisy.
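The verdict can be sketched as an entropy comparison. The score below is an illustrative proxy for DPE, not the paper's exact formula: for each cause-pattern it asks "does a change in the effect follow?", takes the binary Shannon entropy of that answer, and averages over patterns weighted by how often each appears. The lagged-copy test signal is also an assumption made for this sketch.

```python
import random
from collections import Counter
from math import log2

def directional_entropy(cause, effect, window=2):
    """Frequency-weighted average of the binary entropy of 'does the
    effect change right after this cause-pattern?'. Lower = more
    rule-like, i.e. the more plausible causal direction."""
    counts, hits = Counter(), Counter()
    for t in range(window, len(effect)):
        p = tuple(cause[t - window:t])
        counts[p] += 1
        hits[p] += (effect[t] != effect[t - 1])

    def h(q):  # binary Shannon entropy in bits
        return 0.0 if q in (0.0, 1.0) else -q * log2(q) - (1 - q) * log2(1 - q)

    total = sum(counts.values())
    return sum(c / total * h(hits[p] / c) for p, c in counts.items())

random.seed(0)
x = [random.randint(0, 1) for _ in range(500)]   # the cause: random bits
y = [0] + x[:-1]                                 # the effect: X delayed one step

print(directional_entropy(x, y))   # 0.0 -> X's patterns fully predict Y's changes
print(directional_entropy(y, x))   # near 1.0 -> Y's patterns barely predict X's changes
```

The lower entropy in the X → Y direction correctly recovers X as the cause: X's patterns give reliable rules for Y, while the reverse direction is close to a coin flip.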

3. Why is this special?

Most other methods are like trying to guess the weather by looking at the whole sky at once. They get confused easily.
DPE is like looking for specific cloud shapes that always mean rain.

  • It doesn't need to know the physics of clouds.
  • It doesn't need millions of years of data.
  • It just finds the specific sub-patterns that drive the change.

4. Real-World Examples from the Paper

The authors tested their "Detective" on several scenarios:

  • Delayed Bit-Flips: Like a game of "Red Light, Green Light" where the reaction is slightly delayed. DPE figured it out 99% of the time.
  • Predator-Prey: In nature, predator and prey populations rise and fall in response to each other. DPE correctly identified that the predator drives the prey's movement more than the other way around.
  • Virus Evolution: They looked at SARS-CoV-2 virus sequences to see if the global virus caused local mutations or vice versa. DPE gave competitive results, helping scientists understand how the virus spreads.

The Big Takeaway

This paper introduces a tool that finds cause and effect by looking for reliable patterns rather than complex math.

  • If you see a pattern that consistently triggers a change, you've found the cause.
  • If the relationship is messy and unpredictable, it's likely just a correlation, not a cause.

It's a way to cut through the noise and find the "secret handshakes" that govern how things influence each other in our chaotic world.