A Compression Based Classification Framework Using Symbolic Dynamics of Chaotic Maps

This paper introduces ChaosComp, a novel classification framework that leverages symbolic dynamics and chaotic maps to model class-specific transition probabilities. A test sample is assigned the label of the class under which its compressed representation is shortest, reinterpreting classification through the lens of dynamical systems and information theory.

Parth Naik, Harikrishnan N B

Published 2026-03-26

Imagine you are trying to teach a computer how to tell the difference between two types of fruit, say Apples and Oranges.

Usually, machine learning works like a detective looking for clues: "If it's red and round, it's an apple. If it's orange and bumpy, it's an orange." It builds a complex rulebook based on the data it sees.

This paper proposes a completely different approach. Instead of building a rulebook, they treat the data like a secret code and ask a simple question: "Which fruit's secret language can I write down the shortest way?"

Here is how their method, called ChaosComp, works, explained through a story.

1. The Chaos Map (The Loom)

Imagine you have a magical loom (a machine for weaving). This loom is a "Chaotic Map." It takes a number, stretches it, cuts it, and rearranges it in a very specific, unpredictable way.

  • If you feed it a number representing an Apple, the loom weaves a specific pattern.
  • If you feed it a number representing an Orange, it weaves a different pattern.

The magic is that this loom is chaotic. Tiny changes in the starting number lead to wildly different patterns. But, if you know the exact rules of the loom, you can reverse the process.

2. Turning Data into a String of Beads (Symbolic Dynamics)

The researchers take their data (like the size, weight, and color of a fruit) and turn it into a simple string of beads: 0s and 1s.

  • If a feature is "small," it's a 0.
  • If it's "big," it's a 1.

So, an Apple might look like 0-1-0-1-1, and an Orange might look like 1-0-1-0-0.
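The thresholding step above can be sketched in a few lines of Python. This is a simplified illustration, not the paper's exact symbolization: the feature values and thresholds below are made up, and `to_symbols` is a hypothetical helper name.

```python
def to_symbols(features, thresholds):
    """Map each feature to 0 ('small') or 1 ('big') by comparing it to a threshold."""
    return [1 if f > t else 0 for f, t in zip(features, thresholds)]

# Hypothetical fruit measurements: [size, weight, colour score, firmness, sugar]
apple_features = [2.1, 150.0, 0.3, 0.9, 0.4]
thresholds = [3.0, 120.0, 0.5, 0.5, 0.5]

print(to_symbols(apple_features, thresholds))  # → [0, 1, 0, 1, 0]
```

In practice the thresholds would come from the data (e.g. the median of each feature over the training set), but any fixed partition of the feature range turns raw numbers into a symbol string.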

3. Learning the "Language" of Each Class (Training)

During the training phase, the computer looks at all the Apples it has ever seen. It counts how often the bead patterns appear.

  • "Oh, Apples usually have 0-1 together often, but 1-1 is rare."
  • "Oranges love 1-0 but hate 0-0."

Based on these counts, the computer builds a custom loom for Apples and a different custom loom for Oranges. Each loom is tuned specifically to the "rhythm" of that fruit.
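The counting step can be sketched as follows. This is a deliberately simplified version: it estimates only per-symbol frequencies for each class, whereas the paper uses these kinds of counts to set the transition probabilities of a class-specific chaotic map. The training strings below are invented for illustration.

```python
from collections import Counter

def symbol_probs(strings):
    """Estimate how often each symbol (0 or 1) appears across a class's training strings."""
    counts = Counter(s for string in strings for s in string)
    total = sum(counts.values())
    return {sym: counts[sym] / total for sym in (0, 1)}

# Hypothetical training data: three Apple strings
apples = [[0, 1, 0, 1, 1], [0, 1, 0, 0, 1], [0, 1, 1, 1, 1]]
p_apple = symbol_probs(apples)  # e.g. {0: 0.4, 1: 0.6}
```

Each class gets its own probability table, which is what "tuning a custom loom" amounts to: the skew of the chaotic map is chosen to match the class's symbol statistics.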

4. The Compression Test (The "Zip File" Trick)

Now, you bring in a mystery fruit. You turn it into a bead string.

  • Step A: You try to feed this string into the Apple Loom. You run the loom backwards (like rewinding a tape). The loom tries to find the original starting number that would create this exact string.
    • If the string fits the Apple pattern perfectly, the loom finds a very specific, tiny starting number. This means the "Apple Loom" understands this data very well.
    • If the string is weird for an Apple, the loom gets confused, and the starting number becomes a huge, vague range.
  • Step B: You do the same with the Orange Loom.

The Golden Rule: The paper uses a principle from information theory called Minimum Description Length.

  • If the "Apple Loom" can explain the mystery fruit using a very tiny, precise starting number, it means the data is highly compressible for Apples. It's like saying, "I can describe this fruit in just 3 words because it fits my Apple dictionary perfectly."
  • If the "Orange Loom" needs a huge, vague range to explain it, it's a bad fit. It's like trying to describe an Apple using an Orange dictionary; you need 100 words to explain why it doesn't fit.

The Winner: The fruit that results in the shortest description (the smallest "file size") wins. If the mystery fruit can be compressed into a tiny file using the Apple model, the computer says, "It's an Apple!"
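The whole decision rule above can be sketched with a per-symbol approximation of the idea. Running the loom backwards shrinks the candidate interval by the probability of each symbol, so the "file size" in bits is roughly the negative log-probability of the string under the class's model; the class with the smallest total wins. The probability tables below are assumed for illustration, and `description_length`/`classify` are hypothetical helper names (the paper's actual scheme back-iterates a skewed chaotic map rather than summing per-symbol costs).

```python
import math

def description_length(symbols, probs):
    """Bits needed to pin down a starting point on the class's map:
    each symbol shrinks the interval by its probability, so the
    cost is -sum(log2 p(symbol)). Assumes all probs are nonzero."""
    return -sum(math.log2(probs[s]) for s in symbols)

def classify(symbols, class_probs):
    """Pick the class whose 'loom' compresses the string into the fewest bits."""
    return min(class_probs, key=lambda c: description_length(symbols, class_probs[c]))

class_probs = {
    "apple":  {0: 0.4, 1: 0.6},   # assumed class statistics, for illustration
    "orange": {0: 0.7, 1: 0.3},
}
mystery = [0, 1, 0, 1, 1]
print(classify(mystery, class_probs))  # → "apple"
```

A real implementation would smooth zero counts (a symbol with probability 0 gives an infinite description length), but the core MDL logic is just this comparison of total bit costs.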

Why is this cool?

  • It's like a Zip file: Think of the computer not as a classifier, but as a file compressor. It asks, "Which class's dictionary allows me to zip this file down to the smallest size?"
  • It handles chaos: Real-world data is messy and chaotic. This method embraces that chaos instead of fighting it. It uses the mathematical properties of "chaotic maps" (which are known to be incredibly efficient at encoding information) to do the work.
  • No complex rules: It doesn't need to draw complex lines on a graph to separate classes. It just checks which "language" the data speaks most fluently.

The Results

The researchers tested this on real-world problems, like detecting breast cancer from medical scans or identifying different types of seeds.

  • On the Breast Cancer dataset, their method was incredibly accurate (95%+), beating many traditional methods.
  • It even solved tricky logic puzzles (like the XOR problem) that defeat linear classifiers, proving it can handle non-linear, messy data.

The Bottom Line

This paper suggests that learning is just compression. If a machine truly "understands" a category of data, it should be able to describe it using the fewest possible bits. By using chaotic maps as a tool to measure this "compressibility," they built a new kind of classifier that is simple, elegant, and surprisingly powerful.
