COMI: Coarse-to-fine Context Compression via Marginal Information Gain

The paper introduces COMI, a coarse-to-fine adaptive context compression framework that utilizes a novel Marginal Information Gain metric to jointly optimize semantic relevance and diversity, significantly outperforming existing baselines in long-context tasks under high compression rates.

Jiwei Tang, Shilei Liu, Zhicheng Zhang, Yujin Yuan, Libin Zheng, Wenbo Su, Bo Zheng

Published 2026-03-09

Here is an explanation of the COMI paper, translated into simple, everyday language using analogies.

The Big Problem: The "Too Much Information" Traffic Jam

Imagine you are a Detective (the AI) trying to solve a mystery (answer a question). You have a massive evidence board with 10,000 sticky notes (the long text context).

  • The Issue: Most of those sticky notes are useless. Some are just repeats of the same fact written in different ways. Some are completely irrelevant.
  • The Bottleneck: Your brain (the computer) can only look at a few notes at a time. If you try to read all 10,000 notes, you get overwhelmed, slow down, or miss the crucial clue because it's buried under the noise.
  • The Current Solution: Previous methods tried to shrink the evidence board by just picking the notes that seemed most related to the mystery. But they made a mistake: they picked 50 notes that all said the exact same thing. You still have 50 notes of the same info, which is a waste of space.

The New Solution: COMI (The Smart Editor)

The authors propose COMI, a new way to shrink that evidence board. They call it "Coarse-to-Fine Context Compression via Marginal Information Gain."

That's a mouthful, so let's break it down with a Library Analogy.

1. The Core Concept: "Marginal Information Gain" (MIG)

Imagine you are curating a "Best Of" playlist for a friend.

  • Relevance: You pick songs the friend likes.
  • Redundancy: But if you pick 50 different versions of the same song, that's annoying. It adds no new value.

MIG is a score that asks two questions for every piece of information:

  1. "How much does this help answer the question?" (Relevance)
  2. "How much is this just repeating what I already have?" (Redundancy)

The Rule: If a piece of info is super relevant but totally unique, give it a high score. If it's relevant but just a copy of something you already picked, give it a low score.
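The rule above is a marginal-gain trade-off, and it can be sketched in a few lines. This is an illustrative toy (in the spirit of maximal-marginal-relevance selection), not the paper's exact MIG formula; the function name, the `lam` weight, and the unit-vector embeddings are all assumptions for the demo.

```python
import numpy as np

def marginal_information_gain(candidate, query, selected, lam=0.5):
    """Toy marginal-gain score: relevance to the query minus the worst
    redundancy against anything already selected. Illustrative only --
    the paper's actual MIG metric may be defined differently."""
    relevance = float(candidate @ query)  # cosine similarity (unit vectors)
    redundancy = max((float(candidate @ s) for s in selected), default=0.0)
    return lam * relevance - (1.0 - lam) * redundancy

# q is the question; a and b are candidate "sticky notes".
q = np.array([1.0, 0.0])
a = np.array([1.0, 0.0])   # relevant and (so far) unique
selected = [a]
b = np.array([1.0, 0.0])   # just as relevant, but a copy of a

# The unique note scores high; the duplicate scores low.
assert marginal_information_gain(a, q, []) > marginal_information_gain(b, q, selected)
```

A duplicate's relevance is cancelled by its redundancy, which is exactly the "low score for copies" rule.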

2. The Two-Step Process

COMI doesn't just delete things randomly; it uses a Coarse-to-Fine strategy (Big picture first, then details).

Step A: The Coarse-Grained "Group Reallocation" (The Big Picture)
Imagine the 10,000 sticky notes are divided into 100 piles (groups).

  • Old Way: You give every pile the same amount of space on the final board (e.g., 10 notes per pile).
  • COMI Way: You look at each pile.
    • Pile #1 has the smoking gun evidence. It gets more space (maybe 30 notes).
    • Pile #50 is just about the weather. It gets less space (maybe 2 notes).
    • Pile #20 has 10 notes that all say the same thing. It gets less space because the info is redundant.
  • Result: You allocate your limited "board space" to the piles that actually matter.
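The reallocation step can be sketched as score-proportional budgeting. This is a hypothetical helper, not the paper's exact allocation rule: `group_scores` stands in for each pile's relevance-adjusted-for-redundancy score, and the `floor` keeps every pile from vanishing entirely.

```python
def reallocate_budget(group_scores, total_budget, floor=1):
    """Split a fixed space budget across groups in proportion to their
    scores, with a small per-group floor. Hypothetical sketch, not the
    paper's exact allocation rule."""
    total = sum(group_scores)
    raw = [max(floor, round(total_budget * s / total)) for s in group_scores]
    # Trim any rounding overshoot from the largest allocations first.
    while sum(raw) > total_budget:
        raw[raw.index(max(raw))] -= 1
    return raw

# Three piles: smoking-gun evidence, weather chatter, mostly-redundant notes.
budget = reallocate_budget([0.8, 0.05, 0.15], total_budget=40)
# → [32, 2, 6]: the important pile gets most of the board space.
```

The uniform "old way" would have given each pile 13-14 notes; proportional budgeting concentrates space where the evidence actually is.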

Step B: The Fine-Grained "Token Merging" (The Details)
Now, inside the important piles, you have to shrink the notes themselves.

  • Old Way: You just take the average of the notes. If you have 5 notes saying "The butler did it," the average is still just "The butler did it," but you wasted space on 5 notes.
  • COMI Way: You look at the notes inside the pile.
    • Note A says "The butler did it with a candlestick."
    • Note B says "The butler did it."
    • Note C says "The butler did it with a candlestick" (again).
    • COMI realizes Note A is the most unique and informative. It merges the group into a single, super-dense note that keeps the "candlestick" detail but drops the repetitive fluff.
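The merge in Step B can be sketched as a weighted combination in which more informative tokens dominate. Again a sketch under stated assumptions: the notes are toy 2-d vectors, and the `weights` stand in for per-token informativeness scores the method would compute; this is not the paper's exact merging operator.

```python
import numpy as np

def merge_tokens(vectors, weights):
    """Collapse a group of token vectors into one dense vector, weighting
    each token by an informativeness score so unique details dominate.
    A sketch of weighted merging, not the paper's exact operator."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w[:, None] * np.asarray(vectors)).sum(axis=0)

# Dimension 0 = "butler", dimension 1 = the unique "candlestick" detail.
note_a = np.array([1.0, 1.0])   # "butler did it with a candlestick"
note_b = np.array([1.0, 0.0])   # "butler did it"
note_c = np.array([1.0, 1.0])   # duplicate of note A

merged = merge_tokens([note_a, note_b, note_c], weights=[0.6, 0.2, 0.2])
assert merged[1] > 0.5          # the "candlestick" detail survives the merge
```

A plain average would dilute the unique dimension equally with the repeats; the weighting is what preserves the candlestick.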

Why is this better?

Think of it like packing for a trip.

  • Old Methods: You pack 10 identical t-shirts because they are all "good for summer." You run out of suitcase space and can't fit your shoes.
  • COMI: You pack 1 t-shirt (because 10 is redundant), 1 pair of shoes, and a swimsuit. You fit everything you actually need into a tiny bag.

The Results

The paper tested this on question answering over very long documents (think "Who killed the victim in this 500-page novel?") and on summarization tasks.

  • The Win: Even when they forced the AI to shrink the text by a factor of 32 (keeping only 1/32nd of the original words), COMI held up much better than other methods.
  • The Score: On a test called "NaturalQuestions," COMI improved the accuracy by 25 points compared to the next best method. That's a massive jump.

Summary in One Sentence

COMI is a smart AI editor that doesn't just cut out the boring parts of a long story; it also deletes the parts that are just repeats of the good parts, ensuring the final summary is short, unique, and packed with the most important clues.