Draft-Conditioned Constrained Decoding for Structured Generation in LLMs

The paper proposes Draft-Conditioned Constrained Decoding (DCCD), a training-free, two-step inference method for structured generation in large language models. It decouples semantic planning from structural enforcement, which mitigates the distortions caused by hard constraints and improves both accuracy and parameter efficiency.

Avinash Reddy, Thayne T. Walker, James S. Ide, Amrit Singh Bedi

Published 2026-03-05

This article explains the paper "Draft-Conditioned Constrained Decoding (DCCD)" in simple language, using creative analogies.

The Big Problem: The "Strict Architect" vs. The "Creative Builder"

Imagine you have a brilliant Creative Builder (the AI model). This builder is amazing at solving complex math problems, writing stories, or figuring out logic puzzles. However, they have a terrible habit: they love to ramble, use the wrong punctuation, or forget to put a closing bracket at the end of a sentence.

Now, imagine you need this builder to construct a very specific type of house: a JSON House. This house has strict rules. Every room must be a specific shape, every door must be labeled exactly right, and if you miss even one comma, the whole house collapses and becomes unusable.
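To make the "one missing comma collapses the house" point concrete, here is a minimal sketch using Python's standard json module. The helper name is_valid_json and the two sample strings are illustrative assumptions, not taken from the paper.

```python
import json


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Two hypothetical outputs that differ by a single comma.
good = '{"answer": "18", "unit": "dollars"}'
bad = '{"answer": "18" "unit": "dollars"}'  # the comma is missing

print(is_valid_json(good))
print(is_valid_json(bad))
```

A parser has no notion of "almost right": the second string is rejected outright, which is why downstream programs need the format to be exact.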

To fix the builder's bad habits, you hire a Strict Architect (Standard Constrained Decoding). The Architect stands next to the builder and says, "No, you can't write 'The answer is 14.' You must write {"answer": "14"}. If you try to write anything else, I will block your hand."

The Catch:
Because the Architect is so strict, they constantly interrupt the builder's flow. The builder gets confused, starts guessing, and often ends up building a house that looks perfect on the outside (valid JSON) but has the wrong rooms inside (the wrong math answer). The builder is so busy trying not to break the rules that they forget what they are actually trying to build.
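What "blocking the builder's hand" can look like at a single step is sketched below. This is a toy illustration of token masking, a common way to implement constrained decoding; the vocabulary, the probabilities, and the function name constrained_step are all made up for the example and are not the paper's implementation.

```python
def constrained_step(probs: dict, allowed: set) -> dict:
    """Drop every token the rules forbid, then renormalize what is left."""
    kept = {tok: p for tok, p in probs.items() if tok in allowed}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no allowed token has any probability")
    return {tok: p / total for tok, p in kept.items()}


# Made-up next-token probabilities: the model wants to start with prose.
model_probs = {"The": 0.70, "So": 0.20, "{": 0.06, "[": 0.04}

# Made-up rule: a JSON value must start with an opening brace or bracket.
allowed_tokens = {"{", "["}

forced = constrained_step(model_probs, allowed_tokens)
print(forced)
```

In this toy step the model put 90% of its probability on words it is not allowed to say, so the choice is made among options it considered unlikely. Repeated over many steps, this is the kind of distortion the article describes.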

The Paper's Solution: The "Draft-Then-Build" Method

The authors of this paper propose a new way to work, which they call Draft-Conditioned Constrained Decoding (DCCD). Instead of forcing the builder to follow the rules while they are thinking, they split the job into two distinct steps.

Step 1: The "Messy Draft" (Unconstrained Generation)

First, you tell the Creative Builder: "Ignore the rules for a second. Just think out loud and write down your solution however you want. Don't worry about commas, brackets, or JSON format. Just get the right answer."

The builder happily writes a long, messy, but correct explanation: "Okay, so if Janet has 16 eggs, eats 3, and bakes with 4, she has 9 left. 9 times 2 dollars is 18 dollars. The answer is 18."

Why this helps: The builder is now free to use their full brainpower to solve the problem without being distracted by the strict rules. They produce a high-quality "semantic plan."

Step 2: The "Strict Translation" (Constrained Decoding)

Now, you take that messy draft and hand it to the Strict Architect. You say: "Okay, Architect, look at this draft. Your only job is to translate this messy text into a perfect JSON house. You must follow the rules, but you already know the answer is 18, so you just need to fit it into the box."

Because the Architect already knows the answer (thanks to the draft), they don't have to guess. They simply format the known correct answer into the strict structure.

The Result: You get a house that is both structurally perfect (valid JSON) and semantically correct (the right math answer).
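The two steps can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the authors' code: the function dccd, the prompt layout, and both toy "models" are hypothetical stand-ins. A real system would call a language model for the draft and a grammar- or schema-constrained decoder for the second step.

```python
import json
import re
from typing import Callable


def dccd(question: str,
         draft_model: Callable[[str], str],
         constrained_model: Callable[[str], str]) -> str:
    """Step 1: write a free-form draft. Step 2: format it under constraints."""
    draft = draft_model(question)  # no format rules here
    # Assumed prompt layout: the draft is placed in the context of step 2.
    prompt = f"Question: {question}\nDraft: {draft}\nJSON:"
    return constrained_model(prompt)  # format rules enforced here


def toy_draft_model(question: str) -> str:
    """Stand-in for an unconstrained model call; returns a canned draft."""
    return ("16 - 3 - 4 = 9 eggs left. 9 times 2 dollars is 18 dollars. "
            "The answer is 18.")


def toy_constrained_model(prompt: str) -> str:
    """Stand-in for a constrained decoder: copies the last number it sees
    in the prompt into a fixed JSON shape."""
    numbers = re.findall(r"\d+", prompt)
    return json.dumps({"answer": numbers[-1]})


question = ("Janet has 16 eggs, eats 3, bakes with 4, and sells the rest "
            "for 2 dollars each. How much does she make?")
result = dccd(question, toy_draft_model, toy_constrained_model)
print(result)
```

The design point is that the second call sees the draft in its prompt, so the constrained step only has to transcribe an answer that is already on the page.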

Why This Works: The "Feasible Mass" Analogy

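The paper uses a fancy math term called "Feasible Mass." Roughly, it is the share of the model's probability that lands on words the rules allow at a given step. Let's call it "The Probability of Success."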

  • Old Way (Standard Constrained Decoding): The builder is trying to guess the answer while following strict rules. At every step, the rules might block, say, 90% of the possible words the builder wants to say. The builder is forced to pick from the small share that is allowed, even if those words lead toward a wrong answer. It's like trying to drive a car while someone keeps changing the road signs: the driver (the AI) gets lost.
  • New Way (DCCD): The builder first figures out the destination (the draft). Now, when the Architect steps in to format the route, the destination is already clear. The "road signs" (the rules) no longer confuse the driver because the driver already knows where they are going. The probability of picking the right word skyrockets because the context is already set.
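The contrast between the two bullets can be illustrated with a toy calculation. It assumes that "feasible mass" at one step means the total probability the model assigns to tokens the rules allow; the numbers and the function name feasible_mass are invented for illustration and are not results from the paper.

```python
def feasible_mass(probs: dict, allowed: set) -> float:
    """Total probability the model places on tokens the rules allow."""
    return sum(p for tok, p in probs.items() if tok in allowed)


# Made-up rule for this step: the output must open with a brace.
allowed = {"{"}

# Made-up distributions for the same step, without and with a draft in context.
without_draft = {"The": 0.70, "So": 0.20, "{": 0.10}
with_draft = {"The": 0.03, "So": 0.02, "{": 0.95}

print(feasible_mass(without_draft, allowed))
print(feasible_mass(with_draft, allowed))
```

When the feasible mass is high, the rules rarely override what the model already wanted to say, so enforcing them changes very little.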

The Key Benefits

  1. Better Accuracy: By separating "thinking" from "formatting," the AI makes fewer mistakes. In the paper's tests, small AI models (like a 1-billion-parameter model) jumped from getting 15% of answers right to 39% right just by using this method.
  2. Cheaper: You don't need a massive, expensive AI to do this. You can use a small AI to write the draft and an even smaller AI to do the formatting. This is like hiring a junior builder to work out the plan and an apprentice to copy it onto the official form, rather than hiring a super-expensive master builder for the whole job.
  3. No Training Needed: You don't have to re-teach the AI anything. You just change how you ask it to work (the two-step process).

Summary Analogy: The Essay vs. The Form

Imagine you are applying for a visa.

  • The Old Way: You try to write your life story directly into the tiny, rigid boxes on the official government form. You run out of space, you miss a letter, and your application gets rejected because you couldn't fit your story into the boxes.
  • The DCCD Way: You first write your life story on a blank piece of paper (the Draft). You make sure it's perfect, detailed, and correct. Then, you take that perfect story and carefully copy it into the official form boxes (the Constraint).

The paper's results suggest that copying a finished story into a form is much easier and more accurate than trying to write the story inside the form in the first place.