BrepCoder: A Unified Multimodal Large Language Model for Multi-task B-rep Reasoning

BrepCoder is a unified Multimodal Large Language Model that treats industry-standard Boundary Representation (B-rep) data as structural code. This lets a single model handle diverse CAD tasks, such as completion, error correction, and question answering, using a two-stage training strategy: geometric feature learning first, then task generalization.

Mingi Kim, Yongjun Kim, Jungwoo Kang, Hyungki Kim

Published 2026-03-03

Imagine you are a master architect who has spent a lifetime building incredible 3D structures. You have a library of blueprints, but they are written in a secret, complex code that only a few people understand.

For a long time, computers trying to learn from these blueprints had to be trained as specialists. If you wanted a computer to fix a broken wall, you built a "Wall-Fixer Robot." If you wanted to finish a half-built house, you built a "House-Completer Robot." These robots were great at their one job, but they couldn't talk to each other, and if you gave them a new task, they were useless.

BrepCoder is like a Universal Genius Apprentice that changes the game. Instead of being a specialist, it's a single, smart model that can do everything: fix errors, finish designs, answer questions, and even reverse-engineer a finished building back into its original blueprint.

Here is how it works, broken down into simple concepts:

1. The Language Problem: Speaking "Code" instead of "Pictures"

Many AI models look at 3D objects as clouds of dots (like a digital spray-paint job) or flat pictures. But in real-world engineering, designs are stored as B-reps (Boundary Representations). Think of a B-rep not as a picture, but as a mathematical recipe that defines exactly how every curve, edge, and surface connects.
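
To make the "recipe" idea concrete, here is a toy sketch of the connectivity half of a B-rep for a unit cube: which vertices, edges, and faces exist and how they touch. It is an illustration only, not the format BrepCoder consumes. The names `vertices`, `edges`, `faces`, and `edges_of` are made up for this example, and a real B-rep would also store the geometry of each surface and curve.

```python
from itertools import product

# Vertices: the 8 corners of a unit cube.
vertices = list(product((0, 1), repeat=3))

# Edges: pairs of corners that differ in exactly one coordinate.
edges = [
    (a, b)
    for i, a in enumerate(vertices)
    for b in vertices[i + 1:]
    if sum(x != y for x, y in zip(a, b)) == 1
]

# Faces: for each axis and each side (0 or 1), the 4 corners lying on that plane.
faces = {
    (axis, value): [v for v in vertices if v[axis] == value]
    for axis, value in product(range(3), (0, 1))
}


def edges_of(face_key):
    """Return the edges that bound one face (both endpoints lie on the face)."""
    corners = set(faces[face_key])
    return [e for e in edges if e[0] in corners and e[1] in corners]


# Every edge of a closed solid is shared by exactly two faces.
faces_per_edge = [sum(1 for f in faces if e in edges_of(f)) for e in edges]

# Euler's formula for a simple closed solid: V - E + F = 2.
euler = len(vertices) - len(edges) + len(faces)
print(len(vertices), len(edges), len(faces), euler)  # 8 12 6 2
```

The connectivity is exact rather than estimated: each face has four bounding edges, each edge joins two faces, and the counts satisfy Euler's formula.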

The problem is that standard AI doesn't speak "Mathematical Recipe."

  • The Old Way: Trying to teach the AI to recognize a dot-cloud. It's like trying to teach a chef to cook by showing them a blurry photo of the meal.
  • The BrepCoder Way: The authors realized that a CAD (Computer-Aided Design) model is, in effect, a program. They translated these complex 3D recipes into Python-like code (a format that language models already read well).
    • Analogy: Instead of showing the AI a picture of a chair, they gave it the instruction manual: "Take a square, cut a hole here, and glue a leg there."
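
As a rough picture of what "CAD as Python-like code" could look like, the sketch below turns a short list of modeling steps into code text. The step classes (`Rect`, `Circle`, `Extrude`) and the emitted function names (`rect`, `circle`, `extrude`) are hypothetical and chosen for readability; the paper's actual code format may differ.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Rect:
    width: float
    height: float


@dataclass
class Circle:
    cx: float
    cy: float
    radius: float


@dataclass
class Extrude:
    profile: Union[Rect, Circle]
    depth: float
    op: str  # "add" creates material, "cut" removes it


def to_code(steps):
    """Render a list of Extrude steps as Python-like text (hypothetical format)."""
    lines = []
    for i, step in enumerate(steps):
        p = step.profile
        if isinstance(p, Rect):
            lines.append(f"p{i} = rect(width={p.width}, height={p.height})")
        elif isinstance(p, Circle):
            lines.append(f"p{i} = circle(cx={p.cx}, cy={p.cy}, radius={p.radius})")
        else:
            raise TypeError(f"unsupported profile: {p!r}")
        lines.append(f"extrude(p{i}, depth={step.depth}, op='{step.op}')")
    return "\n".join(lines)


# "Take a square, cut a hole here": a plate with one through-hole.
steps = [
    Extrude(Rect(40.0, 40.0), depth=5.0, op="add"),
    Extrude(Circle(20.0, 20.0, 4.0), depth=5.0, op="cut"),
]
code = to_code(steps)
print(code)
# p0 = rect(width=40.0, height=40.0)
# extrude(p0, depth=5.0, op='add')
# p1 = circle(cx=20.0, cy=20.0, radius=4.0)
# extrude(p1, depth=5.0, op='cut')
```

The point of a text form like this is that every dimension and operation is spelled out as a token a language model can read, rather than implied by pixels or points.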

2. The Two-Stage Training: "Apprentice" then "Master"

The model learns in two distinct phases, like a student going to school.

Stage 1: The Reverse Engineering School (Learning the Logic)
Imagine giving the apprentice a finished, beautiful sculpture and asking them to write down the exact steps to build it from scratch.

  • The model looks at the 3D shape (the B-rep) and tries to write the code (the Python instructions) that created it.
  • By doing this thousands of times, the model doesn't just memorize shapes; it learns the logic of design. It understands why a curve exists and how a hole was cut. It internalizes the "grammar" of engineering.

Stage 2: The Multi-Task Intern (Applying the Logic)
Now that the model understands the logic, it can do anything:

  • Completion: You give it the first half of a program (a half-built house), and it finishes the rest.
  • Error Correction: You give it code with a mistake (like a wall built in the wrong place), and it fixes the code to match the intended shape.
  • CAD-QA: You ask, "How many windows are in this room?" and it counts them by "reading" the code, not just guessing from a picture.
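
One way to picture "one model, many tasks" is that each task becomes an input/target pair in the same text format. The sketch below builds such pairs for the three tasks above from a single snippet of CAD-style code. Everything here is an assumption for illustration: the snippet `CODE`, the helper names, and the dictionary layout are hypothetical, and the B-rep input that the model would also receive is omitted.

```python
import re

# Hypothetical CAD-style code for a plate with one through-hole.
CODE = """p0 = rect(width=40.0, height=40.0)
extrude(p0, depth=5.0, op='add')
p1 = circle(cx=20.0, cy=20.0, radius=4.0)
extrude(p1, depth=5.0, op='cut')"""


def completion_example(code, keep_lines):
    """Completion: show the first lines, ask for the rest."""
    lines = code.splitlines()
    return {
        "task": "completion",
        "input": "\n".join(lines[:keep_lines]),
        "target": "\n".join(lines[keep_lines:]),
    }


def correction_example(code, right, wrong):
    """Error correction: inject one known mistake, target is the original code."""
    corrupted = code.replace(right, wrong, 1)
    return {"task": "error_correction", "input": corrupted, "target": code}


def count_cuts(code):
    """Answer a question by reading the code: count the cut operations."""
    return len(re.findall(r"op='cut'", code))


def qa_example(code):
    """CAD-QA: attach a question, target is the answer derived from the code."""
    question = "# Q: How many cut operations are in this model?"
    return {
        "task": "cad_qa",
        "input": code + "\n" + question,
        "target": str(count_cuts(code)),
    }


examples = [
    completion_example(CODE, keep_lines=2),
    correction_example(CODE, right="radius=4.0", wrong="radius=40.0"),
    qa_example(CODE),
]
for ex in examples:
    print(ex["task"], "->", repr(ex["target"][:40]))
```

Because all three pairs share one format, a single model can in principle be trained on a mix of them, which is the idea behind the multi-task stage described here.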

3. Why is this a Big Deal?

  • One Brain, Many Jobs: Before this, you needed a different AI for every task. Now, you have one "Universal Agent" that can switch hats instantly.
  • It Speaks the Industry's Language: Because it uses the actual industry-standard format (B-rep) and translates it to code, it doesn't lose information. It's like speaking to a human in their native tongue rather than a broken translation.
  • It's Smarter than "Guessing": Some other models look at a 2D image and guess the 3D shape (which often leads to errors). BrepCoder looks at the structural "code" of the object, so it works from the stated dimensions and connections instead of estimating them.

The Bottom Line

BrepCoder is a bridge between the messy, complex world of 3D engineering and the logical, powerful world of Large Language Models. By teaching the AI to see 3D designs as code, the authors have created a "General Purpose CAD Agent": one model that can complete, repair, and answer questions about designs, without needing a separate specialist for each job.

It's the difference between giving a robot a photo of a cake and asking it to bake one, versus giving it the recipe and letting it bake the perfect cake every time.
