CADSmith: Multi-Agent CAD Generation with Programmatic Geometric Validation

Imagine you want to build a custom piece of furniture, like a complex bookshelf with specific dimensions, curved shelves, and precise holes for bolts. You ask a very smart, but slightly scatterbrained, AI assistant to write the instructions for a robot to build it.

In the past, if you asked an AI to do this, it would try to write the instructions once, hand them to the robot, and hope for the best.

The Problem: The AI often gets the math wrong (the shelf is 2 inches too short) or the logic wrong (the robot tries to glue a piece that doesn't exist). The robot might crash, or build something that looks okay from a distance but falls apart if you touch it.
The Old Way: You'd look at the broken shelf, say "Oops," and ask the AI to try again from scratch. It's slow, and the AI keeps making the same mistakes because it doesn't "know" exactly why it failed.

CADSmith is like hiring a specialized construction team instead of a single assistant. The team works in a loop, checking and correcting the design until it passes inspection. Here is how they do it, using a simple analogy:

The Team (The Multi-Agent Pipeline)

Instead of one person doing everything, CADSmith splits the job into five specialized roles:

The Architect (Planner): You tell the team, "I need a bookshelf." The Architect doesn't build anything; they just break your request down into a clear, step-by-step blueprint. They make sure everyone understands the exact measurements (e.g., "The shelf must be exactly 50mm wide").
The Draftsman (Coder): This person writes the actual code (the instructions) for the robot. But they don't just guess! They have a library of manuals (Retrieval-Augmented Generation) right next to them. If they need to know how to cut a specific curve, they look it up in the manual instead of making it up. This stops them from "hallucinating" fake tools.
The Foreman (Executor): This is the robot that tries to run the instructions.
- Inner Loop: If the robot trips over its own feet (a coding error), the Foreman stops, reads the error message, and asks the Fixer (the last role in this list) to rewrite the instructions. They try this up to three times until the robot can actually start moving.
The Inspector (Validator): Once the robot builds the shelf, the Inspector checks it. This is the magic part. The Inspector has two superpowers:
- The Laser Tape Measure (Programmatic Validation): It uses a super-precise digital ruler (OpenCASCADE) to measure the shelf. Is it exactly 50mm? Is the volume correct? Is it watertight (a closed solid with no gaps in its surface)? This gives exact numbers.
- The Human Eye (Vision Judge): It also looks at rendered images of the shelf from three different angles. Sometimes the numbers look right but the shape looks wrong (like a shelf that is the right size but has the wrong number of holes). The "Human Eye" catches these visual mistakes that a ruler might miss.
The Fixer (Refiner): If the Inspector finds a problem (e.g., "The shelf is 2mm too short" or "There are only 4 holes instead of 6"), the Fixer doesn't just say "Try again." The Fixer gives the Draftsman a specific note: "The laser says you are 2mm short; the photo shows the holes are in the wrong place." The Draftsman then fixes the code, and the Foreman tries again.
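
To make the Inspector-to-Fixer handoff concrete, here is a minimal Python sketch of turning exact measurements into specific notes. It is an illustration only, not the CADSmith code: the Measurement fields, the inspect function, and the 1% tolerance are all assumptions, and the real system would obtain these numbers from OpenCASCADE rather than receive them as plain values.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Measurement:
    """Hypothetical summary of what a geometry kernel reports for a part."""
    width_mm: float
    volume_mm3: float
    is_watertight: bool


def inspect(measured: Measurement, target: Measurement, tol: float = 0.01) -> list[str]:
    """Return specific notes for the Fixer; an empty list means the part passes.

    tol is a relative tolerance (1% by default, an assumed value).
    """
    notes = []
    if not measured.is_watertight:
        notes.append("Solid is not watertight: the surface has gaps.")

    width_gap = measured.width_mm - target.width_mm
    if abs(width_gap) > tol * target.width_mm:
        side = "short" if width_gap < 0 else "long"
        notes.append(f"Width is {abs(width_gap):.1f} mm too {side}.")

    volume_gap = abs(measured.volume_mm3 - target.volume_mm3)
    if volume_gap > tol * target.volume_mm3:
        notes.append(f"Volume differs from target by {volume_gap:.0f} mm^3.")
    return notes


if __name__ == "__main__":
    target = Measurement(width_mm=50.0, volume_mm3=10000.0, is_watertight=True)
    built = Measurement(width_mm=48.0, volume_mm3=9600.0, is_watertight=True)
    for note in inspect(built, target):
        print(note)
```

The point of the design is that each note names the quantity and the size of the error, so the next attempt has something concrete to correct instead of a bare "try again."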

The "Double-Loop" Safety Net

The system has two safety nets working together:

The Inner Loop: Fixes "Can I run this code?" errors. (e.g., "You used a tool that doesn't exist.")
The Outer Loop: Fixes "Is this the right object?" errors. (e.g., "You built a chair, but I asked for a table, even though the code ran fine.")
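
The two loops nest, which is easier to see in code. The sketch below is a rough Python outline under stated assumptions, not the authors' implementation: build_part and its four callback parameters are hypothetical names, the outer-loop limit of five rounds is invented for illustration (the summary gives no number), and reading "up to three times" as three repair attempts after the first run is also an assumption.

```python
from __future__ import annotations

from typing import Callable, Optional

MAX_EXEC_RETRIES = 3   # from the text: up to three tries to get the code running
MAX_REFINE_ROUNDS = 5  # assumption: the summary does not state an outer-loop limit


def build_part(
    generate: Callable[[str], str],            # Draftsman: request -> code
    execute: Callable[[str], Optional[str]],   # Foreman: code -> error text, or None if it ran
    validate: Callable[[str], list],           # Inspector: code -> list of notes (empty = pass)
    refine: Callable[[str, str], str],         # Fixer: (code, feedback) -> revised code
    request: str,
) -> Optional[str]:
    """Return code that runs and passes inspection, or None if the limits are hit."""
    code = generate(request)
    for _round in range(MAX_REFINE_ROUNDS):
        # Inner loop: "Can I run this code?"
        error = execute(code)
        for _attempt in range(MAX_EXEC_RETRIES):
            if error is None:
                break
            code = refine(code, error)
            error = execute(code)
        if error is not None:
            return None  # still crashing after all repair attempts

        # Outer loop: "Is this the right object?"
        notes = validate(code)
        if not notes:
            return code
        code = refine(code, "; ".join(notes))
    return None
```

Keeping the execution check inside the validation round means every revision prompted by the Inspector is re-run before it is inspected again, so a fix for a shape error cannot silently introduce a crash.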

Why This Matters (The Results)

The researchers tested this team against a "solo" AI (the old way) using 100 different design challenges, from simple blocks to complex engine parts.

The Solo AI: Failed to build anything 5% of the time. When it did build something, it was often slightly off (like a shelf that wobbles).
The CADSmith Team:
- 100% Success Rate: On all 100 challenges, the team produced a buildable object.
- Precision: The difference between the requested size and the built size dropped from being "off by a lot" to being "off by less than a millimeter."
- The "Eye" Factor: When they removed the "Human Eye" (the photos) and only used the "Laser Tape Measure," the team got confused on complex shapes. The numbers looked right, but the shape was wrong. The photos were essential to catch those subtle mistakes.

The Bottom Line

CADSmith is like upgrading from asking a single person to build a house in the dark to hiring a team with a blueprint, a library of building codes, a robot foreman, and an inspector with both a laser measure and a keen eye.

By combining exact math (the laser) with visual common sense (the eye) and letting them push back on the builder until the part passes inspection, CADSmith turns "maybe it works" into "it has been measured, inspected, and corrected." This is a big step toward letting AI help engineers and makers create real, usable 3D parts without needing to be coding experts themselves.
