WildSVG: Towards Reliable SVG Generation Under Real-World Conditions

This paper introduces the WildSVG Benchmark, comprising real-world and synthetic datasets, to address the lack of evaluation resources for SVG extraction. It finds that current multimodal models struggle with noisy natural images, but that iterative refinement methods offer a promising path toward reliable performance.

Marco Terral, Haotian Zhang, Tianyang Zhang, Meng Lin, Xiaoqing Xie, Haoran Dai, Darsh Kaushik, Pai Peng, Nicklas Scharpff, David Vazquez, Joan Rodriguez

Published 2026-02-26

Imagine you have a beautiful, hand-drawn logo on a piece of paper. Now, imagine that paper is crumpled, stained with coffee, sitting on a busy street, and partially covered by a pigeon's shadow.

The Challenge:
Your goal is to take a photo of this messy, real-world scene and magically turn it back into a perfect, clean, digital drawing file (an SVG, short for Scalable Vector Graphics) that a graphic designer can edit, resize, and use anywhere.

This is the problem the paper "WildSVG" tackles.

The Problem: The "Clean Room" vs. The "Wild"

For a long time, AI models have been really good at drawing vector graphics, but only if you give them a clean, perfect image (like a photo of a logo on a white background) or a text description like "draw a red apple."

But the real world isn't a clean room. It's "wild."

  • The Wild: Logos are on billboards in the rain, on t-shirts with wrinkles, or on cars moving fast. They are blurry, distorted, and surrounded by clutter.
  • The Failure: When you ask current AI models to look at a messy photo and recreate the logo, they get confused. They might draw the whole background, miss the logo entirely, or create a messy digital sketch that looks nothing like the original.

The Solution: Introducing "WildSVG"

The researchers realized there was no "test" for this specific skill. It's like expecting a driver to race in a blizzard when they've only ever practiced on a sunny, empty track.

To fix this, they built WildSVG, a new benchmark with two types of "driving courses":

  1. The "Natural" Course (Real Life): They took thousands of real photos of company logos from the internet (like a Starbucks cup on a rainy sidewalk) and paired them with the perfect digital version of that logo. This is the "messy" test.
  2. The "Synthetic" Course (Fake but Hard): They took perfect digital logos and used AI to paste them into realistic, messy scenes (like a logo on a crumpled soda can). This lets them test the AI with specific, known difficulties.

The Experiment: Who is the Best Artist?

The researchers put leading multimodal AI models (like GPT-5, Claude, and Gemini) through this test. They asked the AIs: "Look at this messy photo. Ignore the background. Give me the clean digital code for just the logo."
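To make the task concrete, here is a rough Python sketch of what "give me the clean digital code" means in practice: the model's reply has to contain SVG markup that actually parses. The prompt wording, the `extract_svg` helper, and the sample replies are all hypothetical illustrations, not the paper's actual evaluation code.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical prompt, paraphrasing the instruction described above.
PROMPT = ("Look at this messy photo. Ignore the background. "
          "Give me the clean SVG code for just the logo.")

def extract_svg(reply):
    """Return the first <svg>...</svg> span in a reply if it parses as XML, else None."""
    match = re.search(r"<svg\b.*?</svg>", reply, flags=re.DOTALL | re.IGNORECASE)
    if match is None:
        return None  # the model never produced any SVG markup
    try:
        root = ET.fromstring(match.group(0))
    except ET.ParseError:
        return None  # markup is present but malformed
    # The root tag may carry a namespace prefix such as "{http://www.w3.org/2000/svg}svg".
    if root.tag.split("}")[-1].lower() != "svg":
        return None
    return match.group(0)

# Made-up replies standing in for real model output.
good_reply = ('Here is the logo: <svg xmlns="http://www.w3.org/2000/svg" '
              'viewBox="0 0 10 10"><rect width="10" height="10" fill="red"/></svg>')
bad_reply = 'Here you go: <svg><path d="M0 0"</svg>'

print(extract_svg(good_reply) is not None)  # well-formed SVG was found
print(extract_svg(bad_reply) is not None)   # broken markup is rejected
```

Passing this check only means the output is valid SVG. It says nothing about whether the drawing resembles the logo, which is the harder part of the test.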

Here's what they found:

  • The "Semantic" Trap: Most AIs are like students who memorized the idea of a logo but can't draw it perfectly. If asked to draw the "Heineken" logo, the AI might write the word "Heineken" in a font that looks kind of like the real one. It gets the "meaning" right (the semantic similarity is high), but the actual drawing is wrong (the pixel accuracy is low).
  • The "StarVector" Glitch: One specific model, trained to be a vector expert, got so confused by the messy background that it tried to draw the entire photo instead of just the logo. It forgot to listen to the instructions!
  • The "Iterative" Hope: The best results came from models that didn't just guess once. They used a "try, check, and fix" method. The AI draws a draft, looks at it, says "Oops, that curve is wrong," and fixes it. This "iterative refinement" is the most promising path forward.
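The "try, check, and fix" idea can be sketched as a small loop. This is a toy illustration under heavy assumptions, not the method used in the paper: the "logo" is a single circle, the "check" is a pixel-level mean squared error against the target, and the "fix" is a simple greedy search over one-pixel tweaks standing in for a model's self-critique. All function names here are hypothetical.

```python
from itertools import product

SIZE = 32  # side length of the toy raster, in pixels (an arbitrary choice)

def rasterize(cx, cy, r, size=SIZE):
    """Render a filled circle as a size x size grid of 0/1 pixels."""
    return [[1 if (x - cx) ** 2 + (y - cy) ** 2 <= r * r else 0
             for x in range(size)] for y in range(size)]

def pixel_error(a, b):
    """Mean squared error between two equally sized grids."""
    n = len(a) * len(a[0])
    return sum((pa - pb) ** 2
               for ra, rb in zip(a, b) for pa, pb in zip(ra, rb)) / n

def refine(target, guess, max_rounds=50):
    """Draw, check, fix: accept any one-pixel tweak that lowers the error."""
    best = guess
    best_err = pixel_error(rasterize(*best), target)
    for _ in range(max_rounds):
        improved = False
        for dcx, dcy, dr in product((-1, 0, 1), repeat=3):
            cand = (best[0] + dcx, best[1] + dcy, max(1, best[2] + dr))
            err = pixel_error(rasterize(*cand), target)  # the "check" step
            if err < best_err:                           # the "fix" step
                best, best_err, improved = cand, err, True
        if not improved:
            break  # no tweak helps any more, so stop refining
    return best, best_err

def to_svg(cx, cy, r, size=SIZE):
    """Serialize the circle parameters as a minimal SVG document."""
    return (f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {size} {size}">'
            f'<circle cx="{cx}" cy="{cy}" r="{r}"/></svg>')

target = rasterize(20, 12, 7)   # stands in for the logo seen in the photo
first_guess = (16, 16, 4)       # a rough one-shot draft
first_err = pixel_error(rasterize(*first_guess), target)
params, final_err = refine(target, first_guess)

print("error of first draft:", first_err)
print("error after refining:", final_err)
print(to_svg(*params))
```

The one-shot draft already has the right idea (a circle), much like the "semantic" trap above, but its pixel error is high. Each round of checking against the target can only keep or lower that error, which is why refinement tends to beat a single guess.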

The Verdict

Currently, even the smartest AIs are like novice artists trying to copy a masterpiece while wearing foggy glasses. They can get the general shape and color, but they struggle with the fine details and the messy real-world context.

Why does this matter?
If we can solve this, we could:

  • Turn a photo of a hand-drawn sketch into a professional vector file instantly.
  • Help designers find and edit logos from old, messy archives.
  • Automate the creation of digital assets from the real world.

In short: The paper says, "We built the first real-world test for AI vector drawing, and while the AI is getting smarter, it still has a long way to go before it can reliably turn our messy world into clean, editable digital art."
