Leveraging Contrastive Learning for a Similarity-Guided Tampered Document Data Generation Pipeline

This paper proposes a novel data generation pipeline that leverages contrastive learning and auxiliary networks to produce diverse, high-quality tampered document images, thereby overcoming the limitations of existing rule-based methods and significantly improving the performance and generalizability of text forgery detection models.

Mohamed Dhouib, Davide Buscaldi, Sonia Vanier, Aymen Shabou

Published 2026-02-20

Imagine you are a master art forger trying to create a fake painting that looks so real, even the experts can't tell the difference. Now, imagine you are a detective trying to catch that forger.

The problem is, to train your detective to spot the fakes, you need to show them thousands of example forgeries. But here's the catch: real forgeries are rare and hard to find. If you try to make forgeries yourself using simple computer rules, they usually look terrible, like a child pasting a sticker onto a photo. The edges are jagged, the colors don't match, and the "glue" is visible. If you train your detective on these obvious fakes, they will become lazy. They'll learn to just look for "weird edges" and fail when they see a professional forgery with no visible seams.

This paper is about building a super-smart factory that can automatically create thousands of "perfect" fake documents to train the best detectives in the world.

Here is how they did it, explained simply:

1. The Problem: The "Bad Copy-Paste" Factory

Previous methods for making fake documents were like using a blunt knife to cut a picture out of a magazine.

  • The Result: The cut was messy. You could see the white paper underneath, or the font looked slightly different.
  • The Consequence: The AI detectives learned to spot these messy cuts, but when a real criminal used a high-end scanner and Photoshop to make a clean fake, the AI was often fooled.

2. The Solution: Two Specialized "Quality Control" Robots

The authors built a new factory with two special robots (neural networks) that act as quality inspectors before a fake document is ever made.

Robot A: The "Eye for Detail" (The Similarity Network)

Imagine you are trying to replace a paragraph in a letter with a paragraph from another letter.

  • The Old Way: You just grab the text and paste it. If the font is slightly different or the paper is a different shade of white, it looks fake.
  • Robot A's Job: This robot is trained (using contrastive learning, the technique named in the paper's title) to be a super-observer. Before it lets you paste a piece of text, it checks: "Does this font match the neighbors? Is the background color the same? Is the text aligned properly?"
  • The Analogy: Think of it like a matchmaker. It doesn't just look for "text"; it looks for the perfect soulmate for the empty space. It ensures the new text blends in so seamlessly that it looks like it was always there.
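To make Robot A a little more concrete, here is a minimal sketch of the general idea, not the authors' actual network. It assumes each text crop has already been turned into a list of numbers (an "embedding") by some model. The function names `cosine`, `info_nce`, and `best_match`, the InfoNCE-style loss, the temperature, and the 0.9 threshold are all illustrative assumptions; this summary does not say which loss or threshold the paper uses.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss (an assumed choice of objective).

    anchors[i] and positives[i] are embeddings of two crops that should look
    alike; every other positives[j] serves as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)

def best_match(target, candidates, threshold=0.9):
    """Index of the candidate most similar to the target, or None if none is close enough."""
    if not candidates:
        return None
    scores = [cosine(target, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] >= threshold else None

# Toy embeddings: the second candidate points almost the same way as the target.
target_crop = [1.0, 0.0]
candidate_crops = [[0.0, 1.0], [1.0, 0.1]]
chosen = best_match(target_crop, candidate_crops)
```

The loss is low when matching crops sit close together and mismatched crops sit far apart, which is what lets a similarity score act as the "matchmaker" at generation time.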

Robot B: The "Scissors Expert" (The Bounding Box Network)

Imagine you are cutting a shape out of a piece of paper.

  • The Old Way: You might cut too close, slicing off the top of a letter "A," or you might leave a bit of the neighbor's letter "B" attached. This creates a jagged, obvious scar.
  • Robot B's Job: This robot checks the "cutting lines" (the bounding box). It asks: "Did you cut through the middle of a letter? Did you accidentally include a piece of the next word?"
  • The Analogy: Think of it as a precision surgeon. It ensures the incision is clean and doesn't damage the surrounding tissue. If the cut is messy, the robot rejects it and asks for a better cut.
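Robot B in the paper is a learned network, but the kind of mistake it looks for can be illustrated with a hand-written rule. The sketch below is only a stand-in under simple assumptions: the page is a grid of 0s (paper) and 1s (ink), and a box is `(top, left, bottom, right)` with the bottom and right edges excluded. The names `stroke_crossings` and `is_clean_cut` are hypothetical.

```python
def stroke_crossings(ink, box):
    """Count border pixels where ink continues from inside the box to outside it.

    ink: 2D list of 0/1 values (1 means ink).
    box: (top, left, bottom, right), covering rows [top, bottom) and cols [left, right).
    A non-zero count suggests the cut slices through a pen stroke.
    """
    top, left, bottom, right = box
    height, width = len(ink), len(ink[0])

    def is_ink(y, x):
        return 0 <= y < height and 0 <= x < width and ink[y][x] == 1

    crossings = 0
    for x in range(left, right):
        # Top edge: ink just inside and just above the box.
        crossings += int(is_ink(top, x) and is_ink(top - 1, x))
        # Bottom edge: ink just inside and just below the box.
        crossings += int(is_ink(bottom - 1, x) and is_ink(bottom, x))
    for y in range(top, bottom):
        # Left and right edges, same idea.
        crossings += int(is_ink(y, left) and is_ink(y, left - 1))
        crossings += int(is_ink(y, right - 1) and is_ink(y, right))
    return crossings

def is_clean_cut(ink, box):
    """True when no stroke is cut by the box border."""
    return stroke_crossings(ink, box) == 0

# A 2x2 blob of ink (a "letter") next to a thin vertical stroke.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
whole_letter = (1, 1, 3, 3)  # box around the entire blob
half_letter = (1, 1, 3, 2)   # box that slices the blob down the middle
```

A rule like this only catches sliced strokes. A trained network can also judge subtler problems, such as a box that swallows part of a neighboring word.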

3. The Factory Process

When the factory wants to create a fake document, it follows a strict routine:

  1. Pick a spot: Find a place in a document to tamper with (e.g., change a date or a name).
  2. Find a candidate: Look for a piece of text from another document that could fit there.
  3. Robot A checks: "Does this text look like it belongs here?" (Checks color, font, lighting).
  4. Robot B checks: "Is the cut clean?" (Ensures no letters are sliced in half).
  5. If both say "Yes": The factory pastes the text. The result is a fake document with no obvious seams for the human eye to catch.
  6. If either says "No": The factory throws it away and tries again.
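The routine above is essentially an accept-or-reject loop. Here is a minimal sketch of that control flow, assuming the two robots are available as plain functions. Everything named here (`pick_replacement`, `similarity_score`, `cut_is_clean`, the 0.8 threshold, the retry limit, and the dictionary fields in the toy data) is a hypothetical placeholder, not the paper's code.

```python
def pick_replacement(target, candidates, similarity_score, cut_is_clean,
                     threshold=0.8, max_tries=10):
    """Return the first candidate that both checks accept, or None.

    similarity_score(target, candidate) -> float  (stand-in for Robot A)
    cut_is_clean(candidate) -> bool               (stand-in for Robot B)
    """
    for candidate in candidates[:max_tries]:
        if similarity_score(target, candidate) < threshold:
            continue  # Robot A says "No": it would not blend in
        if not cut_is_clean(candidate):
            continue  # Robot B says "No": the crop slices a letter
        return candidate  # both say "Yes": safe to paste
    return None  # every attempt rejected: skip this spot

# Toy stand-ins: each candidate carries precomputed answers from the two checks.
candidates = [
    {"text": "2019", "score": 0.40, "clean": True},   # wrong look
    {"text": "2021", "score": 0.95, "clean": False},  # messy cut
    {"text": "2020", "score": 0.90, "clean": True},   # passes both
]
picked = pick_replacement(
    target={"text": "2018"},
    candidates=candidates,
    similarity_score=lambda target, cand: cand["score"],
    cut_is_clean=lambda cand: cand["clean"],
)
```

Passing the two checks in as functions keeps the loop independent of how each robot is built, so a learned model or a simple rule can be swapped in without touching the routine.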

4. The Result: Super Detectives

The authors used this factory to create 2.8 million high-quality fake documents. They then trained five different AI detective models on this data.

When they tested these detectives on real-world forgeries (made by actual humans, not computers), detection improved significantly:

  • The detectives trained on the "perfect fake" data were much better at spotting real forgeries than those trained on the older rule-based fakes.
  • They were less likely to fall back on the "weird edge" shortcut because they had learned what real consistency looks like.

The Big Picture

This paper matters because it tackles the "data scarcity" problem. Instead of waiting for criminals to make forgeries (real examples are rare and hard to collect), we can now simulate them far more realistically.

By using these two "Quality Control Robots," the authors created a training ground that is so realistic, it turns average AI detectives into elite forensic experts. It's like upgrading a driving school from a parking lot with cones to a simulated city with real traffic, ensuring the drivers (AI) are ready for the real world.
