Spread them Apart: Towards Robust Watermarking of Generated Content

This paper proposes a robust, inference-time watermarking method for generative models that embeds detectable and user-identifiable marks into generated images without retraining, offering provable resistance to additive perturbations and synthetic removal attacks.

Mikhail Pautov, Danil Ivanov, Andrey V. Galichin, Oleg Rogov, Ivan Oseledets

Published 2026-03-02

Imagine you just bought a high-end 3D printer that can create incredibly realistic paintings, sculptures, and photos. It's so good that you can't tell the difference between a human artist's work and what your machine spits out.

Now, imagine a problem: A dishonest person uses your machine to print a fake masterpiece, claims they painted it themselves, and sells it. Or, someone uses it to create a "deepfake" of a celebrity saying something they never said. How do you prove who actually made it?

This paper introduces a solution called "Spread them Apart." Think of it as a clever, invisible security tag that gets baked into the image while it is being created, rather than stamped on top afterward.

Here is the breakdown using simple analogies:

1. The Problem: The "Perfect Forgery"

Generative AI (like the ones that make images from text) has gotten so good that fake images look real.

  • The Risk: Without proof of origin, anyone can take an AI image and claim it as their own, violating copyright. Or, they could try to scrub away any hidden proof that it was AI-made.

2. The Solution: The "Invisible DNA"

The authors propose a method to embed a digital watermark directly into the image's "DNA" during the creation process.

  • How it works: When you ask the AI to make a picture, the system doesn't just make the picture; it secretly tweaks the internal math of the image to fit a specific pattern unique to you (the user).
  • The Analogy: Imagine a baker making a cake. Instead of just baking it, they arrange the sprinkles inside the cake in a specific, secret pattern that only they know how to read. If someone tries to eat the cake (or cut it up), the pattern is still there inside.

3. The Secret Sauce: "Spread Them Apart"

This is the core trick of the paper.

  • The Old Way: Usually, watermarks are like a faint signature on the surface of a painting. If you wash the painting or change the lighting, the signature might fade.
  • The New Way: The authors use a strategy called "Spread them Apart."
    • Imagine you have two specific pixels (tiny dots of color) in the image. Let's call them Pixel A and Pixel B.
    • To hide a "1" in your secret code, the AI makes sure Pixel A is slightly brighter than Pixel B.
    • To hide a "0", it makes Pixel B slightly brighter than Pixel A.
    • The Magic: The AI ensures the difference between them is big enough that even if someone tries to blur the image, change the brightness, or add noise, Pixel A will still be brighter than Pixel B. The relationship is "spread apart" enough to survive the attack.
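The pair-ordering idea above can be sketched in a few lines of code. This is a toy illustration under my own assumptions (the function name, the fixed pixel pairs, and the integer `margin` are all invented for this example), not the paper's exact embedding procedure:

```python
import numpy as np

def embed_bits(img, pairs, bits, margin=8):
    """Toy sketch (not the paper's exact procedure): force a brightness
    ordering between each secret pixel pair so that pair k encodes bits[k]."""
    img = img.astype(np.int32)
    for (a, b), bit in zip(pairs, bits):
        # bit 1 -> pixel a must be brighter; bit 0 -> pixel b must be brighter
        hi, lo = (a, b) if bit == 1 else (b, a)
        if img[hi] - img[lo] < margin:
            # "Spread them apart": push the pair so hi exceeds lo by >= margin
            mid = (int(img[hi]) + int(img[lo])) // 2
            img[hi] = mid + (margin + 1) // 2
            img[lo] = mid - margin // 2
    return np.clip(img, 0, 255).astype(np.uint8)

# Usage: hide the 2-bit code [1, 0] in two fixed (hypothetical) pixel pairs
rng = np.random.default_rng(0)
image = rng.integers(0, 256, (4, 4))
pairs = [((0, 0), (0, 1)), ((2, 2), (3, 3))]
marked = embed_bits(image, pairs, [1, 0])
```

The `margin` is the "spread": the bigger it is, the more noise the watermark survives, at the cost of a more visible tweak.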

4. Why It's Hard to Remove (The "Unbreakable Seal")

The paper proves mathematically that this method is robust against common tricks people use to hide watermarks:

  • Brightness/Contrast: A brightness shift moves both pixels up or down by the same amount, so their difference is unchanged; a contrast change scales both pixels by the same factor, so the brighter one stays brighter either way.
  • Flipping Colors: If someone turns the image into a negative (black becomes white), the relationship flips, but the system knows to look for the flip.
  • Adversarial Attacks: Even if a hacker tries to use a super-computer to specifically target and erase this pattern, the math shows it's incredibly difficult to do without ruining the image itself.
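A tiny numeric demonstration of the first point (the specific values are illustrative, not from the paper): as long as the perturbation is smaller than the spread, the ordering of the pair, and hence the hidden bit, survives.

```python
# A watermarked pair encoding bit "1": pixel a is 20 units brighter than b.
a, b = 140, 120

brightened = (a + 30, b + 30)    # brightness shift: difference unchanged
contrasted = (a * 1.5, b * 1.5)  # contrast scaling: difference grows
noisy = (a + 4, b - 3)           # additive noise smaller than the spread

# In every case, a's counterpart is still brighter, so the bit reads "1".
survives = all(pa > pb for pa, pb in (brightened, contrasted, noisy))
```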

5. The "Three-Way" Backup Plan

To make it even stronger, the paper suggests a "belt and suspenders" approach.

  • Instead of just hiding the secret in the pixels, they also hide it in the mathematical "shape" of the image (using frequency patterns).
  • The Analogy: It's like writing a secret message in three places:
    1. On the surface of a rock.
    2. Inside the rock's crystal structure.
    3. In the shadow the rock casts.
    • Even if someone chips off the surface (pixel attack) or melts the rock (geometric attack), the message might still be recoverable from the shadow or the crystal structure.
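The frequency-domain idea can be sketched the same way: enforce an ordering between two transform coefficients instead of two pixels. The choice of FFT, the coefficient positions, and the margin of 10 below are my own assumptions for illustration, not the paper's exact transform:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (8, 8)).astype(float)

F = np.fft.fft2(img)
i, j = (1, 2), (2, 1)            # two (hypothetical) mid-frequency slots

# Hide a "1": make |F[i]| clearly larger than |F[j]|
target = np.abs(F[j]) + 10.0
if np.abs(F[i]) < target:
    k = target / max(np.abs(F[i]), 1e-9)
    F[i] *= k
    F[-i[0], -i[1]] *= k         # boost the mirror coefficient too, so the
                                 # spectrum stays Hermitian and the image real

marked = np.real(np.fft.ifft2(F))  # back to pixel space
R = np.fft.fft2(marked)            # a verifier re-computes the spectrum
```

Because the bit lives in the image's global frequency structure rather than in two specific pixels, it can survive edits that would wipe out any individual pixel pair.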

6. The Result: Who Made It?

When a suspicious image pops up, the owner of the AI system can run a "decoder":

  1. They look at the secret pairs of pixels.
  2. They check the pattern (Is A brighter than B? Or B brighter than A?).
  3. They reconstruct the secret code.
  4. The Verdict: If the code matches User #42, they know User #42 generated it. If the code is gibberish, they know it wasn't made by their system.
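The four decoding steps above reduce to a few lines. This is a toy sketch: the `codebook` dict and exact-match lookup are my own simplifications (the paper's detector relies on statistical guarantees, not exact matching):

```python
def decode_bits(img, pairs):
    """Read one bit per secret pixel pair: 1 if the first pixel of the
    pair is brighter, 0 otherwise."""
    return [1 if img[a] > img[b] else 0 for a, b in pairs]

def identify_user(decoded, codebook, max_errors=0):
    """Match the decoded bits against registered user codes.
    `codebook` maps user id -> bit list (a hypothetical structure)."""
    for user, code in codebook.items():
        errors = sum(x != y for x, y in zip(decoded, code))
        if errors <= max_errors:
            return user
    return None  # gibberish: not generated by this system

# Usage: a dict of pixel values stands in for an image here
image = {(0, 0): 150, (0, 1): 130, (2, 2): 90, (3, 3): 110}
pairs = [((0, 0), (0, 1)), ((2, 2), (3, 3))]
code = decode_bits(image, pairs)
owner = identify_user(code, {42: [1, 0], 7: [0, 0]})
```

In practice `max_errors` would be raised above zero, so the code is still recognized even if an attack manages to flip a few bits.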

Summary

"Spread them Apart" is a way to bake a permanent, hard-to-remove ID card into AI-generated images. It doesn't require retraining the AI; it just tweaks the final steps of creation to ensure that a secret relationship between pixels is strong enough to survive almost any attempt to scrub it away. It's the difference between a sticker on a car (easy to peel off) and the car's VIN stamped into the frame (hard to remove without destroying the car).