Enhancing Authorship Attribution with Synthetic Paintings

This study demonstrates that augmenting limited real artwork datasets with synthetic images generated via DreamBooth fine-tuning of Stable Diffusion significantly improves the accuracy and generalization of computational authorship attribution models.

Clarissa Loures, Caio Hosken, Luan Oliveira, Gianlucca Zuin, Adriano Veloso

Published 2026-03-05

Imagine you are a detective trying to solve a mystery: Who painted this picture?

Usually, art experts solve this by studying the artist's life, the chemicals in the paint, and the specific way they hold a brush. But today, we want to teach computers to do this. The problem? Computers are like hungry students; they need to see thousands of examples to learn a style. But for many famous painters, we only have a handful of paintings left in the world. It's like trying to teach someone to recognize a specific singer's voice by only letting them hear three songs.

This paper is about a clever trick to solve this "not enough data" problem. Here is the story of how they did it, explained simply.

1. The Problem: The "Empty Classroom"

The researchers wanted to teach a computer to distinguish between seven British painters from the 1700s and 1800s. These painters were like neighbors who lived in the same town, wore similar clothes, and painted similar landscapes. They were very hard to tell apart.

To make it worse, the computer only had 7 to 25 photos of each artist to study. That's like trying to learn a new language by reading just a few pages of a dictionary. The computer kept getting confused.

2. The Solution: The "AI Photocopier"

Instead of waiting for more real paintings to surface (which isn't going to happen), the researchers decided to invent new ones.

They used a powerful AI tool called Stable Diffusion, fine-tuned with a technique called DreamBooth. Think of this AI as a super-talented art student who has studied the real paintings.

  • The Training: They showed the AI just a few real paintings of "Artist A."
  • The Prompt: They told the AI, "Draw a painting in the style of Artist A."
  • The Result: The AI didn't copy the real paintings exactly. Instead, it learned the vibe, the brushstrokes, and the colors, and then created brand new, fake paintings that looked like they were painted by that artist.
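The prompting step above can be sketched in a few lines. This is a minimal, hypothetical sketch: DreamBooth binds the fine-tuned style to a rare identifier token, and the token shown here ("sks"), the artist names, and the function name are my own illustrative assumptions, not details from the paper.

```python
# Sketch of building DreamBooth-style generation prompts, one per
# synthetic image we want. The "sks" token is the rare identifier that
# DreamBooth fine-tuning commonly binds the learned style to (an
# assumption here, not the paper's exact prompt).
def build_style_prompts(artists, images_per_artist=100, token="sks"):
    """Return a dict mapping each artist to a list of prompts."""
    prompts = {}
    for artist in artists:
        # The prompt references the bound token plus the artist label,
        # asking for a new painting in the learned style.
        prompt = f"a painting in the style of {token} {artist}"
        prompts[artist] = [prompt] * images_per_artist
    return prompts

artist_prompts = build_style_prompts(["Artist A", "Artist B"], images_per_artist=3)
print(artist_prompts["Artist A"][0])
# → a painting in the style of sks Artist A
```

In practice each prompt would be fed to the fine-tuned Stable Diffusion model (e.g. via the Hugging Face diffusers library) to produce one image; the sketch only shows the bookkeeping around it.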

They made 100 of these "fake" paintings for each artist. Now, instead of having 10 real examples, the computer had 10 real ones plus 100 fake ones.
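The resulting hybrid training set can be pictured as a simple labelled list: each artist's few real images plus their ~100 synthetic ones, tagged by origin. The file paths below are hypothetical placeholders.

```python
# Minimal sketch of assembling the hybrid dataset the post describes:
# every sample carries its image path, its artist label, and whether it
# is a real painting or an AI-generated one.
def build_hybrid_dataset(real, synthetic):
    """real / synthetic: dicts mapping artist -> list of image paths."""
    dataset = []
    for artist, paths in real.items():
        dataset += [(p, artist, "real") for p in paths]
        dataset += [(p, artist, "synthetic") for p in synthetic.get(artist, [])]
    return dataset

real = {"Artist A": [f"real_a_{i}.jpg" for i in range(10)]}
synthetic = {"Artist A": [f"fake_a_{i}.png" for i in range(100)]}
data = build_hybrid_dataset(real, synthetic)
print(len(data))  # 10 real + 100 synthetic = 110
```

Keeping the origin tag on each sample is what makes the four experimental "recipes" below easy to run: each recipe is just a different filter over this one list.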

3. The Experiment: Mixing the Ingredients

The team ran four different tests to see which "recipe" worked best:

  • Recipe A (Real Only): The computer studied only the few real paintings. (The "Starving Student" approach).
  • Recipe B (Fake Only): The computer studied only the AI-generated paintings. (The "Imagination Only" approach).
  • Recipe C (Fake to Real): The computer studied the fake paintings but was tested on the real ones. (The "Theory vs. Practice" approach).
  • Recipe D (The Hybrid Mix): The computer studied a mix of real and fake paintings together. (The "Best of Both Worlds" approach).
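The four recipes above boil down to choosing which data sources feed the training set and which feed the test set. A sketch, assuming samples are tagged with a "real"/"synthetic" origin; the test-set side of recipes A, B, and D is inferred from the post's description, and the dict layout is my own.

```python
# The four train/test "recipes" written out as configurations.
RECIPES = {
    "A": {"train": {"real"},              "test": {"real"}},       # real only
    "B": {"train": {"synthetic"},         "test": {"synthetic"}},  # fake only
    "C": {"train": {"synthetic"},         "test": {"real"}},       # fake -> real
    "D": {"train": {"real", "synthetic"}, "test": {"real"}},       # hybrid mix
}

def select(dataset, sources):
    """Keep only samples whose origin tag is in the allowed sources."""
    return [s for s in dataset if s[2] in sources]

dataset = [("img0.jpg", "Artist A", "real"),
           ("img1.png", "Artist A", "synthetic")]
train = select(dataset, RECIPES["D"]["train"])
print(len(train))  # the hybrid recipe keeps both samples
```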

4. The Results: What Worked?

Here is what they found, using some fun analogies:

  • The "Fake Only" Surprise: When the computer studied only the AI-generated art, it actually got really good at recognizing the style! It was like a student who memorized the textbook so well they could ace the test. However, this only worked if the test was also fake.
  • The "Domain Gap" (The Reality Check): When they trained the computer on fake art and then tested it on real art, it stumbled. It's like teaching someone to drive on a video game simulator and then putting them behind the wheel of a real car on a rainy day. The real car felt different.
  • The Winner (The Hybrid Mix): The best results came from Recipe D. By mixing the few real paintings with the many fake ones, the computer learned the "rules" of the style without getting confused by the lack of data.
    • For the artists with the fewest real paintings (the ones in the most trouble), this trick was a lifesaver. Their accuracy jumped significantly.
    • For the artists who already had lots of paintings, the fake ones didn't help much. It's like adding extra water to a soup that is already perfectly seasoned; it doesn't make it better, and sometimes it makes it watery.

5. The Catch: The AI's "Bad Habits"

There was one funny glitch. The AI, when generating new paintings, kept making them look "cropped" (cut off at the edges), even though the researchers told it not to.

  • Why? Because the few real paintings they used to train the AI happened to have a lot of cut-off figures in them. The AI learned, "Oh, this artist likes to cut off the edges!" and copied that bad habit.
  • Lesson: Garbage in, garbage out. If your training data has flaws, the AI will copy those flaws.

The Big Takeaway

This paper shows that synthetic data is a powerful tool for art authentication, but it works best as a supplement, not a replacement.

Think of it like this: If you are trying to learn a song and you only have one recording, you might struggle. But if you have that one recording plus a bunch of AI-generated covers of the same song, you can finally hear the melody clearly and recognize the artist, even if the AI covers aren't perfect.

In short: When real art is rare, AI-generated art can fill the gaps, helping computers become better art detectives. But we still need the real thing to make sure the AI isn't just making things up!