Unmixing microinfrared spectroscopic images of cross-sections of historical oil paintings

Imagine you have a very old, precious painting, like the famous Ghent Altarpiece. Over hundreds of years, the paint has cracked, faded, and changed. To understand how it was made and how to conserve it, scientists take a tiny, microscopic slice of the painting (a "cross-section"). This slice is like a layered sandwich: the top layer might be varnish, the middle layers paint, and the bottom layer the ground applied to the wooden panel.

Inside this tiny slice, there are dozens of different ingredients mixed together: pigments, oils, binders, and even new chemicals formed by age (like metal soaps, which form when pigments react with the oil).

The Problem: The "Smoothie" Mystery

Scientists use a special machine called ATR-µFTIR to look at this slice. This machine shines infrared light on every single tiny dot (pixel) of the slice and records a "fingerprint" (a spectrum) of what's there.

However, there's a catch. In a real painting, the layers are messy. At any single dot, the machine doesn't just see "Red Paint." It sees a smoothie made of Red Paint, Blue Paint, Oil, and maybe some dust or water vapor from the air.

The machine records this messy mix as a single, complex signal. The goal of the scientists is to "unmix" this smoothie. They want to figure out:

What were the original ingredients? (The Endmembers)
Where exactly are they located in the slice? (The Abundance Maps)

Traditionally, experts had to do this by hand, comparing the messy signals to a library of known chemicals. It was slow, subjective, and often missed things.

The Solution: A Smart AI Detective

The authors of this paper built a new AI tool called FTIR-unmixer. Think of it as a super-smart detective that can look at the messy smoothie and automatically separate the ingredients.

Here is how it works, using simple analogies:

1. The "Patch" Strategy (Looking at Neighborhoods)

Instead of looking at one tiny dot in isolation, the AI looks at a small neighborhood of dots (a 5x5 patch) at a time.

Analogy: Imagine trying to identify a person in a crowd. If you look at one face, it's hard. But if you look at the whole group and see who is standing next to whom, it becomes much easier to spot patterns. The AI uses this "neighborhood" logic to understand that if a certain chemical is here, it's likely nearby too.

2. The "Autoencoder" (The Compression Game)

The AI uses a neural network called an Autoencoder.

Analogy: Imagine you have a huge, messy suitcase full of clothes (the data). You want to pack it into a tiny, neat box (the ingredients).
- The Encoder squishes the messy suitcase down to figure out the essential items inside (the ingredients).
- The Decoder tries to unpack those items to rebuild the suitcase.
- If the AI can rebuild the suitcase closely, that is a good sign it found the right ingredients.

3. The "Weighted Spectral Angle" (Ignoring the Noise)

This is the paper's biggest innovation. The data from the machine is full of "noise."

The Problem: Sometimes, the machine picks up signals from the air (like Carbon Dioxide or water vapor) or glitches in the sensor. These are like static on a radio. If the AI listens to the static, it might think the static is a new ingredient, which ruins the analysis.
The Fix: The authors created a special rule called WSAD (Weighted Spectral Angle Distance).
- Analogy: Imagine you are trying to hear a conversation in a noisy room. You naturally tune out the sound of the air conditioner or the traffic outside because you know they aren't part of the conversation.
- The AI does the same thing. It automatically checks every "channel" of data. If a channel looks weird, spiky, or too flat (like a channel dominated by gases in the air), the AI turns down the volume on that channel. It only listens closely to the channels that look like real chemical signals.

The Result: Cleaning Up the Altarpiece

The team tested this on a real slice from the Ghent Altarpiece.

Without the new rule (Standard AI): The AI got confused by carbon dioxide (CO2) in the air and treated it as a real part of the painting. The map of where the "metal soap" (a degradation product) was located looked a bit scattered and messy.
With the new rule (WSAD): The AI largely ignored the CO2 signal. The map of the metal soap became much clearer and matched what experts expected. It successfully separated the "ingredients" (like proteins, metal soaps, and calcium oxalates) from the background noise.

Why This Matters

This isn't just about math; it's about saving history.

Speed: It does in minutes what used to take experts hours or days.
Accuracy: It reduces human bias and is less easily tricked by signals from the air or machine glitches.
Discovery: It can find hidden layers or tiny amounts of chemicals that a human eye might miss, helping conservators understand exactly how to restore these priceless artworks without damaging them.

In short, the authors built a smart, noise-canceling AI that can look at a messy, ancient paint slice and separate its ingredients more cleanly than a standard approach, helping us understand and preserve our cultural heritage.

Here is a detailed technical summary of the paper "Unmixing microinfrared spectroscopic images of cross-sections of historical oil paintings."

1. Problem Statement

The paper addresses the challenge of interpreting Attenuated Total Reflection Fourier Transform Infrared (ATR-µFTIR) hyperspectral images (HSI) of historical oil painting cross-sections.

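Context: ATR-µFTIR is a standard technique in heritage science for analyzing the stratigraphy (layer structure) of paintings. It is micro-invasive rather than non-invasive, since it is performed on a tiny cross-section sample removed from the painting. It generates a hyperspectral cube in which each pixel contains a spectrum (often >1500 bands).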
Challenges:
- Complexity: Samples are heterogeneous, multi-layered, and degraded. Pixel spectra are mixtures of many species (pigments, binders, varnishes, degradation products like metal soaps).
- Noise and Artifacts: The data suffers from significant acquisition noise and atmospheric interference (e.g., $H_2O$ and $CO_2$ absorption) which creates spurious peaks.
- Limitations of Current Methods: Traditional analysis relies on manual comparison with reference libraries, which is slow, subjective, and difficult to scale. Existing automated unmixing algorithms (often designed for VIS-NIR reflectance) fail to handle the specific noise profiles, high dimensionality, and artifact-prone nature of ATR-µFTIR data. Treating all spectral bands equally in training causes noisy regions to bias the estimation of endmembers (pure material spectra) and their abundances.

2. Methodology

The authors propose FTIR-unmixer, an unsupervised deep learning framework based on a Convolutional Neural Network (CNN) Autoencoder designed specifically for blind spectral unmixing of ATR-µFTIR data.

A. Linear Mixing Model (LMM) with Patch-Based Modeling

Assumption: The data adheres to the Linear Mixing Model (LMM), motivated by the Beer-Lambert law: absorbance is additive across species, so an observed spectrum is modeled as a linear combination of pure endmember spectra weighted by their abundances.
Patch-wise Approach: Instead of treating pixels independently, the method processes $p \times p$ spatial patches. This exploits spectral-spatial regularities, assuming that neighboring pixels share similar material compositions.
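The two ideas above can be illustrated with a small synthetic example. This is a minimal sketch, not the authors' code: the toy sizes, the random endmembers, the Dirichlet-sampled abundances, and the noise level are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

rng = np.random.default_rng(0)

# Toy sizes (assumed); the tiles described in the paper are 64 x 64 x 1504.
H, W, B, K, p = 8, 8, 50, 3, 5

# Hypothetical non-negative endmember spectra, one row per material (K x B).
E = rng.random((K, B))

# Hypothetical abundances: non-negative and summing to one at every pixel.
A = rng.dirichlet(np.ones(K), size=(H, W))            # shape (H, W, K)

# Linear mixing model: each pixel spectrum is an abundance-weighted sum of
# endmember spectra, plus a small additive noise term.
cube = A @ E + 0.01 * rng.standard_normal((H, W, B))  # shape (H, W, B)

# All overlapping p x p spatial patches; each keeps the full spectral axis.
windows = sliding_window_view(cube, (p, p), axis=(0, 1))  # (H-p+1, W-p+1, B, p, p)
patches = windows.transpose(0, 1, 3, 4, 2).reshape(-1, p, p, B)

print(cube.shape, patches.shape)
```

Each row of `patches` is one spatial neighborhood with its spectra intact, which is the kind of input a patch-wise encoder would consume.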
Architecture:
- Encoder: A CNN that takes a spectral patch as input and outputs an abundance map ( $A_c$ ). It uses convolutional layers, batch normalization, dropout, and a scaled softmax activation to enforce Abundance Non-negativity (ANC) and Abundance Sum-to-One (ASC) constraints.
- Decoder: A linear CNN layer that reconstructs the input patch from the abundance maps. The weights of this decoder are interpreted as the Endmember Matrix ( $E$ ). The model enforces Endmember Non-negativity (ENC) via a softplus activation function.
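How the activations enforce the three constraints can be sketched in plain NumPy. This is an illustrative forward pass only, under assumptions: the random arrays stand in for the encoder output and the raw decoder weights, and the `scale` factor is one hypothetical reading of "scaled softmax", not a value from the paper.

```python
import numpy as np

def scaled_softmax(z, scale=3.0):
    """Softmax over the last axis; `scale` is a hypothetical sharpness factor."""
    z = scale * z
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def softplus(x):
    """Smooth, strictly positive map log(1 + exp(x)), computed stably."""
    return np.logaddexp(0.0, x)

rng = np.random.default_rng(1)
n_pixels, K, B = 25, 4, 60                   # toy sizes (assumed)

logits = rng.standard_normal((n_pixels, K))  # stand-in for the encoder's raw output
raw_weights = rng.standard_normal((K, B))    # stand-in for unconstrained decoder weights

A = scaled_softmax(logits)   # abundances: ANC and ASC hold by construction
E = softplus(raw_weights)    # endmembers: ENC holds by construction
X_hat = A @ E                # linear decoder: reconstructed spectra (n_pixels x B)
```

Because the constraints are built into the activations, no projection or penalty term is needed to keep abundances and endmembers physically plausible during training.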

B. Key Innovation: Weighted Spectral Angle Distance (WSAD)

To address the issue of unreliable spectral bands (noise, artifacts), the authors introduce a novel loss function, WSAD, which replaces the standard Spectral Angle Distance (SAD).

Concept: WSAD applies element-wise weights ( $w$ ) to the spectra before calculating the angle, effectively down-weighting noisy or uninformative bands during training.
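A minimal sketch of such a weighted angle follows. It assumes the weights multiply both spectra element-wise before the usual angle is taken, which matches the description above but may differ in detail from the paper's exact formula; the toy spectrum, the spike, and the weight value 0.05 are illustrative.

```python
import numpy as np

def wsad(x, y, w, eps=1e-12):
    """Angle in radians between the weighted spectra w*x and w*y (last axis)."""
    xw, yw = w * x, w * y
    dot = (xw * yw).sum(axis=-1)
    norms = np.linalg.norm(xw, axis=-1) * np.linalg.norm(yw, axis=-1)
    cos = dot / (norms + eps)
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Toy example: a smooth spectrum and a copy with one corrupted band.
x = np.linspace(1.0, 2.0, 20)
y = x.copy()
y[7] += 5.0                    # simulated artifact in band 7

w_equal = np.ones_like(x)      # equal weights reduce WSAD to the standard SAD
w_down = w_equal.copy()
w_down[7] = 0.05               # down-weight the corrupted band

sad_value = wsad(x, y, w_equal)
wsad_value = wsad(x, y, w_down)
print(sad_value, wsad_value)   # the weighted angle is the smaller of the two
```

With equal weights the artifact dominates the angle; with the band down-weighted, the two spectra are judged mainly on the bands that carry chemical information.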
Automatic Weight Estimation: The weights are derived directly from the data using three statistical measures to identify "bad" bands:
1. Spatial Flatness: Bands with unusually low spatial variance (indicating common-mode atmospheric effects rather than material variation) are down-weighted.
2. Neighbour Agreement: Bands that show low correlation with adjacent spectral neighbors (indicating isolated spikes or glitches) are penalized.
3. Spectral Roughness: Bands with high-frequency curvature (spikes) in the median spectrum are suppressed.
Mechanism: These diagnostics are converted into outlier scores and mapped to weights via a sigmoid function. Unreliable bands receive weights close to a minimum floor ( $w_{min}$ ), while informative bands receive weights close to 1.
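The pipeline described above (three diagnostics, outlier scores, sigmoid, floor) might look roughly like the sketch below. Every specific here is an assumption made for illustration: the median/MAD z-score, the log-variance flatness measure, taking the maximum of the three scores, and the parameters `w_min`, `z0`, and `slope` are not taken from the paper.

```python
import numpy as np

def robust_z(v, eps=1e-12):
    """Median/MAD z-score; large positive values mark outliers."""
    med = np.median(v)
    mad = np.median(np.abs(v - med)) + eps
    return (v - med) / (1.4826 * mad)

def band_weights(X, w_min=0.1, z0=3.0, slope=1.0):
    """X: (n_pixels, n_bands). Returns one weight per band in [w_min, 1]."""
    # 1. Spatial flatness: unusually low variance across pixels scores high.
    flat = robust_z(-np.log(X.var(axis=0) + 1e-12))

    # 2. Neighbour agreement: correlation with the adjacent bands; a band is
    #    scored by its better neighbour, so only isolated bands score high.
    Xc = X - X.mean(axis=0)
    Xn = Xc / (np.linalg.norm(Xc, axis=0) + 1e-12)
    corr = (Xn[:, :-1] * Xn[:, 1:]).sum(axis=0)      # corr(band b, band b+1)
    left = np.r_[corr[0], corr]
    right = np.r_[corr, corr[-1]]
    disagree = robust_z(-np.maximum(left, right))

    # 3. Spectral roughness: curvature (second difference) of the median spectrum.
    med = np.median(X, axis=0)
    curv = np.abs(np.diff(med, n=2))
    rough = robust_z(np.r_[curv[0], curv, curv[-1]])

    # Worst of the three outlier scores, mapped through a sigmoid with a floor.
    score = np.maximum.reduce([flat, disagree, rough])
    t = np.clip(slope * (score - z0), -50.0, 50.0)   # clip to avoid overflow in exp
    return w_min + (1.0 - w_min) / (1.0 + np.exp(t))

# Toy data: two smooth components plus one simulated artifact band.
rng = np.random.default_rng(0)
n, B = 400, 40
b = np.arange(B)
E1 = np.exp(-((b - 12) / 6.0) ** 2)
E2 = np.exp(-((b - 28) / 6.0) ** 2)
a = rng.random((n, 1))
X = a * E1 + (1 - a) * E2 + 0.01 * rng.standard_normal((n, B))
X[:, 20] = 3.0 + 0.5 * rng.standard_normal(n)        # spike unrelated to its neighbours

w = band_weights(X)
print(np.round(w, 2))
```

In this toy case the artifact band is pushed to the floor `w_min`, mainly through the roughness score. Note that the bands on either side of a spike also show high curvature, so they are down-weighted along with it.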

3. Key Contributions

First Automated Unmixing for ATR-µFTIR Cross-Sections: According to the authors, this is the first work proposing an automated, deep learning-based unmixing method specifically tailored for the high-dimensional, noisy data of painting cross-sections.
WSAD Loss Function: The introduction of a data-driven, weighted loss function that automatically identifies and suppresses the influence of atmospheric artifacts (like $CO_2$ ) and acquisition noise without manual filtering.
Patch-Based CNN Autoencoder: A novel architecture that leverages local spatial structure to improve the estimation of abundance maps compared to pixel-wise approaches.
Robustness to Unknown Endmember Count: The method includes a strategy to iteratively determine the optimal number of endmembers ( $K$ ) by analyzing map separation and duplication.
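One way to support the duplication check when choosing $K$ is to flag pairs of abundance maps that are nearly identical, which suggests that $K$ is too large. The helper below is a hypothetical aid for that inspection, not the authors' procedure; the function name, the use of Pearson correlation, and the threshold of 0.9 are assumptions.

```python
import numpy as np

def duplicate_pairs(maps, threshold=0.9):
    """maps: (K, H, W) abundance maps.

    Returns index pairs (i, j), i < j, whose maps have a Pearson
    correlation above `threshold` (an assumed cut-off).
    """
    K = maps.shape[0]
    corr = np.corrcoef(maps.reshape(K, -1))
    return [(i, j) for i in range(K) for j in range(i + 1, K)
            if corr[i, j] > threshold]

# Toy example: map 1 is a noisy copy of map 0, map 2 is unrelated.
rng = np.random.default_rng(0)
base = rng.random((16, 16))
maps = np.stack([
    base,
    base + 0.01 * rng.standard_normal((16, 16)),
    rng.random((16, 16)),
])

pairs = duplicate_pairs(maps)
print(pairs)
```

A non-empty result would be a prompt to retrain with a smaller $K$ and inspect the maps again, while poorly separated maps with no duplicates would point the other way.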

4. Experimental Results

Dataset: The method was tested on ATR-µFTIR cross-sections from the Ghent Altarpiece (attributed to the Van Eyck brothers). The dataset consisted of concatenated $64 \times 64 \times 1504$ tiles.
Setup: The model was trained for 500 epochs using an Adam optimizer. The number of endmembers was determined to be $K=10$ through an iterative inspection of abundance maps.
Comparison: The proposed WSAD variant was compared against a standard SAD baseline (where all bands are weighted equally).
Findings:
- Chemical Identification: Both methods successfully identified key components: proteins, metal soaps, and calcium oxalates.
- Artifact Suppression: The WSAD method significantly reduced the influence of $CO_2$ contamination (peaks around $2350 \text{ cm}^{-1}$) in the estimated endmember spectra. In the SAD baseline, these artifacts appeared as prominent features in the reconstructed spectra.
- Spatial Coherence: The abundance maps generated by WSAD showed improved spatial coherence, particularly for metal soaps, appearing more uniform and consistent with reference qualitative maps compared to the SAD baseline.
- Interpretability: The WSAD approach successfully separated minor constituents and background resin while minimizing the "leakage" of noise into the chemical signatures.
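The artifact-suppression finding could be quantified with a simple check on each estimated endmember: how far the spectrum rises above a straight baseline inside the $CO_2$ window. This is a sketch under assumptions, not a metric from the paper: the function name, the 2300 to 2400 $\text{cm}^{-1}$ window, the ascending 750 to 4000 $\text{cm}^{-1}$ axis, and the synthetic spectra are all illustrative.

```python
import numpy as np

def co2_residual(spectrum, wavenumbers, window=(2300.0, 2400.0)):
    """Largest rise above a straight baseline inside `window`,
    relative to the spectrum's maximum. Assumes ascending wavenumbers."""
    idx = np.flatnonzero((wavenumbers >= window[0]) & (wavenumbers <= window[1]))
    lo, hi = idx[0], idx[-1]
    baseline = np.interp(
        wavenumbers[idx],
        [wavenumbers[lo], wavenumbers[hi]],
        [spectrum[lo], spectrum[hi]],
    )
    return float(np.max(spectrum[idx] - baseline) / np.max(spectrum))

# Assumed spectral axis with 1504 bands, matching the band count in the dataset.
wn = np.linspace(750.0, 4000.0, 1504)

# Synthetic endmembers: one clean band near 1650, one with an added CO2-like peak.
clean = np.exp(-((wn - 1650.0) / 40.0) ** 2)
contaminated = clean + 0.3 * np.exp(-((wn - 2350.0) / 15.0) ** 2)

print(co2_residual(clean, wn), co2_residual(contaminated, wn))
```

Comparing this score between SAD-trained and WSAD-trained endmembers would give a single number per endmember for how much of the $CO_2$ feature leaked into it.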

5. Significance

This work represents a significant advancement in heritage science and computational spectroscopy.

Automated Analysis: It enables more accurate, automated, and scalable analysis of cross-sections from fragile historical artifacts without extensive manual intervention.
Methodological Transfer: It bridges the gap between remote sensing hyperspectral unmixing techniques and the specific, challenging requirements of cultural heritage materials (which differ significantly from natural landscapes or standard reflectance data).
Future Impact: By providing a robust, unsupervised tool for blind unmixing, this method allows conservators and art historians to rapidly map complex chemical distributions (e.g., degradation products vs. original pigments), aiding in better conservation strategies and historical understanding.