Regularizing INR with a diffusion prior for self-supervised 3D reconstruction of neutron computed tomography data

This paper introduces Diffusive INR (DINR), a novel framework that regularizes implicit neural representations with a diffusion prior trained on synthetic data to achieve high-quality, artifact-reduced 3D reconstructions of concrete microstructures from sparse-view neutron computed tomography, outperforming state-of-the-art methods under extreme data limitations.

Maliha Hossain, Haley Duba-Sullivan, Amirkoushyar Ziabari

Published Thu, 12 Ma

Imagine you are trying to solve a massive, 3D jigsaw puzzle, but someone has stolen 90% of the pieces. You only have a few scattered clues left. Your goal is to figure out what the complete picture looks like.

This is exactly the challenge scientists face when using Neutron Computed Tomography (CT) to look inside objects like concrete, batteries, or fuel cells. Neutrons are great at seeing through materials that block X-rays, but they are "lazy" and slow. To get a clear picture, you usually need to take thousands of photos (views) from every angle. But sometimes, you can't wait that long, or the equipment is too weak. So, you end up with very few photos—maybe just 5 or 9 instead of thousands.

When you try to rebuild the 3D image from so few photos using old-school math, the result is a blurry, distorted mess full of "ghosts" and streaks. It's like trying to guess the shape of a cat based on a single blurry paw print.

The New Solution: "DINR" (The Smart Detective)

The authors of this paper created a new tool called DINR (Diffusive Implicit Neural Representation). Think of DINR as a super-smart detective who doesn't just look at the few clues you have; they also have a massive library of "what things usually look like" in their head.

Here is how DINR works, broken down into simple steps:

1. The Two-Brain Approach

DINR has two "brains" working together:

  • Brain A (The Architect): This is an Implicit Neural Representation (INR). Imagine a digital artist who can draw a perfect 3D object using a tiny set of instructions (mathematical weights) instead of storing millions of pixels. This artist is great at creating smooth shapes but sometimes gets confused when the clues are too few.
  • Brain B (The Art Historian): This is a Diffusion Model. Think of this as an AI that has studied millions of 3D images of concrete, rocks, and bubbles. It knows the "texture" and "rules" of how these materials naturally look. It's like an art historian who knows that concrete usually has tiny pores and cracks, not smooth, perfect spheres.
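To make "Brain A" a little more concrete, here is a toy sketch of an implicit neural representation: a tiny neural network that maps a 3D coordinate (x, y, z) to a single density value, so the whole volume is stored as a handful of weights rather than millions of voxels. Everything below (layer sizes, activation, random weights) is illustrative only, not the paper's actual architecture.

```python
import numpy as np

# Hypothetical minimal INR: a 2-layer MLP mapping 3 -> 32 -> 1.
# The weights here are random; in practice they would be trained
# so the network's output matches the measured projections.
rng = np.random.default_rng(0)
W1 = rng.normal(0, 1, (3, 32))
b1 = np.zeros(32)
W2 = rng.normal(0, 1, (32, 1))
b2 = np.zeros(1)

def inr(coords):
    """Evaluate the representation at an (N, 3) array of coordinates."""
    h = np.tanh(coords @ W1 + b1)      # hidden features per point
    return (h @ W2 + b2).squeeze(-1)   # one density value per point

# The "3D volume" is just these weights: you can query any point,
# at any resolution, without ever storing a voxel grid.
grid = np.stack(np.meshgrid(*[np.linspace(-1, 1, 4)] * 3), axis=-1).reshape(-1, 3)
densities = inr(grid)
print(densities.shape)  # (64,)
```

Because the scene is a continuous function rather than a pixel grid, the same few weights can be sampled coarsely for a quick preview or densely for a final reconstruction.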

2. The "Restoration" Process

When you give DINR a blurry, incomplete set of photos:

  1. The Architect tries to build a 3D model based only on the few photos you gave it. It's a rough draft.
  2. The Art Historian looks at that rough draft and says, "Hey, that doesn't look right. Real concrete has these tiny details. Let me fix it."
  3. They go back and forth. The Architect adjusts the shape to match the photos, and the Historian adjusts the texture to match reality.
  4. They repeat this dance until the image is sharp, detailed, and matches the few clues you have, without adding fake "ghosts."
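The back-and-forth above can be sketched as an alternating loop: a data-fidelity step pulls the volume toward the few measured projections, and a prior step nudges it toward "realistic" structure. Both functions below (`forward_project`, `prior_denoise`) are toy stand-ins invented for illustration, not DINR's actual forward model or diffusion prior.

```python
import numpy as np

rng = np.random.default_rng(1)

def forward_project(volume):
    """Toy stand-in for the CT forward model: sum along one axis."""
    return volume.sum(axis=0)

def prior_denoise(volume):
    """Toy stand-in for a diffusion-prior step: a mild pull toward
    globally plausible values (here, just the volume's mean)."""
    return 0.99 * volume + 0.01 * volume.mean()

true_volume = rng.random((8, 8, 8))
measured = forward_project(true_volume)   # the few "photos" we actually have
estimate = np.zeros_like(true_volume)     # the Architect's rough draft

for step in range(200):
    # 1. The Architect: gradient step toward matching the measurements.
    residual = forward_project(estimate) - measured
    estimate -= 0.01 * residual           # broadcasts across the summed axis
    # 2. The Historian: nudge the draft toward realistic structure.
    estimate = prior_denoise(estimate)

data_error = np.abs(forward_project(estimate) - measured).mean()
print("data mismatch:", round(float(data_error), 3))  # should end up small
```

The key design point this sketch captures is the interleaving: neither step runs to completion on its own, so the final volume both fits the sparse data and stays consistent with the prior.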

Why is this better than the old way?

  • The Old Way (FBP, Filtered Back Projection): This classic formula assumes you have photos from every angle. When most of them are missing, it has nothing to fill the gaps with, so it smears the few clues it has across the whole image. In CT, this creates terrible streaks and blurs.
  • The "Middle" Way (MBIR, Model-Based Iterative Reconstruction): This is like a very careful puzzle solver who follows strict hand-written rules (like "neighboring pixels should look similar"). It's better, but it often misses the tiny, complex details (like the tiny pores in concrete) because its rules are too rigid.
  • The DINR Way: This is the best of both worlds. It respects the few photos you have (the data) but uses its "memory" of what real materials look like (the prior) to fill in the missing gaps intelligently.

The Results: Seeing the Invisible

The researchers tested this on concrete. Concrete is full of tiny holes (pores) and cracks that are crucial for safety.

  • When they only had 4 or 5 views (extremely sparse data), the old methods produced a blurry blob where you couldn't see the holes.
  • DINR managed to reconstruct the concrete so clearly that you could see the tiny pores and the texture, almost as if they had taken thousands of photos.

The Big Picture

This is a breakthrough because it means we can scan things much faster or with weaker equipment and still get high-quality, detailed 3D images.

  • For Batteries: We could check if a battery is safe in real-time without waiting hours for a scan.
  • For Construction: We could inspect bridges or dams for hidden cracks much more easily.
  • For Science: We can study how water moves through soil or plants without destroying the sample.

In short, DINR is like giving a blurry, low-resolution photo a "magic upgrade" by teaching the computer to use its imagination (based on real-world training) to fill in the missing pieces convincingly. It turns a "good enough" guess into a "scientifically accurate" picture.