Enabling stratified sampling in high dimensions via nonlinear dimensionality reduction

This paper proposes a method for effective stratified sampling in high-dimensional spaces. It uses neural active manifolds to identify a one-dimensional latent space that captures the model's variability, allowing input partitions that align with the model's level sets and significantly reduce the variance of uncertainty propagation.

Gianluca Geraci, Daniele E. Schiavazzi, Andrea Zanoni

Published 2026-03-06

Imagine you are trying to guess the average temperature of a giant, complex ocean. You can't measure every drop of water, so you have to take samples.

The Problem: The "Curse of Dimensions"
If the ocean were just a small bathtub (low dimensions), you could easily divide it into a grid of squares and take one sample from each square. This is called Stratified Sampling. It's like cutting a cake into equal slices; you know you've covered the whole cake, and your guess will be very accurate.

But what if the ocean isn't just a 2D surface, but a 3D volume, or even a 100-dimensional "hyper-ocean" (which happens in complex computer models for weather, finance, or engineering)?
If you try to cut a 100-dimensional cake into a grid, you would need more slices than there are atoms in the universe just to get a few samples per slice. This is the Curse of Dimensionality. Traditional methods fail because you run out of time and money before you can take enough samples.
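The arithmetic behind that claim is easy to check. A minimal sketch (the choice of 10 strata per dimension is illustrative, not from the paper):

```python
# Number of grid cells needed for stratified sampling with s strata per
# dimension: cells = s ** d, which grows exponentially with dimension d.
def grid_cells(strata_per_dim: int, dimensions: int) -> int:
    return strata_per_dim ** dimensions

# In 2D, a modest 10x10 grid is only 100 cells -- easy to sample.
print(grid_cells(10, 2))  # 100

# In 100D, the same 10 strata per dimension gives 10^100 cells, far more
# than the roughly 10^80 atoms estimated in the observable universe.
print(grid_cells(10, 100) > 10**80)  # True
```

Even one sample per cell is already hopeless, which is exactly why grid-based stratification breaks down in high dimensions.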

The Old Way vs. The New Way

  • Standard Monte Carlo: This is like throwing darts blindly at the ocean map. You might hit the same spot twice and miss the cold currents entirely. Its error shrinks slowly, like 1/√N in the number of samples, so accurate answers need a huge number of darts.
  • The Paper's Solution: Instead of throwing darts blindly or trying to cut a 100D cake into a grid, the authors propose a clever trick: Find the "Spine" of the problem.
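In low dimensions, the gap between blind darts and a grid is easy to demonstrate numerically. A minimal sketch (the toy function u² and the sample sizes are illustrative choices, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda u: u**2   # toy "temperature" model; its true mean over [0, 1] is 1/3
n = 100              # samples per estimate
trials = 2000        # repeat many times to measure each estimator's spread

# Standard Monte Carlo: n uniform darts thrown blindly on [0, 1].
mc = np.array([f(rng.uniform(0, 1, n)).mean() for _ in range(trials)])

# Stratified sampling: one uniform sample inside each of n equal slices.
edges = np.arange(n) / n
strat = np.array([f(edges + rng.uniform(0, 1.0 / n, n)).mean()
                  for _ in range(trials)])

print(f"plain MC spread:    {mc.std():.5f}")
print(f"stratified spread:  {strat.std():.5f}")  # much smaller
```

Both estimators are unbiased, but forcing one sample per slice removes the clumping and gaps of blind darts, so the stratified estimates scatter far less around the true mean.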

The Core Idea: The "NeurAM" (Neural Active Manifold)
Imagine the complex ocean has a hidden, winding river running through it. The temperature changes mostly along this river, while the water in the vast, empty spaces around the river stays mostly the same.

The authors use a special AI tool called NeurAM (Neural Active Manifold) to find this hidden river.

  1. The Detective (AI): The AI looks at the complex model and learns that, despite having 100 inputs, the output (the temperature) actually only cares about one specific combination of those inputs. It's like realizing that the weather in a city depends mostly on the wind direction, even though there are 50 other sensors measuring humidity, pressure, etc.
  2. The Map: The AI compresses the entire 100-dimensional ocean down into a single, 1-dimensional line (the river).
  3. The Slice: Now, instead of trying to slice a 100D cake, you just slice this 1D line into equal pieces. This is easy!
  4. The Back-Projection: You take those slices on the line and "project" them back onto the original 100D ocean. Because the AI found the river, these slices aren't random squares; they are smart, curved shapes that wrap around the interesting parts of the model (the "level sets").
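A toy version of steps 1 through 4 can be sketched in a few lines. Here the "river" is a known linear direction w, standing in for what NeurAM would learn with a neural network (the paper's actual manifold is nonlinear and learned from data), and back-projection is done by simply binning full-dimensional samples by their latent coordinate:

```python
import numpy as np

rng = np.random.default_rng(42)
d = 100                                    # input dimension
w = np.ones(d) / np.sqrt(d)                # stand-in for the learned 1D direction
f = lambda X: np.sin(X @ w)                # model that only "cares" about t = w.x

# Steps 1-2 (the map): compress each 100D input to one latent coordinate t.
# Step 3 (the slice): cut the latent line into equal-probability pieces,
#                     using empirical quantiles of t.
# Step 4 (back-projection): bin full-dimensional samples by latent stratum.
n_strata, per_stratum = 10, 20
pool = rng.standard_normal((100_000, d))   # cheap-to-generate candidate inputs
t = pool @ w                               # latent coordinates along the "river"
edges = np.quantile(t, np.linspace(0, 1, n_strata + 1))
edges[0], edges[-1] = -np.inf, np.inf      # make the end strata catch the tails

estimates = []
for i in range(n_strata):
    in_stratum = pool[(t > edges[i]) & (t <= edges[i + 1])][:per_stratum]
    estimates.append(f(in_stratum).mean())

# Equal-probability strata -> the stratified estimate is a plain average.
print("stratified estimate:", np.mean(estimates))
```

Only 200 model evaluations are spent, but they are spread evenly along the one direction where the output actually changes, which is the entire point of slicing the river instead of the ocean.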

The Analogy: The Mountain Hike
Think of the computer model as a giant, foggy mountain range. You want to know the average height of the terrain.

  • Old Method: You try to place a grid of 100x100x100... squares over the whole mountain. You get lost in the fog and can't place enough squares.
  • New Method: You use a drone (the AI) to fly over the mountain and realize that the height changes mostly along a single, winding ridge line. The rest of the mountain is just flat or repetitive.
    • You draw a line along that ridge.
    • You cut that line into 10 equal segments.
    • You send your hikers to sample those 10 specific segments.
    • Because you sampled the "spine" of the mountain where the action is, your average height calculation is incredibly accurate, even though you only took a few samples.

Why This is a Big Deal

  1. It Scales: It works for problems with 10, 100, or even 1,000 variables. The "1D line" trick bypasses the complexity of high dimensions.
  2. It's Smart: The slices aren't rigid boxes; they bend and twist to follow the shape of the data.
  3. It Saves Money: In engineering, running a simulation can cost thousands of dollars. This method gets you the same accuracy with far fewer (and cheaper) simulations.
  4. It Works with "Cheap" Models: The paper also shows you can mix this with "low-fidelity" (cheaper, less accurate) models to get even better results, like using a sketch to guide your search for the real painting.
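The low-fidelity idea in point 4 is essentially a control variate: correct a cheap model's well-sampled average with a handful of expensive runs. A minimal sketch (the two toy models and the sample sizes are illustrative, not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(7)
f_hi = lambda x: np.exp(x)           # expensive model (the "real painting")
f_lo = lambda x: 1 + x + x**2 / 2    # cheap approximation (the "sketch")

x_few = rng.uniform(0, 1, 50)        # the few runs we can afford of f_hi
x_many = rng.uniform(0, 1, 100_000)  # f_lo is cheap, so sample it heavily

# Control-variate estimator: high-fidelity mean on the few samples, corrected
# by how far those few samples misjudge the cheap model's true average.
estimate = f_hi(x_few).mean() + (f_lo(x_many).mean() - f_lo(x_few).mean())
print("multifidelity estimate:", estimate)  # true E[e^x] on [0,1] is e-1 = 1.718...
```

Because the cheap model tracks the expensive one closely, most of the sampling noise cancels in the correction term, and 50 expensive runs deliver an accuracy that plain Monte Carlo would need thousands of runs to match.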

In a Nutshell
The authors found a way to stop fighting the complexity of high-dimensional data. Instead of trying to cover every inch of a massive, multi-dimensional space, they use AI to find the "skeleton" of the problem, slice that skeleton, and use those slices to guide their sampling. It turns an impossible task into a manageable one, saving time, money, and computational power.