Wavelet-based estimation in aggregated functional data with positive and correlated errors

This paper proposes Bayesian wavelet-based methods to estimate constituent curves from aggregated functional data. The models feature strictly positive Gamma-distributed errors and correlated AR(1) or ARFIMA error structures, and the methods' effectiveness in capturing local features is demonstrated through both simulations and real-world applications.

Alex Rodrigo dos Santos Sousa, João Victor Siqueira Rodrigues, Vitor Ribas Perrone, Raul Gomes Rocha

Published 2026-03-27

Imagine you are a detective trying to solve a mystery, but you don't have the individual clues. Instead, you only have a big, messy pile of evidence where everything is mixed together. Your job is to figure out what each individual piece looked like before it was thrown into the pile.

This is exactly the problem the authors of this paper are tackling, but instead of a crime scene, they are looking at data.

The Big Picture: The "Smoothie" Problem

In the real world, we often measure things that are actually a mix of several different things.

  • Chemistry: Imagine you have a glass of "fruit punch" (the aggregated data). You want to know exactly how much strawberry, how much orange, and how much grape juice is in it, without being able to taste them separately.
  • Electricity: Imagine looking at the total power usage of a whole city. You want to figure out the specific usage patterns of just the factories, just the schools, and just the homes, all mixed together in one big number.

The authors call this "Aggregated Functional Data." They want to reverse-engineer the "smoothie" to find the original "fruits."
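To make the "smoothie" concrete, here is a minimal sketch (not the authors' code) of the aggregated-data setup: the observed curve is a weighted sum of unknown component curves plus noise, roughly y(t) = Σ αc·fc(t) + e(t). The component curves and weights below are hypothetical toy choices.

```python
import math
import random

def component_curves(t):
    """Three toy 'ingredient' curves evaluated at time t in [0, 1]."""
    return [math.sin(2 * math.pi * t),        # smooth oscillation
            math.exp(-50 * (t - 0.5) ** 2),   # sharp localized bump
            t]                                # linear trend

def aggregate(ts, weights, noise_sd=0.05, seed=0):
    """Mix the components with known weights and add noise at each point."""
    rng = random.Random(seed)
    return [sum(w * f for w, f in zip(weights, component_curves(t)))
            + rng.gauss(0, noise_sd)
            for t in ts]

ts = [i / 128 for i in range(128)]
y = aggregate(ts, weights=[0.5, 0.3, 0.2])  # one observed "smoothie" curve
```

The estimation problem the paper tackles is the reverse direction: given y (and the mixing weights), recover the individual component curves.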

The Old Way vs. The New Way

Previously, scientists tried to solve this by treating the data like a list of numbers (multivariate statistics). But this is like trying to understand a song by looking at a spreadsheet of sound frequencies; you miss the melody.

Later, they tried using "splines" (mathematical curves that bend smoothly). This works great for smooth, gentle hills, but it fails miserably if the data has sharp spikes, sudden jumps, or weird wiggles. It's like trying to draw a jagged lightning bolt with a piece of soft clay; the clay just won't hold the sharp edges.

The Solution: The Wavelet "Zoom Lens"

The authors propose using Wavelets. Think of a wavelet as a magical zoom lens.

  • If you look at a curve from far away, you see the big picture.
  • If you zoom in, you can see the tiny, sharp details (like a sudden spike or a jagged edge).
  • Wavelets are perfect at capturing both the smooth parts and the sharp, messy parts of a curve simultaneously.
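The "zoom lens" idea can be seen in the simplest wavelet, the Haar wavelet: one transform step splits a signal into pairwise averages (the far-away view) and pairwise differences (the zoomed-in details). This is an illustrative sketch, not the transform the authors use.

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: averages give the coarse
    'big picture', differences give the fine 'detail' coefficients."""
    s = 2 ** 0.5
    half = len(signal) // 2
    approx = [(signal[2 * i] + signal[2 * i + 1]) / s for i in range(half)]
    detail = [(signal[2 * i] - signal[2 * i + 1]) / s for i in range(half)]
    return approx, detail

# A flat signal with one sharp spike at position 9.
x = [1.0] * 16
x[9] = 5.0
approx, detail = haar_step(x)
# The spike shows up as one large detail coefficient; the rest are zero,
# which is why wavelets represent sharp features so compactly.
spike_level = max(abs(d) for d in detail)
```

A smooth spline would spread that spike across many basis functions; the Haar step isolates it in a single coefficient.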

The Two Big Hurdles

The authors didn't just use wavelets; they had to overcome two specific "monsters" that usually break these mathematical models:

1. The "Strictly Positive" Monster (Gamma Errors)
In many real-world measurements (like light absorption or chemical concentrations), the "noise" or error can never be negative. You can't have "negative light."

  • The Problem: Most math models assume errors are like a bell curve (Gaussian), where mistakes can be positive or negative. When you force a wavelet model to deal with errors that must be positive, the math gets incredibly messy. The errors stop behaving nicely and start "correlating" with each other in the wavelet domain.
  • The Fix: The authors built a special Bayesian system. Imagine a super-smart detective who doesn't just guess; they use a computer to run millions of simulations (a method called MCMC) to find the most likely answer, even when the rules are weird and the errors are stubbornly positive.
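A quick sketch of the error assumption (not the authors' sampler): Gamma-distributed noise is strictly positive by construction, while bell-curve noise is not. Python's standard library exposes `random.gammavariate(alpha, beta)`, which has mean `alpha * beta`.

```python
import random

rng = random.Random(42)

# Strictly positive Gamma errors with mean 2.0 * 0.5 = 1.0.
gamma_errors = [rng.gammavariate(2.0, 0.5) for _ in range(1000)]

# Ordinary Gaussian (bell-curve) errors, centered at 0.
gauss_errors = [rng.gauss(0.0, 1.0) for _ in range(1000)]

all_positive = all(e > 0 for e in gamma_errors)   # Gamma draws never go below 0
some_negative = any(e < 0 for e in gauss_errors)  # Gaussian draws routinely do
```

This is exactly the mismatch the paper confronts: plugging strictly positive errors like these into a wavelet model breaks the usual Gaussian assumptions, which is why the authors resort to MCMC rather than closed-form shrinkage rules.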

2. The "Connected" Monster (Correlated Errors)
Sometimes, if you make a mistake at one point in time, you're likely to make a similar mistake right after.

  • The Problem: This is like a ripple in a pond; one wave causes the next. Standard wavelet methods assume every point is independent. When points are connected (like in AR(1) or ARFIMA processes), the standard "shrinkage" (filtering out the noise) doesn't work correctly.
  • The Fix: They developed a strategy that adjusts the "zoom lens" differently depending on how "connected" the data is. They treat the noise at different levels of detail differently, ensuring they don't accidentally smooth out important signals while trying to remove the noise.
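The "ripple" can be sketched with the simplest correlated process the paper considers, AR(1): each error keeps a fraction phi of the previous one, e_t = phi·e_{t-1} + z_t. The code below is illustrative, not the authors' implementation.

```python
import random

def ar1(n, phi=0.8, sd=1.0, seed=1):
    """Simulate an AR(1) error process: each error 'remembers' phi of the last."""
    rng = random.Random(seed)
    e = [rng.gauss(0, sd)]
    for _ in range(n - 1):
        e.append(phi * e[-1] + rng.gauss(0, sd))
    return e

def lag1_corr(x):
    """Sample correlation between consecutive points (lag-1 autocorrelation)."""
    m = sum(x) / len(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den

errors = ar1(5000, phi=0.8)
rho = lag1_corr(errors)  # close to phi; independent noise would give roughly 0
```

A standard wavelet shrinkage rule calibrated for independent noise would misjudge how much of `errors` is signal, which is why the paper adapts the shrinkage level by level.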

How They Tested It

To prove their method works, they ran a massive simulation lab:

  • They created fake "smoothies" (aggregated data) using famous test shapes (like "Bumps," "Blocks," and "Doppler" waves).
  • They added different types of "messy noise" (some positive, some connected).
  • They tried to separate the ingredients.
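One of those famous test shapes can be sketched directly: the Donoho-Johnstone "Blocks" signal, a piecewise-constant curve full of sharp jumps. The knot locations and jump heights below are the commonly quoted values; treat them as illustrative rather than the paper's exact configuration.

```python
# Commonly quoted knots and jump heights for the "Blocks" test signal.
KNOTS = [0.10, 0.13, 0.15, 0.23, 0.25, 0.40, 0.44, 0.65, 0.76, 0.78, 0.81]
HEIGHTS = [4.0, -5.0, 3.0, -4.0, 5.0, -4.2, 2.1, 4.3, -3.1, 2.1, -4.2]

def blocks(t):
    """Piecewise-constant 'Blocks' value at t: each passed knot adds a jump."""
    return sum(h for knot, h in zip(KNOTS, HEIGHTS) if t >= knot)

signal = [blocks(i / 1024) for i in range(1024)]
# Splines blur these jumps; wavelets capture each one with a few coefficients,
# which is why shapes like this are the standard stress test.
```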

The Results:

  • Their method was very good at finding the sharp edges and spikes that other methods missed.
  • It handled the "positive-only" noise better than anyone else had before.
  • Even when the noise was "connected" (correlated), the method remained stable, only getting slightly less accurate, but never breaking down.
  • It performed slightly better than the current "gold standard" method (Johnstone and Silverman), especially in the hardest scenarios.

The Takeaway

This paper is like giving a detective a new, super-powered toolkit. They can now take a messy, mixed-up signal that contains sharp spikes and weird, non-standard noise, and successfully separate it back into its original, clean components. This is a huge step forward for fields like chemistry, spectroscopy, and energy monitoring, where understanding the individual parts of a mixture is crucial.