stat papers | Gist.Science

Sampling via Stochastic Interpolants by Langevin-based Velocity and Initialization Estimation in Flow ODEs

This paper proposes a novel sampling method for unnormalized Boltzmann densities that leverages a sequence of Langevin samplers to efficiently generate intermediate samples and robustly estimate the velocity field of a probability flow ODE derived from linear stochastic interpolants, offering theoretical convergence guarantees and demonstrating effectiveness in high-dimensional multimodal distributions and Bayesian inference tasks.

Chenguang Duan, Yuling Jiao, Gabriele Steidl, Christian Wald, Jerry Zhijian Yang, Ruizhe Zhang | Thu, 12 Ma | stat

The Bayesian Geometry of Transformer Attention

This paper introduces "Bayesian wind tunnels" to rigorously demonstrate that small transformers perform exact Bayesian inference through a specific geometric mechanism involving residual streams, feed-forward updates, and attention-based routing, thereby establishing a clear architectural advantage over flat models and providing a mechanistic foundation for understanding reasoning in large language models.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra | Thu, 12 Ma | stat

Absolute indices for determining compactness, separability and number of clusters

This paper introduces novel absolute cluster indices based on defined compactness functions and neighboring point sets to objectively determine cluster compactness, separability, and the true number of clusters, demonstrating their effectiveness across synthetic and real-world datasets compared to existing relative validity indices.

Adil M. Bagirov, Ramiz M. Aliguliyev, Nargiz Sultanova, Sona Taheri | Thu, 12 Ma | stat

An Algorithm to perform Covariance-Adjusted Support Vector Classification in Non-Euclidean Spaces

This paper proposes a Cholesky-decomposition-based algorithm for Covariance-Adjusted Support Vector Classification that overcomes the limitations of traditional max-margin SVMs in non-Euclidean spaces by incorporating class covariance structures into the optimization problem, resulting in superior classification performance across multiple datasets.

Satyajeet Sahoo, Jhareswar Maiti | Thu, 12 Ma | stat

Optimal Transport Aggregation for Distributed Mixture-of-Experts

This paper proposes a communication-efficient distributed learning framework that uses optimal transport to aggregate locally trained Mixture-of-Experts models into a global estimator, achieving performance comparable to centralized training while preserving model structure and minimizing communication costs.

Faïcel Chamroukhi, Nhat Thien Pham | Thu, 12 Ma | stat

Disjunctive Branch-and-Bound for Certifiably Optimal Low-Rank Matrix Completion

This paper introduces a disjunctive branch-and-bound framework with novel convex relaxations to solve low-rank matrix completion problems to certifiable optimality, significantly reducing both optimality gaps and test set errors compared to existing heuristic methods.

Dimitris Bertsimas, Ryan Cory-Wright, Sean Lo, Jean Pauphilet | Thu, 12 Ma | stat

The Inverse Problem for Single Trajectories of Rough Differential Equations

This paper establishes a rigorous framework and a convergent numerical algorithm for solving the continuous inverse problem of reconstructing a geometric $p$-rough path from a discrete observed trajectory by formulating it as a limit of discrete inverse problems driven by piecewise linear paths.

Thomas Morrish, Theodore Papamarkou, Anastasia Papavasiliou, Yang Zhao | Thu, 12 Ma | stat

Insights into the Relationship Between D- and A-optimal Designs

This paper establishes that the A-optimality criterion can be factored into a D-optimal scale term and a dimensionless sphericity factor dependent on eigenvalue dispersion, thereby explaining performance differences among D-equivalent designs and providing a lightweight method for refining candidate pools.

Andrew T. Karl, Bradley Jones | Thu, 12 Ma | stat
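
One way to read the factorization described in this summary is the identity trace(M^{-1}) = p * det(M)^{-1/p} * (AM/GM of the eigenvalues of M^{-1}), where M is a p x p information matrix. The sketch below checks that identity numerically. It is an illustration under that assumption, not necessarily the exact form used in the paper; the function name `a_criterion_factors` and the example eigenvalues are hypothetical.

```python
import math

def a_criterion_factors(eigenvalues):
    """Split trace(M^{-1}) into a D-type scale term and a sphericity factor.

    `eigenvalues` are the (positive) eigenvalues of the information matrix M.
    Assumed identity: trace(M^{-1}) = scale * sphericity.
    """
    p = len(eigenvalues)
    inv = [1.0 / lam for lam in eigenvalues]                 # eigenvalues of M^{-1}
    geo_mean = math.exp(sum(math.log(v) for v in inv) / p)   # det(M)^(-1/p)
    arith_mean = sum(inv) / p                                # trace(M^{-1}) / p
    scale = p * geo_mean                  # depends on M only through det(M)
    sphericity = arith_mean / geo_mean    # AM/GM ratio, always >= 1
    return scale, sphericity

# Two hypothetical designs with the same determinant (64), so D-equivalent.
balanced = [4.0, 4.0, 4.0]
skewed = [1.0, 4.0, 16.0]

for name, eig in [("balanced", balanced), ("skewed", skewed)]:
    scale, sphericity = a_criterion_factors(eig)
    a_value = sum(1.0 / lam for lam in eig)   # trace(M^{-1})
    print(name, scale, sphericity, a_value)
```

Both designs share the same scale term because their determinants match, so any difference in the A-criterion comes from the sphericity factor, which equals 1 only when all eigenvalues coincide.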

Online LLM watermark detection via e-processes

This paper introduces a unified framework for online LLM watermark detection based on e-processes, which provides anytime-valid statistical guarantees and enhances detection power through empirically adaptive methods applicable to various sequential testing problems.

Weijie Su, Ruodu Wang, Zinan Zhao | Thu, 12 Ma | stat
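
For readers unfamiliar with e-processes, the following is a minimal generic sketch of anytime-valid sequential testing, not the paper's detection method. It assumes a hypothetical binary per-token score that equals 1 with probability `p0` under the null (no watermark) and bets on an alternative rate `p1`. The running product of likelihood ratios is a nonnegative martingale under the null, so by Ville's inequality rejecting once it reaches 1/alpha keeps the false-alarm probability at most alpha at any stopping time. All names and parameter values are illustrative.

```python
def e_process_detect(scores, p0=0.5, p1=0.7, alpha=0.01):
    """Return (stopping_time, wealth); stopping_time is None if never rejected.

    scores: iterable of 0/1 values observed one token at a time (hypothetical).
    p0: assumed probability of a 1 under the null hypothesis.
    p1: alternative probability used to build the likelihood-ratio bet.
    """
    wealth = 1.0  # the e-process starts at 1
    for t, x in enumerate(scores, start=1):
        # Multiply by the likelihood ratio of the new observation.
        wealth *= (p1 / p0) if x else ((1 - p1) / (1 - p0))
        if wealth >= 1.0 / alpha:   # Ville's inequality threshold
            return t, wealth
    return None, wealth

# A stream of all ones crosses the threshold quickly; an alternating stream does not.
print(e_process_detect([1] * 50))
print(e_process_detect([1, 0] * 50))
```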

Universality of General Spiked Tensor Models

This paper establishes the universality of high-dimensional spectral behavior and statistical limits for asymmetric rank-one spiked tensor models with non-Gaussian noise, demonstrating that the maximum-likelihood estimator's performance matches the Gaussian case under finite fourth-moment assumptions through a combination of resolvent methods, cumulant expansions, and variance bounds.

Yanjin Xiang, Zhihua Zhang | Thu, 12 Ma | stat

Emergence of Distortions in High-Dimensional Guided Diffusion Models

This paper formalizes the loss of diversity in classifier-free guidance as "generative distortion," characterizes its emergence via a high-dimensional phase transition using statistical physics tools, and proposes a novel guidance schedule with a negative-guidance window to mitigate variance shrinkage while preserving class separability.

Enrico Ventura, Beatrice Achilli, Luca Ambrogioni, Carlo Lucibello | Thu, 12 Ma | stat

Singular Bayesian Neural Networks

This paper proposes Singular Bayesian Neural Networks, which parameterize weights as low-rank matrices to induce a singular posterior that captures structured correlations, thereby achieving competitive predictive performance and improved uncertainty calibration with significantly fewer parameters and tighter generalization bounds compared to standard mean-field approaches.

Mame Diarra Toure, David A. Stephens | Thu, 12 Ma | stat

Error Analysis of Bayesian Inverse Problems with Generative Priors

This paper presents a quantitative error analysis for Bayesian inverse problems utilizing generative priors, demonstrating that posterior errors inherit the convergence rates of the underlying Wasserstein-2 generative model and validating these theoretical findings through numerical experiments on benchmarks and an elliptic PDE inverse problem.

Bamdad Hosseini, Ziqi Huang | Thu, 12 Ma | stat

Transfer learning for functional linear regression via control variates

This paper proposes a control-variates-based transfer learning approach for scalar-on-function regression that utilizes dataset-specific summary statistics to preserve privacy, establishes a theoretical equivalence between offset and control-variates methods, and derives convergence rates that account for discretization errors and cross-dataset covariance similarities.

Yuping Yang, Zhiyang Zhou | Thu, 12 Ma | stat

Gradient Dynamics of Attention: How Cross-Entropy Sculpts Bayesian Manifolds

This paper provides a first-order analysis demonstrating that cross-entropy training in transformers induces a coupled, EM-like specialization of attention routing and value updates, which sculpts internal geometric manifolds that enable precise Bayesian probabilistic reasoning.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra | Thu, 12 Ma | stat

Maximum Risk Minimization with Random Forests

This paper introduces computationally efficient and statistically consistent Random Forest variants based on the Maximum Risk Minimization (MaxRM) principle to improve out-of-distribution generalization across multiple environments, offering novel guarantees for risks including mean squared error, negative reward, and regret.

Francesco Freni, Anya Fries, Linus Kühne, Markus Reichstein, Jonas Peters | Thu, 12 Ma | stat

Community detection in heterogeneous signed networks

This paper proposes a signed block $\beta$-model that simultaneously captures strong and weak balance in heterogeneous signed networks, establishing its identifiability, developing an efficient optimization algorithm, and proving asymptotic consistency for both probability estimation and community detection.

Yuwen Wang, Shiwen Ye, Jingnan Zhang, Junhui Wang | Thu, 12 Ma | stat

Empirical Orlicz norms

This paper establishes a law of large numbers for empirical Orlicz norms under minimal assumptions and investigates their central limit behavior, revealing that while standard convergence rates hold under specific conditions, canonical cases like the sub-Gaussian norm of normal variables exhibit nonstandard $n^{1/4}$ rates with stable limits, and the general class of distributions with bounded Orlicz norms admits no uniform rate of convergence.

Fabian Mies | Thu, 12 Ma | stat
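
As a concrete illustration of the object studied here, the sketch below computes an empirical sub-Gaussian (psi_2) Orlicz norm, assuming the standard definition: the smallest c > 0 such that the sample mean of exp((x/c)^2) - 1 is at most 1. The bisection routine and the name `empirical_psi2_norm` are illustrative choices, not taken from the paper.

```python
import math

def empirical_psi2_norm(xs, tol=1e-12):
    """Smallest c > 0 with mean(exp((x/c)^2)) <= 2 over the sample xs."""
    n = len(xs)
    m = max(abs(x) for x in xs)
    if m == 0.0:
        return 0.0  # an all-zero sample has norm zero

    def excess(c):
        # Positive when c is too small to satisfy the Orlicz constraint.
        return sum(math.exp((x / c) ** 2) for x in xs) / n - 2.0

    # Bracket the solution: at `hi` every term is <= 2, so the constraint holds;
    # at `lo` the largest term alone contributes 2n/n = 2, so the mean is >= 2.
    lo = m / math.sqrt(math.log(2 * n))
    hi = m / math.sqrt(math.log(2))
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi

sample = [0.3, -1.2, 0.8, 2.1, -0.5]
print(empirical_psi2_norm(sample))
```

Because the constraint function is decreasing in c, bisection on the bracket is enough, and the bracket also keeps the exponent bounded by log(2n), which avoids overflow.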

Nonparametric bounds for vaccine effects in randomized trials

This paper relaxes the strong assumption of no unmeasured confounding between infection risk and vaccine belief in blinded randomized trials to derive nonparametric causal bounds for vaccine efficacy using linear programming and monotonicity-based methods, demonstrating their application through synthetic and semi-synthetic data.

Rachel Axelrod, Uri Obolski, Daniel Nevo | Thu, 12 Ma | stat

Robust evaluation of treatment effects in longitudinal studies with truncation by death or other intercurrent events

This paper proposes the Pairwise Last Observation Time (PLOT) estimand, a novel, assumption-free causal inference method that robustly evaluates treatment effects in longitudinal studies by comparing matched individuals at their last common observation time before intercurrent events, thereby avoiding the unverifiable assumptions and sensitivity issues inherent in existing frameworks.

Georgi Baklicharov, Kelly Van Lancker, Stijn Vansteelandt | Thu, 12 Ma | stat
