Why the Brain Consolidates: Predictive Forgetting for Optimal Generalisation

This paper proposes that memory consolidation serves a computational role beyond mere stabilization, utilizing "predictive forgetting" to compress stored representations into a form that optimizes generalization by selectively retaining information that predicts future outcomes, a process necessitated by high-capacity encoding constraints and validated through simulations across diverse neural and transformer models.

Zafeirios Fountas, Adnan Oomerjee, Haitham Bou-Ammar + 2 more · 2026-03-06 · 💻 cs

Distributional Equivalence in Linear Non-Gaussian Latent-Variable Cyclic Causal Models: Characterization and Learning

This paper presents the first structural-assumption-free causal discovery method for linear non-Gaussian latent-variable cyclic models by establishing a graphical criterion for distributional equivalence, introducing edge rank constraints, and providing an algorithm to recover models up to this equivalence class.

Haoyue Dai, Immanuel Albrecht, Peter Spirtes + 1 more · 2026-03-06 · 💻 cs

The Inductive Bias of Convolutional Neural Networks: Locality and Weight Sharing Reshape Implicit Regularization

This paper demonstrates that the architectural inductive biases of locality and weight sharing in convolutional neural networks fundamentally alter implicit regularization by coupling learned filters to low-dimensional patch manifolds, thereby enabling generalization on high-dimensional spherical data where fully connected networks provably fail.

Tongtong Liang, Esha Singh, Rahul Parhi + 2 more · 2026-03-06 · 💻 cs

How Does the ReLU Activation Affect the Implicit Bias of Gradient Descent on High-dimensional Neural Network Regression?

This paper demonstrates that for high-dimensional random data, gradient descent on shallow ReLU networks exhibits an implicit bias that approximates the minimum $L_2$-norm solution with high probability, bridging the gap between worst-case non-existence and exact orthogonality results through a novel primal-dual analysis.

Kuo-Wei Lai, Guanghui Wang, Molei Tao + 1 more · 2026-03-06 · 🔢 math
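The minimum-norm implicit bias has a classical linear analogue that is easy to verify numerically: on an overparameterized least-squares problem, gradient descent started from zero converges to the minimum $L_2$-norm interpolant. A minimal NumPy sketch of that linear intuition (not the paper's ReLU analysis):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                        # overparameterized: more features than samples
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on 0.5*||Xw - y||^2, started at zero.
w = np.zeros(d)
lr = 0.9 / np.linalg.norm(X, 2) ** 2  # safe step size, below 1/sigma_max^2
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# The minimum L2-norm interpolant, given by the pseudoinverse.
w_min = np.linalg.pinv(X) @ y

print(np.linalg.norm(X @ w - y))   # ~0: GD interpolates the data
print(np.linalg.norm(w - w_min))   # ~0: and lands on the min-norm solution
```

The mechanism: starting from zero, every gradient step stays in the row space of X, so the interpolant GD converges to is exactly the pseudoinverse (minimum-norm) solution. The paper asks when an analogous bias survives the ReLU nonlinearity.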

How important are the genes to explain the outcome - the asymmetric Shapley value as an honest importance metric for high-dimensional features

This paper proposes using asymmetric Shapley values as a superior metric for quantifying the importance of high-dimensional genomic features in clinical prediction models, addressing limitations of traditional approaches by accounting for collinearity and known causal directions, and provides efficient algorithms validated through a colorectal cancer progression study.

Mark A. van de Wiel, Jeroen Goedhart, Martin Jullum + 1 more · 2026-03-06 · 🤖 cs.LG
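The asymmetric Shapley value can be illustrated by restricting the usual permutation average to orderings consistent with a known causal direction. A minimal sketch with a hypothetical two-feature value function in which x0 causes x1 and the two are perfectly collinear (the numbers and predicate are illustrative, not the paper's algorithms):

```python
from itertools import permutations

def shapley(value, n, allowed=None):
    """Average marginal contributions over feature orderings.

    value:   coalition (set) -> float
    allowed: optional predicate on permutations; restricting to causally
             consistent orderings yields asymmetric Shapley values.
    """
    perms = [p for p in permutations(range(n)) if allowed is None or allowed(p)]
    phi = [0.0] * n
    for p in perms:
        seen = set()
        for i in p:
            phi[i] += value(seen | {i}) - value(seen)
            seen.add(i)
    return [x / len(perms) for x in phi]

# Toy value function: either feature alone explains all the signal (collinearity).
v = {frozenset(): 0.0, frozenset({0}): 0.5, frozenset({1}): 0.5,
     frozenset({0, 1}): 0.5}
value = lambda S: v[frozenset(S)]

sym = shapley(value, 2)                                        # splits credit: [0.25, 0.25]
asym = shapley(value, 2, allowed=lambda p: p.index(0) < p.index(1))
print(sym, asym)                                               # asym credits the cause: [0.5, 0.0]
```

Under collinearity the symmetric value splits credit evenly, while conditioning on the causal order gives all credit to the upstream feature, which is the "honesty" property the paper exploits for genomic features.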

Bayes with No Shame: Admissibility Geometries of Predictive Inference

This paper demonstrates that predictive inference is governed by four distinct, pairwise non-nested admissibility geometries—Blackwell risk dominance, anytime-valid supermartingales, marginal coverage, and Cesàro approachability—each offering a unique certificate of optimality and proving that admissibility is irreducibly relative to the chosen criterion rather than a universal property.

Nicholas G. Polson, Daniel Zantedeschi · 2026-03-06 · 🔢 math

On the Statistical Optimality of Optimal Decision Trees

This paper establishes a comprehensive statistical theory for globally optimal empirical risk minimization decision trees by deriving sharp oracle inequalities and minimax optimal rates over a novel piecewise sparse heterogeneous anisotropic Besov space, thereby providing rigorous theoretical guarantees for their performance in high-dimensional regression and classification under both sub-Gaussian and heavy-tailed noise settings.

Zineng Xu, Subhroshekhar Ghosh, Yan Shuo Tan · 2026-03-06 · 🔢 math
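The object of study, a globally ERM-optimal tree, can be made concrete at depth one: exhaustively searching every split of a single feature yields the exact empirical risk minimizer over regression stumps. A hedged sketch of that principle (general-depth optimal trees require combinatorial search, which is what the paper's guarantees concern):

```python
import numpy as np

def optimal_stump(x, y):
    """Globally ERM-optimal depth-1 regression tree on 1-D data:
    try every split point, predict the leaf means, keep the best."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best = (np.inf, None, None, None)
    for k in range(1, len(xs)):
        left, right = ys[:k], ys[k:]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if err < best[0]:
            best = (err, (xs[k - 1] + xs[k]) / 2, left.mean(), right.mean())
    return best

x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
err, thresh, left_mean, right_mean = optimal_stump(x, y)
print(err, thresh, left_mean, right_mean)   # 0.0 6.0 1.0 5.0
```

Unlike greedy CART-style growing, this search is exact over its hypothesis class; the paper's oracle inequalities apply to exact ERM of this kind at general depth.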

Thermodynamic Response Functions in Singular Bayesian Models

This paper establishes a unified thermodynamic response framework for singular Bayesian models, demonstrating that posterior tempering induces a hierarchy of observables that naturally interpret complex learning-theoretic quantities like the real log canonical threshold and WAIC as free-energy derivatives, thereby revealing phase-transition-like structural reorganizations in models such as neural networks and Gaussian mixtures.

Sean Plummer · 2026-03-06 · 🔢 math

Sample-Optimal Locally Private Hypothesis Selection and the Provable Benefits of Interactivity

This paper presents a sample-optimal, locally differentially private algorithm for hypothesis selection that achieves the information-theoretic lower bound of $\Theta(k/(\alpha^2 \min\{\varepsilon^2, 1\}))$ using only $O(\log \log k)$ rounds of interaction, thereby demonstrating the provable power of interactivity to overcome the $\Omega(k \log k)$ sample complexity barrier inherent in non-interactive approaches.

Alireza F. Pour, Hassan Ashtiani, Shahab Asoodeh · 2026-03-05 · 🤖 cs.LG