Gradient Dynamics of Attention: How Cross-Entropy Sculpts Bayesian Manifolds

This paper provides a first-order analysis demonstrating that cross-entropy training in transformers induces a coupled specialization of attention routing and value updates—functioning as a two-timescale EM procedure—that sculpts low-dimensional Bayesian manifolds, thereby explaining how gradient-based optimization enables precise probabilistic reasoning.

Naman Agarwal, Siddhartha R. Dalal, Vishal Misra · Thu, 12 Ma · stat

Offline Dynamic Inventory and Pricing Strategy: Addressing Censored and Dependent Demand

This paper proposes a novel data-driven framework using offline reinforcement learning and survival analysis to estimate optimal pricing and inventory control policies in sequential environments with censored and dependent demand, overcoming challenges like missing profit information and non-stationarity by approximating the problem as a high-order Markov decision process.

Korel Gundem, Zhengling Qi · Thu, 12 Ma · stat

Losing dimensions: Geometric memorization in generative diffusion

This paper proposes a geometric memorization theory demonstrating that diffusion models transition from generalization to exact copying through a smooth, gradual collapse of latent dimensionality, where salient features and finer details progressively "freeze out" as data becomes scarce, mirroring physical systems condensing into low-energy configurations.

Beatrice Achilli, Enrico Ventura, Gianluigi Silvestri, Bao Pham, Gabriel Raya, Dmitry Krotov, Carlo Lucibello, Luca Ambrogioni · Thu, 12 Ma · stat

When should we trust the annotation? Selective prediction for molecular structure retrieval from mass spectra

This paper introduces a selective prediction framework for molecular structure retrieval from mass spectra that leverages retrieval-level uncertainty and distribution-free risk control to allow models to abstain from low-confidence predictions, thereby ensuring annotations meet specified error rate constraints in high-stakes applications.

Mira Jürgens, Gaetan De Waele, Morteza Rakhshaninejad, Willem Waegeman · Thu, 12 Ma · stat

Bayesian Optimization with Gaussian Processes to Accelerate Stationary Point Searches

This paper presents a unified Bayesian optimization framework using Gaussian processes with derivative observations and advanced extensions like Optimal Transport and random Fourier features to efficiently accelerate the search for minima and saddle points on potential energy surfaces, bridging theoretical formulation with practical implementation through accompanying Rust code.

Rohit Goswami (Institute IMX and Lab-COSMO, École polytechnique fédérale de Lausanne) · Thu, 12 Ma · stat