Ensembling Language Models with Sequential Monte Carlo
This paper introduces a unified framework for ensembling diverse language models via ensemble distributions, and proposes a byte-level sequential Monte Carlo algorithm to sample from these distributions. The approach overcomes challenges such as mismatched vocabularies and biased approximations, improving performance on structured text-generation tasks.
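To make the idea concrete, below is a minimal, hypothetical sketch of byte-level sequential Monte Carlo for sampling from an ensemble of two language models. The toy models (`model_p`, `model_q`), the two-byte alphabet, and the product-of-experts combination are all illustrative assumptions, not the paper's actual construction: particles extend sequences byte by byte using one model as the proposal, reweight by the (unnormalized) ensemble target over the proposal, and resample to avoid weight degeneracy.

```python
import random

random.seed(0)

# Toy byte-level "models" (hypothetical): each assigns a probability to the
# next byte given the context. Real models would condition on the context.
def model_p(context, byte):
    return {"a": 0.9, "b": 0.1}[byte]

def model_q(context, byte):
    return {"a": 0.4, "b": 0.6}[byte]

def ensemble_unnormalized(context, byte):
    # Product-of-experts-style combination (an assumption for illustration):
    # the target is proportional to the product of the models' probabilities.
    return model_p(context, byte) * model_q(context, byte)

def smc_sample(n_particles=100, length=5):
    # Each particle is a (byte sequence, importance weight) pair.
    particles = [("", 1.0) for _ in range(n_particles)]
    for _ in range(length):
        extended = []
        for seq, w in particles:
            # Proposal: draw the next byte from model_p.
            byte = "a" if random.random() < model_p(seq, "a") else "b"
            # Importance weight update: target / proposal for the new byte.
            w *= ensemble_unnormalized(seq, byte) / model_p(seq, byte)
            extended.append((seq + byte, w))
        # Multinomial resampling: keep high-weight particles, reset weights.
        total = sum(w for _, w in extended)
        weights = [w / total for _, w in extended]
        particles = [(random.choices(extended, weights=weights)[0][0], 1.0)
                     for _ in range(n_particles)]
    return [seq for seq, _ in particles]

samples = smc_sample()
```

Because the target rewards bytes that both models assign high probability, resampling steers the particle population toward sequences plausible under the whole ensemble, without ever normalizing the product distribution explicitly.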