cs.LG papers | Gist.Science

Dual Space Preconditioning for Gradient Descent in the Overparameterized Regime

This paper establishes the convergence of Dual Space Preconditioned Gradient Descent to an interpolating solution for overparameterized linear models using novel Bregman divergence techniques, while characterizing its implicit bias to show that isotropic preconditioners recover standard gradient descent solutions and general preconditioners yield solutions within a constant factor of the standard solution.

Reza Ghane, Danil Akhtiamov, Babak Hassibi2026-03-12📊 stat

JEDI: Jointly Embedded Inference of Neural Dynamics

The paper introduces JEDI, a hierarchical model that jointly learns shared embeddings over recurrent neural network weights to robustly infer generalizable, task-specific neural dynamics from noisy, high-dimensional experimental recordings, successfully recovering underlying mechanistic structures and providing insights into motor control.

Anirudh Jamkhandi, Ali Korojy, Olivier Codol, Guillaume Lajoie, Matthew G. Perich2026-03-12🧬 q-bio

A Universal Nearest-Neighbor Estimator for Intrinsic Dimensionality

This paper introduces a universal, nearest-neighbor-based estimator for intrinsic dimensionality that achieves state-of-the-art performance through simple calculations and theoretical guarantees of convergence independent of the underlying data distribution.

Eng-Jon Ong, Omer Bobrowski, Gesine Reinert, Primoz Skraba2026-03-12🤖 cs.LG

VERI-DPO: Evidence-Aware Alignment for Clinical Summarization via Claim Verification and Direct Preference Optimization

The paper introduces VERI-DPO, an evidence-aware alignment framework that leverages claim verification to mine preference pairs for Direct Preference Optimization, significantly reducing unsupported claims and improving the faithfulness of clinical summarizations while maintaining informative length.

Weixin Liu, Congning Ni, Qingyuan Song, Susannah L. Rose, Christopher Symons, Murat Kantarcioglu, Bradley A. Malin, Zhijun Yin2026-03-12💬 cs.CL

A New Tensor Network: Tubal Tensor Train and Its Applications

This paper introduces the tubal tensor train (TTT) decomposition, a novel tensor network model that integrates t-product algebra with the tensor train structure to achieve linear storage scaling for high-order tensors, and validates its effectiveness through efficient algorithms and applications in image/video compression, tensor completion, and hyperspectral imaging.

Salman Ahmadi-Asl, Valentin Leplat, Anh-Huy Phan, Andrzej Cichocki2026-03-12🔢 math

Resource-constrained Amazons chess decision framework integrating large language models and graph attention

This paper proposes a lightweight hybrid framework for the Game of the Amazons that integrates Graph Attention Autoencoders, Stochastic Graph Genetic Algorithms, and GPT-4o-mini to overcome resource constraints, achieving decision accuracy improvements of 15%–56% over baselines and outperforming its teacher model by effectively denoising LLM outputs through structural graph reasoning.

Tianhao Qian, Zhuoxuan Li, Jinde Cao, Xinli Shi, Hanjie Liu, Leszek Rutkowski2026-03-12🤖 cs.AI

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

The paper introduces IH-Challenge, a reinforcement learning dataset designed to enhance instruction hierarchy robustness in frontier LLMs, which significantly improves their ability to prioritize instructions against conflicts and adversarial attacks while maintaining helpfulness and minimizing capability regression.

Chuan Guo (Michael Pokorny), Juan Felipe Ceron Uribe (Michael Pokorny), Sicheng Zhu (Michael Pokorny), Christopher A. Choquette-Choo (Michael Pokorny), Steph Lin (Michael Pokorny), Nikhil Kandpal (Michael Pokorny), Milad Nasr (Michael Pokorny), Rai (Michael Pokorny), Sam Toyer, Miles Wang, Yaodong Yu, Alex Beutel, Kai Xiao2026-03-12🤖 cs.AI

World Model for Battery Degradation Prediction Under Non-Stationary Aging

This paper proposes a world model framework for lithium-ion battery degradation prognosis that encodes cycle data into latent states and propagates them forward using learned dynamics, demonstrating that iterative rollout significantly reduces trajectory forecast error compared to direct regression while a Single Particle Model constraint specifically enhances prediction accuracy at the degradation knee.

Kai Chin Lim, Khay Wai See2026-03-12⚡ eess

UAV-MARL: Multi-Agent Reinforcement Learning for Time-Critical and Dynamic Medical Supply Delivery

This paper presents a Multi-Agent Reinforcement Learning framework using Proximal Policy Optimization to coordinate UAV fleets for time-critical medical supply delivery, demonstrating that classical PPO outperforms asynchronous and sequential strategies in dynamically prioritizing tasks and reallocating resources under uncertain conditions using real-world geographic data.

Islam Guven, Mehmet Parlak2026-03-12🤖 cs.LG

Tackling Length Inflation Without Trade-offs: Group Relative Reward Rescaling for Reinforcement Learning

This paper introduces Group Relative Reward Rescaling (GR $^3$ ), a novel reinforcement learning method that effectively mitigates length inflation in large language models by reframing length control as a multiplicative rescaling paradigm, thereby achieving lossless optimization and superior performance compared to existing baselines without compromising downstream capabilities.

Zichao Li, Jie Lou, Fangchen Dong, Zhiyuan Fan, Mengjie Ren, Hongyu Lin, Xianpei Han, Debing Zhang, Le Sun, Yaojie Lu, Xing Yu2026-03-12🤖 cs.LG

SCORE: Replacing Layer Stacking with Contractive Recurrent Depth

The paper introduces SCORE, a lightweight deep learning architecture that replaces traditional layer stacking with a weight-shared, ODE-inspired contractive recurrent update mechanism to improve training stability, accelerate convergence, and reduce parameter counts across various model types.

Guillaume Godin2026-03-12✓ Author reviewed ⓘ🤖 cs.LG

Learning to Score: Tuning Cluster Schedulers through Reinforcement Learning

This paper proposes a reinforcement learning approach that automatically tunes cluster scheduler scoring weights using percentage improvement rewards, frame-stacking, and limited domain information to significantly improve end-to-end job performance across diverse workloads and cluster setups.

Martin Asenov, Qiwen Deng, Gingfung Yeung, Adam Barker2026-03-12🤖 cs.LG

A Bipartite Graph Approach to U.S.-China Cross-Market Return Forecasting

This paper introduces a machine learning framework that utilizes a directed bipartite graph to model time-ordered predictive linkages between U.S. and Chinese equity markets, revealing a significant directional asymmetry where U.S. returns strongly predict Chinese intraday performance while the reverse effect is limited.

Jing Liu, Maria Grith, Xiaowen Dong, Mihai Cucuringu2026-03-12💰 q-fin

Quantization Robustness of Monotone Operator Equilibrium Networks

This paper establishes theoretical conditions under which Monotone Operator Equilibrium Networks maintain convergence and bounded error under weight quantization by analyzing spectral perturbations, and validates these findings through experiments showing that quantization-aware training can recover provable convergence at four-bit precision.

James Li, Philip H. W. Leong, Thomas Chaffey2026-03-12⚡ eess

Riemannian Geometry-Preserving Variational Autoencoder for MI-BCI Data Augmentation

This paper introduces the Riemannian geometry-preserving variational autoencoder (RGP-VAE), a generative model that produces high-fidelity, symmetric positive-definite synthetic EEG covariance matrices to effectively augment data and enhance performance in motor imagery brain-computer interface applications.

Viktorija Polaka, Ivo Pascal de Jong, Andreea Ioana Sburlea2026-03-12🤖 cs.LG

Implicit Statistical Inference in Transformers: Approximating Likelihood-Ratio Tests In-Context

This paper demonstrates that Transformers performing in-context learning on binary hypothesis testing tasks effectively approximate Bayes-optimal statistical estimators by dynamically adapting their internal decision mechanisms—ranging from voting-style ensembles for linear tasks to deeper sequential computations for nonlinear ones—rather than relying on simple similarity matching or fixed heuristics.

Faris Chaudhry, Siddhant Gadkari2026-03-12🤖 cs.LG

HAPEns: Hardware-Aware Post-Hoc Ensembling for Tabular Data

The paper introduces HAPEns, a novel post-hoc ensembling method for tabular data that constructs diverse ensembles along the Pareto front of predictive performance and hardware efficiency, significantly outperforming existing baselines across 83 datasets by explicitly balancing accuracy with resource constraints.

Jannis Maier, Lennart Purucker2026-03-12🤖 cs.LG

Does LLM Alignment Really Need Diversity? An Empirical Study of Adapting RLVR Methods for Moral Reasoning

This paper empirically demonstrates that contrary to the hypothesis that moral reasoning alignment requires diversity-seeking algorithms, standard reward-maximizing RLVR methods are equally or more effective because high-reward moral responses exhibit a concentrated distribution in semantic space similar to logical reasoning tasks.

Zhaowei Zhang, Xiaohan Liu, Xuekai Zhu, Junchao Huang, Ceyao Zhang, Zhiyuan Feng, Yaodong Yang, Xiaoyuan Yi, Xing Xie2026-03-12🤖 cs.AI

Gradient Flow Drifting: Generative Modeling via Wasserstein Gradient Flows of KDE-Approximated Divergences

This paper establishes a mathematical framework called Gradient Flow Drifting that proves the equivalence between the recently proposed Drifting Model and the Wasserstein gradient flow of the forward KL divergence under KDE approximation, while extending the approach to a mixed-divergence strategy on Riemannian manifolds to simultaneously mitigate mode collapse and blurring.

Jiarui Cao, Zixuan Wei, Yuxin Liu2026-03-12🤖 cs.LG

Self-Scaled Broyden Family of Quasi-Newton Methods in JAX

This technical note presents a JAX-compatible implementation of the Self-Scaled Broyden family of quasi-Newton methods, including BFGS, DFP, and Broyden variants with Zoom line search, built upon the Optimistix library to facilitate their adoption within the JAX community.

Ivan Bioli, Mikel Mendibe Abarrategi2026-03-12🤖 cs.LG

← Previous Next →