cs.LG papers | Gist.Science

Code-Space Response Oracles: Generating Interpretable Multi-Agent Policies with Large Language Models

This paper introduces Code-Space Response Oracles (CSRO), a novel framework that replaces black-box deep reinforcement learning oracles with Large Language Models to generate human-readable, interpretable multi-agent policies as code, achieving competitive performance while enabling the discovery of complex, explainable strategies.

Daniel Hennes, Zun Li, John Schultz, Marc Lanctot | 2026-03-12 | 🤖 cs.AI

Denoising the US Census: Succinct Block Hierarchical Regression

This paper introduces BlueDown, a new post-processing algorithm that leverages succinct block hierarchical regression to produce more accurate and consistent demographic estimates for the US Census while maintaining the same privacy guarantees and structural constraints as the existing TopDown method.

Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Adam Sealfon | 2026-03-12 | 🤖 cs.LG

Hardware Efficient Approximate Convolution with Tunable Error Tolerance for CNNs

This paper proposes a hardware-efficient "soft sparsity" paradigm for CNNs that utilizes a Most Significant Bit (MSB) proxy to skip negligible non-zero multiplications, achieving significant MAC and power reductions with zero accuracy loss while outperforming traditional zero-skipping methods.

Vishal Shashidhar, Anupam Kumari, Roy P Paily | 2026-03-12 | 🤖 cs.LG

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

The paper introduces CLIPO, a method that integrates contrastive learning into policy optimization to generalize Reinforcement Learning with Verifiable Rewards (RLVR) by capturing invariant structures across correct reasoning paths, thereby mitigating hallucinations and improving the generalization and robustness of Large Language Models.

Sijia Cui, Pengyu Cheng, Jiajun Song, Yongbo Gai, Guojun Zhang, Zhechao Yu, Jianhe Lin, Xiaoxi Jiang, Guanjun Jiang | 2026-03-12 | 🤖 cs.LG

Lost in the Middle at Birth: An Exact Theory of Transformer Position Bias

This paper argues that the "Lost in the Middle" phenomenon in large language models is an inherent geometric property of causal decoder architectures present at initialization, caused by the interplay of causal masking and residual connections that creates a structurally hostile "dead zone" in the middle of the context, a bias that persists even after standard pretraining.

Borun D Chowdhury | 2026-03-12 | 🤖 cs.LG

Unbalanced Optimal Transport Dictionary Learning for Unsupervised Hyperspectral Image Clustering

This paper proposes an unsupervised hyperspectral image clustering method that improves upon existing Wasserstein space dictionary learning by utilizing unbalanced Wasserstein barycenters to learn a robust lower-dimensional representation, thereby mitigating issues of class blurring and sensitivity to outliers and noise.

Joshua Lentz, Nicholas Karris, Alex Cloninger, James M. Murphy | 2026-03-12 | 📊 stat

A neural operator for predicting vibration frequency response curves from limited data

This paper proposes a neural operator integrated with an implicit numerical scheme that learns underlying state-space dynamics from limited data to accurately predict vibration frequency response curves for engineered components without relying on physics-based regularizing loss functions.

D. Bluedorn, A. Badawy, B. E. Saunders, D. Roettgen, A. Abdelkefi | 2026-03-12 | 🤖 cs.LG

Mashup Learning: Faster Finetuning by Remixing Past Checkpoints

The paper proposes Mashup Learning, a method that accelerates LLM finetuning and improves downstream accuracy by identifying and merging relevant historical checkpoints to serve as an optimized initialization for new tasks, thereby reducing training time by up to 37% compared to training from scratch.

Sofia Maria Lo Cicero Vaina, Artem Chumachenko, Max Ryabinin | 2026-03-12 | 🤖 cs.LG

ReMix: Reinforcement routing for mixtures of LoRAs in LLM finetuning

This paper proposes ReMix, a novel Mixture-of-LoRAs framework that employs non-learnable routing weights and a Reinforce Leave-One-Out (RLOO) gradient estimator to prevent routing imbalance, thereby ensuring all active LoRAs contribute equally and significantly outperforming state-of-the-art parameter-efficient finetuning methods.

Ruizhong Qiu, Hanqing Zeng, Yinglong Xia, Yiwen Meng, Ren Chen, Jiarui Feng, Dongqi Fu, Qifan Wang, Jiayi Liu, Jun Xiao, Xiangjun Fan, Benyu Zhang, Hong Li, Zhining Liu, Hyunsik Yoo, Zhichen Zeng, Tianxin Wei, Hanghang Tong | 2026-03-12 | 🤖 cs.LG
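
The Reinforce Leave-One-Out (RLOO) estimator named in the ReMix summary can be illustrated in its generic form: each sample's baseline is the mean reward of the other samples drawn for the same input. The sketch below shows only that generic baseline, not the routing procedure from the paper; the function name `rloo_advantages` is a hypothetical choice for illustration.

```python
def rloo_advantages(rewards):
    """Leave-one-out advantages: reward_i minus the mean of the other rewards.

    Generic RLOO baseline only; how ReMix applies it to LoRA routing
    is not specified in the summary and is not modeled here.
    """
    k = len(rewards)
    if k < 2:
        raise ValueError("RLOO needs at least two samples")
    total = sum(rewards)
    # Baseline for sample i is the average of the remaining k - 1 rewards.
    return [r - (total - r) / (k - 1) for r in rewards]


if __name__ == "__main__":
    print(rloo_advantages([1.0, 0.0, 0.0]))  # [1.0, -0.5, -0.5]
```

In a REINFORCE-style update, each advantage would multiply the gradient of the log-probability of the corresponding sample, so the baseline reduces variance without a learned value function.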

DT-BEHRT: Disease Trajectory-aware Transformer for Interpretable Patient Representation Learning

This paper introduces DT-BEHRT, a graph-enhanced transformer model that improves predictive performance and interpretability in electronic health record analysis by explicitly modeling disease trajectories within organ systems and employing a novel pre-training strategy to capture heterogeneous medical code interactions.

Deyi Li, Zijun Yao, Qi Xu, Muxuan Liang, Lingyao Li, Zijian Xu, Mei Liu | 2026-03-12 | 🤖 cs.LG

Stability and Robustness via Regularization: Bandit Inference via Regularized Stochastic Mirror Descent

This paper establishes a general stability criterion for stochastic mirror descent algorithms to enable valid statistical inference in adaptive bandit settings, introducing regularized-EXP3 variants that simultaneously achieve minimax-optimal regret, nominal confidence interval coverage, and robustness to adversarial corruptions.

Budhaditya Halder, Ishan Sengupta, Koustav Chowdhury, Koulik Khamaru | 2026-03-12 | 📊 stat

ARCHE: Autoregressive Residual Compression with Hyperprior and Excitation

This paper introduces ARCHE, an end-to-end learned image compression framework that achieves state-of-the-art rate-distortion efficiency by unifying hierarchical, spatial, and channel-based priors with adaptive feature recalibration, all while maintaining computational efficiency without relying on recurrent or transformer-based components.

Sofia Iliopoulou, Dimitris Ampeliotis, Athanassios Skodras | 2026-03-12 | ⚡ eess

Adaptive Activation Cancellation for Hallucination Mitigation in Large Language Models

This paper introduces Adaptive Activation Cancellation (AAC), a real-time, training-free inference framework that mitigates hallucinations in large language models by identifying and suppressing hallucination-associated neural activations as structured interference, thereby improving factual accuracy across multiple model scales without degrading general capabilities or fluency.

Eric Yocam, Varghese Vaidyan, Gurcan Comert, Paris Kalathas, Yong Wang, Judith L. Mwakalonge | 2026-03-12 | 💬 cs.CL

Actor-Accelerated Policy Dual Averaging for Reinforcement Learning in Continuous Action Spaces

This paper proposes actor-accelerated Policy Dual Averaging, a method that employs a learned policy network to efficiently approximate optimization sub-problems in continuous action spaces, thereby maintaining theoretical convergence guarantees while achieving superior performance over standard on-policy baselines like PPO.

Ji Gao, Caleb Ju, Guanghui Lan, Zhaohui Tong | 2026-03-12 | 🤖 cs.LG

Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion

This paper proposes a hybrid Hidden Markov Model that combines Laplace quantile-defined market states with a Poisson-driven jump-duration mechanism to generate synthetic equity excess growth rates that simultaneously preserve heavy-tailed distributions, volatility clustering, and realistic tail-state dwell times, outperforming standard GARCH and HMM models in joint distributional and temporal fidelity.

Abdulrahman Alswaidan, Jeffrey D. Varner | 2026-03-12 | 💰 q-fin

Flexible Cutoff Learning: Optimizing Machine Learning Potentials After Training

This paper introduces Flexible Cutoff Learning (FCL), a method that trains machine learning interatomic potentials with randomly sampled cutoff radii to enable post-training optimization of per-atom cutoffs, thereby significantly reducing computational costs for specific applications without requiring retraining.

Rick Oerder (Institute for Numerical Simulation, University of Bonn, Fraunhofer Institute for Algorithms and Scientific Computing SCAI), Jan Hamaekers (Fraunhofer Institute for Algorithms and Scientific Computing SCAI) | 2026-03-12 | 🔬 cond-mat.mtrl-sci

FusionNet: a frame interpolation network for 4D heart models

The paper introduces FusionNet, a neural network that reconstructs high-temporal-resolution 4D cardiac motion from short-duration CMR scans by estimating intermediate 3D heart shapes, achieving a Dice coefficient above 0.897 and outperforming existing methods.

Chujie Chang, Shoko Miyauchi, Ken'ichi Morooka, Ryo Kurazume, Oscar Martinez Mozos | 2026-03-12 | 🤖 cs.LG
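
For readers unfamiliar with the metric quoted in the FusionNet summary, the Dice coefficient of two binary masks A and B is 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that standard definition on flattened masks; the function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details from the paper.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap between two equal-length binary masks (flattened voxels)."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same length")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size_sum = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    # Assumed convention: two empty masks count as a perfect match.
    if size_sum == 0:
        return 1.0
    return 2.0 * intersection / size_sum


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 0, 0]
    print(dice_coefficient(predicted, reference))  # 2*1 / (2+1) = 0.666...
```

A value of 1.0 means the predicted and reference shapes coincide exactly, and 0.0 means they do not overlap at all.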

SDSR: A Spectral Divide-and-Conquer Approach for Species Tree Reconstruction

The paper introduces SDSR, a scalable spectral divide-and-conquer algorithm for species tree reconstruction that achieves up to 10-fold faster runtimes compared to standard methods while maintaining comparable accuracy under the multispecies coalescent model.

Ortal Reshef (Hebrew University of Jerusalem), Ofer Glassman (Weizmann Institute of Science), Or Zuk (Hebrew University of Jerusalem), Yariv Aizenbud (Tel Aviv University), Boaz Nadler (Weizmann Institute of Science), Ariel Jaffe (Hebrew University of Jerusalem) | 2026-03-12 | 🧬 q-bio

A Diffusion Analysis of Policy Gradient for Stochastic Bandits

This paper establishes that a continuous-time diffusion approximation of policy gradient for stochastic bandits achieves logarithmic regret with a learning rate of $O(\Delta^2/\log(n))$, while showing that in specific instances a learning rate of $O(\Delta^2)$ is necessary to avoid linear regret.

Tor Lattimore | 2026-03-12 | 📊 stat
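
As background for the entry above, the discrete-time algorithm that such a diffusion would approximate can be sketched as a softmax policy gradient step on a bandit. This is the textbook REINFORCE update, not the continuous-time analysis from the paper; the names `softmax` and `policy_gradient_step` and the step size `eta` are illustrative assumptions.

```python
import math


def softmax(theta):
    """Numerically stable softmax over arm preferences."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def policy_gradient_step(theta, arm, reward, eta):
    """One REINFORCE update after pulling `arm` and observing `reward`.

    Uses grad log pi(arm) = e_arm - pi for a softmax policy,
    so theta_a changes by eta * reward * (1[a == arm] - pi_a).
    """
    pi = softmax(theta)
    return [
        t + eta * reward * ((1.0 if a == arm else 0.0) - p)
        for a, (t, p) in enumerate(zip(theta, pi))
    ]


if __name__ == "__main__":
    # Two arms, uniform start: the pulled arm's preference rises, the other falls.
    print(policy_gradient_step([0.0, 0.0], arm=0, reward=1.0, eta=0.1))
```

The learning rate `eta` is the quantity whose scaling with the gap $\Delta$ and horizon $n$ the summary describes.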

Rethinking the Harmonic Loss via Non-Euclidean Distance Layers

This paper extends the harmonic loss framework by systematically evaluating various non-Euclidean distance metrics across vision and language models, demonstrating that cosine-based variants offer superior trade-offs in accuracy, interpretability, and sustainability compared to traditional cross-entropy and Euclidean approaches.

Maxwell Miller-Golub, Kamil Faber, Marcin Pietron, Panpan Zheng, Pasquale Minervini, Roberto Corizzo | 2026-03-12 | 🤖 cs.LG
