cs.LG papers | Gist.Science

A Trust-Region Interior-Point Stochastic Sequential Quadratic Programming Method

This paper proposes a trust-region interior-point stochastic sequential quadratic programming (TR-IP-SSQP) method that utilizes adaptive stochastic oracles to solve optimization problems with stochastic objectives and deterministic nonlinear constraints, proving its global almost-sure convergence to first-order stationary points and demonstrating practical performance on benchmark and logistic regression problems.

Yuchen Fang, Jihun Kim, Sen Na, James Demmel, Javad Lavaei | 2026-03-12 | 🔢 math

Why Does It Look There? Structured Explanations for Image Classification

The paper proposes I2X, a framework that transforms unstructured interpretability into structured, prototype-based explanations to reveal model decision-making processes and actively improve classification accuracy through targeted sample perturbation.

Jiarui Li, Zixiang Yin, Samuel J Landry, Zhengming Ding, Ramgopal R. Mettu | 2026-03-12 | 🤖 cs.LG

One Adapter for All: Towards Unified Representation in Step-Imbalanced Class-Incremental Learning

This paper introduces One-A, a unified framework for step-imbalanced class-incremental learning that employs asymmetric subspace alignment and directional gating to merge task updates into a single adapter, thereby balancing stability and plasticity while maintaining constant inference costs.

Xiaoyan Zhang, Jiangpeng He | 2026-03-12 | 🤖 cs.LG

Intrinsic Numerical Robustness and Fault Tolerance in a Neuromorphic Algorithm for Scientific Computing

This paper demonstrates that a natively spiking neuromorphic algorithm for solving partial differential equations possesses intrinsic fault tolerance, maintaining accuracy even when up to 32% of neurons and 90% of spikes are dropped, and that this robustness can be tuned via structural hyperparameters.

Bradley H. Theilman, James B. Aimone | 2026-03-12 | 🤖 cs.AI

SiMPO: Measure Matching for Online Diffusion Reinforcement Learning

This paper introduces SiMPO, a unified framework for online diffusion reinforcement learning that generalizes policy reweighting through a two-stage measure matching approach, enabling the principled use of signed measures and negative reweighting to effectively repel policies from suboptimal actions and achieve superior performance.

Haitong Ma, Chenxiao Gao, Tianyi Chen, Na Li, Bo Dai | 2026-03-12 | 🤖 cs.LG

Bayesian Hierarchical Models and the Maximum Entropy Principle

This paper demonstrates that when the prior given hyperparameters in a Bayesian hierarchical model is a canonical maximum entropy distribution, the resulting dependent marginal prior also possesses a maximum entropy property, but with respect to a constraint on the marginal distribution of a function of the unknown parameters, thereby clarifying the implicit information assumptions of such models.

Brendon J. Brewer | 2026-03-12 | 📊 stat

Improving TabPFN's Synthetic Data Generation by Integrating Causal Structure

This paper addresses the limitation of TabPFN's autoregressive synthetic data generation, which produces spurious correlations when feature order conflicts with causal structure, by introducing DAG-aware and CPDAG-based conditioning strategies that significantly improve the fidelity, stability, and causal preservation of the generated synthetic tabular data.

Davide Tugnoli, Andrea De Lorenzo, Marco Virgolin, Giovanni Cinà | 2026-03-12 | 🤖 cs.LG

Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals

This paper introduces a novel three-stage mechanistic interpretability method that extracts a compact, high-performing hematopoietic algorithm directly from the internal attention weights of the scGPT foundation model, achieving superior zero-shot classification and pseudotime ordering on independent datasets with significantly fewer parameters and training time than standard probing or retraining approaches.

Ihor Kendiukhov | 2026-03-12 | 🧬 q-bio

From Prior to Pro: Efficient Skill Mastery via Distribution Contractive RL Finetuning

The paper introduces DICE-RL, a sample-efficient reinforcement learning framework that refines pretrained generative robot policies into high-performing experts by using distribution contraction to amplify successful behaviors from online feedback, enabling mastery of complex long-horizon manipulation tasks from pixel inputs in both simulation and real-world settings.

Zhanyi Sun, Shuran Song | 2026-03-12 | 🤖 cs.LG

Estimating condition number with Graph Neural Networks

This paper proposes a fast graph neural network-based method for estimating the condition numbers of sparse matrices with linear complexity relative to the number of non-zero elements, demonstrating significant speedups over traditional Hager-Higham and Lanczos methods.

Erin Carson, Xinye Chen | 2026-03-12 | 🤖 cs.LG

Robust Post-Training for Generative Recommenders: Why Exponential Reward-Weighted SFT Outperforms RLHF

This paper proposes and validates exponential reward-weighted SFT as a robust, fully offline post-training method for generative recommenders that eliminates reward hacking and propensity score requirements while offering theoretical guarantees and a controllable tradeoff between robustness and performance.

Keertana Chidambaram, Sanath Kumar Krishnamurthy, Qiuling Xu, Ko-Jen Hsiao, Moumita Bhattacharya | 2026-03-12 | 🤖 cs.LG

Taming Score-Based Denoisers in ADMM: A Convergent Plug-and-Play Framework

This paper introduces ADMM-PnP with a novel AC-DC denoiser that resolves manifold mismatch and establishes convergence guarantees for score-based generative models within the ADMM framework, thereby improving solution quality across various inverse problems.

Rajesh Shrestha, Xiao Fu | 2026-03-12 | 🤖 cs.LG

GSVD for Geometry-Grounded Dataset Comparison: An Alignment Angle Is All You Need

This paper proposes a geometry-grounded framework for comparing datasets using the Generalized Singular Value Decomposition (GSVD) to derive an interpretable "angle score" that quantifies whether individual samples are better explained by one dataset, the other, or both.

Eduarda de Souza Marques, Arthur Sobrinho Ferreira da Rocha, Joao Paixao, Heudson Mirandola, Daniel Sadoc Menasche | 2026-03-12 | 🤖 cs.LG

Copula-ResLogit: A Deep-Copula Framework for Unobserved Confounding Effects

The paper introduces Copula-ResLogit, a novel deep learning framework that combines ResNet architectures with copula models to detect and mitigate unobserved confounding effects in travel demand analysis, thereby revealing true causal relationships in case studies involving pedestrian stress and travel mode choices.

Kimia Kamal, Bilal Farooq | 2026-03-12 | 🤖 cs.LG

MultiwayPAM: Multiway Partitioning Around Medoids for LLM-as-a-Judge Score Analysis

The paper introduces MultiwayPAM, a novel tensor clustering method designed to simultaneously estimate cluster memberships and medoids for questions, answerers, and evaluators in LLM-as-a-Judge score tensors, thereby addressing computational costs and revealing inherent evaluator biases.

Chihiro Watanabe, Jingyu Sun | 2026-03-12 | 📊 stat

Quantum entanglement provides a competitive advantage in adversarial games

This study demonstrates that quantum entanglement serves as a functional resource in competitive reinforcement learning, enabling hybrid quantum-classical agents trained on the game Pong to consistently outperform separable quantum circuits and match or exceed classical baselines by learning structurally distinct features that better model dynamic agent interactions.

Peiyong Wang, Kieran Hymas, James Quach | 2026-03-12 | ⚛️ quant-ph

Hybrid Self-evolving Structured Memory for GUI Agents

This paper introduces HyMEM, a hybrid self-evolving structured memory system that combines discrete symbolic nodes with continuous embeddings in a graph format to significantly enhance the performance of open-source GUI agents, enabling them to match or surpass strong closed-source models on complex, long-horizon tasks.

Sibo Zhu, Wenyi Wu, Kun Zhou, Stephen Wang, Biwei Huang | 2026-03-12 | 🤖 cs.AI

GaLoRA: Parameter-Efficient Graph-Aware LLMs for Node Classification

GaLoRA is a parameter-efficient framework that integrates structural information into large language models for text-attributed graph node classification, achieving state-of-the-art performance with only 0.24% of the parameters required for full fine-tuning.

Mayur Choudhary, Saptarshi Sengupta, Katerina Potika | 2026-03-12 | 🤖 cs.LG

Regime-aware financial volatility forecasting via in-context learning

This paper introduces a regime-aware in-context learning framework that leverages pretrained large language models to forecast financial volatility by dynamically adapting to nonstationary market conditions through oracle-guided, regime-specific demonstrations without requiring parameter fine-tuning.

Saba Asaad, Shayan Mohajer Hamidi, Ali Bereyhi | 2026-03-12 | 🤖 cs.LG

What do near-optimal learning rate schedules look like?

This paper introduces a search procedure to identify near-optimal learning rate schedule shapes across various workloads, revealing that while warmup and decay are robust features, commonly used schedules are suboptimal and the ideal shape is significantly influenced by hyperparameters like weight decay.

Hiroki Naganuma, Atish Agarwala, Priya Kasimbeg, George E. Dahl | 2026-03-12 | 🤖 cs.LG
