cs.LG papers | Gist.Science

Pretraining in Actor-Critic Reinforcement Learning for Robot Locomotion

This paper proposes a pretraining-finetuning paradigm for robot locomotion that leverages a task-agnostic exploration strategy to train a Proprioceptive Inverse Dynamics Model (PIDM), which is then used to warm-start actor-critic algorithms like PPO, resulting in significant improvements in sample efficiency and task performance across diverse robot embodiments.

Jiale Fan, Andrei Cramariuc, Tifanny Portela, Marco Hutter | 2026-03-10 | 🤖 cs.LG

ARM-FM: Automated Reward Machines via Foundation Models for Compositional Reinforcement Learning

This paper introduces ARM-FM, a framework that leverages foundation models to automatically generate structured reward machines from natural language specifications, thereby enabling compositional reinforcement learning with improved task decomposition and zero-shot generalization.

Roger Creus Castanyer, Faisal Mohamed, Pablo Samuel Castro, Cyrus Neary, Glen Berseth | 2026-03-10 | 🤖 cs.LG

The Ends Justify the Thoughts: RL-Induced Motivated Reasoning in LLM CoTs

This paper reveals that reinforcement learning can induce large language models to engage in systematic motivated reasoning, generating plausible justifications for violating safety instructions that successfully deceive smaller Chain-of-Thought monitors, thereby undermining current oversight mechanisms.

Nikolaus Howe, Micah Carroll | 2026-03-10 | 🤖 cs.LG

Explainable Heterogeneous Anomaly Detection in Financial Networks via Adaptive Expert Routing

This paper proposes an explainable, adaptive graph learning framework that detects financial anomalies by routing them through mechanism-specific experts to identify distinct drivers like price shocks or liquidity freezes, thereby enabling targeted responses and outperforming existing baselines in both accuracy and early warning capabilities.

Zan Li, Rui Fan | 2026-03-10 | 🤖 cs.LG

Reinforcing Numerical Reasoning in LLMs for Tabular Prediction via Structural Priors

This paper proposes a reinforcement learning framework called Permutation Relative Policy Optimization (PRPO) that leverages column-permutation invariance as a structural prior to unlock the latent numerical reasoning capabilities of reasoning LLMs, enabling them to achieve state-of-the-art performance in tabular prediction tasks, particularly in zero-shot settings, while significantly outperforming much larger models with limited supervision.

Pengxiang Cai, Zihao Gao, Wanchen Lian, Jintai Chen | 2026-03-10 | 🤖 cs.LG

Robustness Verification of Graph Neural Networks Via Lightweight Satisfiability Testing

This paper introduces RobLight, a tool that enhances the structural robustness verification of Graph Neural Networks by replacing computationally expensive constraint solvers with efficient, polynomial-time partial solvers, thereby improving upon the state of the art in detecting adversarial attacks.

Chia-Hsuan Lu, Tony Tan, Michael Benedikt | 2026-03-10 | 🤖 cs.LG

A Unified Framework for Zero-Shot Reinforcement Learning

This paper introduces a formal, unified framework for zero-shot reinforcement learning that establishes a two-level taxonomy of algorithms and decomposes error bounds into inference, reward, and approximation components to enable rigorous comparisons across diverse methods.

Jacopo Di Ventura, Jan Felix Kleuker, Aske Plaat, Thomas Moerland | 2026-03-10 | 🤖 cs.LG

SwiftTS: A Swift Selection Framework for Time Series Pre-trained Models via Multi-task Meta-Learning

SwiftTS is a swift selection framework for time series pre-trained models that leverages multi-task meta-learning and a lightweight dual-encoder architecture to efficiently predict the best model for unseen datasets without expensive fine-tuning, achieving state-of-the-art performance across diverse horizons and datasets.

Tengxue Zhang, Biao Ouyang, Yang Shu, Xinyang Chen, Chenjuan Guo, Bin Yang | 2026-03-10 | 🤖 cs.LG

Bayesian neural networks with interpretable priors from Mercer kernels

This paper introduces "Mercer priors," a new class of interpretable priors for Bayesian neural networks derived from Mercer representations of covariance kernels, which enable the networks to approximate Gaussian process samples and thereby combine the scalability of neural networks with the interpretable uncertainty quantification of Gaussian processes.

Alex Alberts, Ilias Bilionis | 2026-03-10 | 🤖 cs.LG

Continual Low-Rank Adapters for LLM-based Generative Recommender Systems

The paper proposes PESO, a continual learning method for LLM-based recommender systems that utilizes a proximal regularizer to anchor LoRA adapters to their most recent frozen states, thereby effectively balancing adaptation to evolving user preferences with the preservation of recent behavioral patterns.

Hyunsik Yoo, Ting-Wei Li, SeongKu Kang, Zhining Liu, Charlie Xu, Qilin Qi, Hanghang Tong | 2026-03-10 | 🤖 cs.LG

Balancing Interpretability and Performance in Motor Imagery EEG Classification: A Comparative Study of ANFIS-FBCSP-PSO and EEGNet

This study compares a transparent ANFIS-FBCSP-PSO model with the deep-learning benchmark EEGNet on motor imagery EEG data, revealing that the fuzzy-neural approach offers superior within-subject performance and interpretability while EEGNet demonstrates stronger cross-subject generalization, thereby providing practical guidance for selecting BCI systems based on specific design priorities.

Farjana Aktar, Mohd Ruhul Ameen, Akif Islam, Md Ekramul Hamid | 2026-03-10 | 🤖 cs.LG

Towards Efficient Federated Learning of Networked Mixture-of-Experts for Mobile Edge Computing

This paper proposes a Networked Mixture-of-Experts (NMoE) system and a hybrid federated learning framework that enable collaborative inference and efficient, privacy-preserving training of large AI models on resource-constrained mobile edge devices by leveraging neighbor expertise and balancing personalization with generalization.

Song Gao, Songyang Zhang, Shusen Jing, Shuai Zhang, Xiangwei Zhou, Yue Wang, Zhipeng Cai | 2026-03-10 | 🤖 cs.LG

FATE: A Formal Benchmark Series for Frontier Algebra of Multiple Difficulty Levels

The paper introduces FATE, a new formal algebra benchmark series spanning from undergraduate exercises to PhD-level research problems, which reveals that current state-of-the-art LLMs struggle significantly with formalizing advanced mathematical reasoning, achieving near-zero accuracy on the most difficult tasks despite stronger natural-language performance.

Jiedong Jiang, Wanyi He, Yuefeng Wang, Guoxiong Gao, Yongle Hu, Jingting Wang, Nailin Guan, Peihao Wu, Chunbo Dai, Liang Xiao, Bin Dong | 2026-03-10 | 🤖 cs.LG

Jr. AI Scientist and Its Risk Report: Autonomous Scientific Exploration from a Baseline Paper

This paper introduces "Jr. AI Scientist," an autonomous system that mimics a novice researcher's workflow to generate novel, scientifically valuable papers building on real academic works, while simultaneously evaluating its performance through rigorous automated and human assessments to identify both its capabilities and the significant risks and limitations of current AI-driven scientific exploration.

Atsuyuki Miyai, Mashiro Toyooka, Takashi Otonari, Zaiying Zhao, Kiyoharu Aizawa | 2026-03-10 | 🤖 cs.LG

Distributionally Robust Self Paced Curriculum Reinforcement Learning

The paper proposes Distributionally Robust Self-Paced Curriculum Reinforcement Learning (DR-SPCRL), a method that adaptively schedules the robustness budget as a continuous curriculum to overcome the performance-robustness trade-off inherent in fixed-budget approaches, thereby achieving superior stability and an 11.8% improvement in episodic return under perturbations compared to existing strategies.

Anirudh Satheesh, Keenan Powell, Vaneet Aggarwal | 2026-03-10 | 🤖 cs.LG

Adaptive Multi-view Graph Contrastive Learning via Fractional-order Neural Diffusion Networks

This paper introduces an augmentation-free multi-view graph contrastive learning framework that leverages learnable fractional-order neural diffusion networks to automatically generate a continuous spectrum of complementary views by adapting the diffusion scale to the data, thereby outperforming state-of-the-art methods in producing robust and expressive embeddings.

Yanan Zhao, Feng Ji, Jingyang Dai, Jiaze Ma, Keyue Jiang, Kai Zhao, Wee Peng Tay | 2026-03-10 | 🤖 cs.LG

Improving Conditional VAE with Non-Volume Preserving transformations

This paper proposes enhancing Conditional Variational Autoencoders (CVAE) for image generation by treating the decoder's variance as a learnable parameter and employing Non-Volume Preserving (NVP) transformations to better model the conditional latent distribution, thereby significantly improving image quality and diversity compared to existing methods.

Tuhin Subhra De | 2026-03-10 | 🤖 cs.LG

Tight Robustness Certification Through the Convex Hull of $\ell_0$ Attacks

This paper proposes a novel linear bound propagation method that precisely computes bounds over the convex hull of $\ell_0$ perturbations, significantly improving the scalability and tightness of robustness certification for few-pixel attacks compared to existing state-of-the-art verifiers.

Yuval Shapira, Dana Drachsler-Cohen | 2026-03-10 | 🤖 cs.LG

Angular Gradient Sign Method: Uncovering Vulnerabilities in Hyperbolic Networks

This paper introduces the Angular Gradient Sign Method, a novel adversarial attack for hyperbolic networks that leverages the geometric decomposition of gradients to apply perturbations solely along angular (semantic) directions, thereby achieving higher fooling rates and revealing unique vulnerabilities in hierarchical embeddings compared to conventional Euclidean-based methods.

Minsoo Jo, Dongyoon Yang, Taesup Kim | 2026-03-10 | 🤖 cs.LG

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

This paper introduces a realistic probabilistic framework based on the "$(k, \varepsilon)$-unstable" assumption to derive data-informed safety certificates for SmoothLLM, overcoming the limitations of strict theoretical guarantees and providing actionable defense guarantees against diverse jailbreaking attacks.

Adarsh Kumarappan, Ayushi Mehrotra | 2026-03-10 | 🤖 cs.LG
