CLAD-Net: Continual Activity Recognition in Multi-Sensor Wearable Systems

CLAD-Net is a continual learning framework for wearable human activity recognition that combines a self-supervised transformer acting as long-term memory with a supervised CNN trained via knowledge distillation, mitigating catastrophic forgetting and handling label scarcity across diverse subjects.

Reza Rahimi Azghan, Gautham Krishna Gudur, Mohit Malu, Edison Thomaz, Giulia Pedrielli, Pavan Turaga, Hassan Ghasemzadeh · 2026-03-10 · cs.LG
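The knowledge-distillation component mentioned above can be illustrated with a standard temperature-scaled distillation objective. This is a generic sketch, not CLAD-Net's actual loss: the function name, temperature `T`, and mixing weight `alpha` are illustrative assumptions.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled, numerically stable softmax.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Generic KD objective: soft-target cross-entropy plus hard-label cross-entropy."""
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    # Soft term, rescaled by T^2 as in standard distillation.
    soft = -(p_teacher * log_p_student).sum(axis=-1).mean() * T * T
    # Hard term: ordinary cross-entropy against ground-truth labels.
    hard = -np.log(softmax(student_logits)[np.arange(len(labels)), labels]).mean()
    return alpha * soft + (1 - alpha) * hard
```

In a continual setup, the teacher logits would come from a frozen copy of the earlier model, so the student is pulled toward both the new labels and the old model's predictions.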

Generative Evolutionary Meta-Solver (GEMS): Scalable Surrogate-Free Multi-Agent Reinforcement Learning

The paper introduces Generative Evolutionary Meta-Solver (GEMS), a scalable, surrogate-free multi-agent reinforcement learning framework that replaces explicit policy populations with a compact generator and latent anchors to achieve significantly faster training, lower memory usage, and higher rewards than traditional methods like PSRO while maintaining game-theoretic guarantees.

Alakh Sharma, Gaurish Trivedi, Kartikey Singh Bhandari, Yash Sinha, Dhruv Kumar, Pratik Narang, Jagat Sesh Challa · 2026-03-10 · cs.LG

Overlap-Adaptive Regularization for Conditional Average Treatment Effect Estimation

This paper introduces Overlap-Adaptive Regularization (OAR), a method that improves existing CATE meta-learners in low-overlap regions by strengthening regularization where overlap weights are small, and offers flexible, debiased variants that preserve Neyman-orthogonality for robust inference.

Valentyn Melnychuk, Dennis Frauen, Jonas Schweisthal, Stefan Feuerriegel · 2026-03-10 · cs.LG
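The idea of scaling regularization by overlap can be sketched with a toy penalty. This is one illustrative reading, not the paper's actual estimator: the overlap weight `e(x)(1 - e(x))`, the clipping floor, and the L2 form are all assumptions.

```python
import numpy as np

def overlap_adaptive_penalty(propensity, coef, base_lam=0.1):
    """Toy overlap-adaptive L2 penalty: regularize harder where overlap is low.

    propensity: estimated treatment probabilities e(x), shape (n,)
    coef: model coefficients to penalize, shape (d,)
    """
    overlap = propensity * (1.0 - propensity)        # overlap weight, in (0, 0.25]
    lam = base_lam / np.clip(overlap, 1e-3, None)    # inflate lambda in low-overlap regions
    return lam.mean() * np.sum(coef ** 2)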

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents

This paper introduces and empirically validates the concept of "misevolution": self-evolving LLM agents face widespread, emergent risks across model, memory, tool, and workflow pathways that can degrade safety and introduce unintended vulnerabilities, underscoring the need for new safety paradigms.

Shuai Shao, Qihan Ren, Chen Qian, Boyi Wei, Dadi Guo, Jingyi Yang, Xinhao Song, Linfeng Zhang, Weinan Zhang, Dongrui Liu, Jing Shao · 2026-03-10 · cs.LG

Pretraining in Actor-Critic Reinforcement Learning for Robot Locomotion

This paper proposes a pretraining-finetuning paradigm for robot locomotion that leverages a task-agnostic exploration strategy to train a Proprioceptive Inverse Dynamics Model (PIDM), which is then used to warm-start actor-critic algorithms like PPO, resulting in significant improvements in sample efficiency and task performance across diverse robot embodiments.

Jiale Fan, Andrei Cramariuc, Tifanny Portela, Marco Hutter · 2026-03-10 · cs.LG
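The warm-start step in a pretraining-finetuning pipeline like the one summarized above amounts to copying pretrained weights into a matching subset of the downstream network. This is a generic sketch with illustrative names (`warm_start`, the `"encoder."` prefix), not the paper's PIDM-to-PPO transfer code.

```python
import numpy as np

def warm_start(policy_params, pretrained_params, prefix="encoder."):
    """Copy pretrained weights into the policy wherever names and shapes match.

    policy_params: dict of name -> np.ndarray for the freshly initialized policy
    pretrained_params: dict of name -> np.ndarray from the pretrained model
    """
    out = dict(policy_params)
    for name, w in pretrained_params.items():
        key = prefix + name
        # Only transfer layers that exist in the policy with identical shape;
        # everything else (e.g. the actor head) keeps its random init.
        if key in out and out[key].shape == w.shape:
            out[key] = w.copy()
    return out
```

In the actor-critic setting, only the shared proprioceptive encoder would be transferred this way, while the policy and value heads are trained from scratch during finetuning.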