cs.LG papers | Gist.Science

Hindsight Credit Assignment for Long-Horizon LLM Agents

The paper introduces HCAPO, a novel framework that enhances long-horizon LLM agents by leveraging hindsight reasoning to refine step-level Q-values and employing a multi-scale advantage mechanism to address sparse reward challenges, thereby significantly outperforming state-of-the-art methods like GRPO on benchmarks such as WebShop and ALFWorld.

Hui-Ze Tan, Xiao-Wen Yang, Hao Chen, Jie-Jing Shao, Yi Wen, Yuteng Shen, Weihong Luo, Xiku Du, Lan-Zhe Guo, Yu-Feng Li | 2026-03-11 | cs.AI

Generalized Reduction to the Isotropy for Flexible Equivariant Neural Fields

This paper introduces a principled method to reduce $G$-invariant functions on product spaces $X \times M$ to $H$-invariant functions on $X$ alone, where $H$ is the isotropy subgroup of $M$, thereby enabling flexible Equivariant Neural Fields to handle arbitrary group actions and heterogeneous product spaces without structural constraints.

Alejandro García-Castellanos, Gijs Bellaard, Remco Duits, Daniel Pelt, Erik J Bekkers | 2026-03-11 | cs.AI

On the Formal Limits of Alignment Verification

This paper establishes a fundamental trilemma in AI safety, proving that no verification procedure can simultaneously guarantee soundness, generality, and tractability, thereby demonstrating that formal alignment certification is impossible without relaxing at least one of these critical properties.

Ayushi Agarwal | 2026-03-11 | cs.LG

SPREAD: Subspace Representation Distillation for Lifelong Imitation Learning

The paper introduces SPREAD, a geometry-preserving framework for lifelong imitation learning that utilizes singular value decomposition to align policy representations within low-rank subspaces and a confidence-guided distillation strategy to mitigate catastrophic forgetting while achieving state-of-the-art performance on the LIBERO benchmark.

Kaushik Roy, Giovanni D'urso, Nicholas Lawrance, Brendan Tidd, Peyman Moghadam | 2026-03-11 | cs.LG

Micro-Diffusion Compression -- Binary Tree Tweedie Denoising for Online Probability Estimation

The paper presents Midicoth, a lossless compression system that enhances online probability estimation by applying a lightweight, multi-stage micro-diffusion denoising layer to correct systematic biases in adaptive statistical models through a data-efficient, binary-tree decomposition of byte predictions.

Roberto Tacconelli | 2026-03-11 | cs.LG

Multi-level meta-reinforcement learning with skill-based curriculum

This paper proposes a multi-level meta-reinforcement learning framework that systematically compresses Markov decision processes into hierarchical structures with skill-based curriculum learning to decouple sub-tasks, reduce stochasticity, and enable efficient transfer of skills across different problems and levels.

Sichen Yang (Johns Hopkins University), Mauro Maggioni (Johns Hopkins University) | 2026-03-11 | cs.AI

The Temporal Markov Transition Field

This paper introduces the Temporal Markov Transition Field (TMTF), a novel time series representation that overcomes the limitations of the global Markov Transition Field by partitioning data into temporal chunks to preserve regime-specific dynamics, thereby creating a structured image suitable for convolutional neural networks.

Michael Leznik | 2026-03-11 | cs.LG
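
To make the chunking idea in the TMTF summary above concrete, here is a minimal pure-Python sketch. It assumes quantile binning, first-order transition counts, and an image layout in which pixel (i, j) is read from the transition matrix of the chunk containing time step i. These choices, and every function name, are illustrative assumptions and are not taken from the paper.

```python
from bisect import bisect_right


def quantile_bins(series, n_bins):
    """Map each value to a quantile bin index in 0 .. n_bins - 1."""
    ordered = sorted(series)
    # Interior bin edges taken at evenly spaced ranks of the sorted data.
    edges = [ordered[(k * len(ordered)) // n_bins] for k in range(1, n_bins)]
    return [bisect_right(edges, x) for x in series]


def transition_matrix(states, n_bins):
    """Row-normalized counts of transitions between consecutive states."""
    counts = [[0.0] * n_bins for _ in range(n_bins)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1.0
    for row in counts:
        total = sum(row)
        if total > 0:
            for j in range(n_bins):
                row[j] /= total
    return counts


def temporal_mtf(series, n_bins=4, n_chunks=2):
    """Build an n x n image from one transition matrix per temporal chunk."""
    states = quantile_bins(series, n_bins)
    n = len(states)
    chunk_len = -(-n // n_chunks)  # ceiling division
    mats = [
        transition_matrix(states[c * chunk_len:(c + 1) * chunk_len], n_bins)
        for c in range(n_chunks)
    ]
    # Assumed layout: row i uses the matrix estimated on the chunk holding time i.
    return [
        [mats[i // chunk_len][states[i]][states[j]] for j in range(n)]
        for i in range(n)
    ]


# Two regimes: low values in the first half, high values in the second.
image = temporal_mtf([0, 1, 0, 1, 0, 1, 5, 5, 5, 5, 5, 5], n_bins=2, n_chunks=2)
```

A single global transition matrix would pool both halves of this series, whereas the per-chunk matrices keep each regime's transition probabilities separate.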

SoftJAX & SoftTorch: Empowering Automatic Differentiation Libraries with Informative Gradients

This paper introduces SoftJAX and SoftTorch, open-source libraries that provide feature-complete, drop-in soft relaxations for hard, non-differentiable primitives in JAX and PyTorch, thereby enabling informative gradients for optimization tasks involving operations like thresholding, sorting, and Boolean logic.

Anselm Paulus, A. René Geist, Vít Musil, Sebastian Hoffmann, Onur Beker, Georg Martius | 2026-03-11 | cs.LG
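
The idea of a soft relaxation mentioned in the summary above can be illustrated with thresholding. The sketch below is plain Python and does not use or reproduce the SoftJAX or SoftTorch API; the sigmoid relaxation, the `temperature` parameter, and the function names are assumptions chosen for illustration.

```python
import math


def hard_step(x, t=0.0):
    """Hard threshold: its derivative is zero wherever it is defined."""
    return 1.0 if x > t else 0.0


def soft_step(x, t=0.0, temperature=1.0):
    """Sigmoid relaxation of the threshold; approaches hard_step as temperature -> 0."""
    z = (x - t) / temperature
    # Numerically stable sigmoid that avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_step_grad(x, t=0.0, temperature=1.0):
    """Derivative of soft_step with respect to x; nonzero near the threshold."""
    s = soft_step(x, t, temperature)
    return s * (1.0 - s) / temperature


# The hard step gives no gradient signal at x = 0.2, the soft one does.
value = soft_step(0.2, temperature=0.5)
slope = soft_step_grad(0.2, temperature=0.5)
```

Lower temperatures track the hard threshold more closely but concentrate the gradient in a narrower band around `t`, so the temperature trades fidelity against gradient informativeness.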

Are Expressive Encoders Necessary for Discrete Graph Generation?

This paper introduces GenGNN, a modular message-passing framework that demonstrates expressive neural backbones like transformers are not strictly necessary for discrete graph generation, as diffusion models using GenGNN achieve competitive validity and superior inference speed on various datasets.

Jay Revolinsky, Harry Shomer, Jiliang Tang | 2026-03-11 | cs.AI

MASEval: Extending Multi-Agent Evaluation from Models to Systems

MASEval introduces a framework-agnostic library that shifts multi-agent evaluation from a model-centric to a system-centric approach, demonstrating through extensive experiments that implementation decisions regarding topology and orchestration impact performance as significantly as model selection.

Cornelius Emde, Alexander Rubinstein, Anmol Goel, Ahmed Heakl, Sangdoo Yun, Seong Joon Oh, Martin Gubri | 2026-03-11 | cs.AI

Expressivity-Efficiency Tradeoffs for Hybrid Sequence Models

This paper theoretically and empirically demonstrates that hybrid sequence models, which combine Transformers and state-space models, can provably solve core synthetic tasks with significantly fewer parameters and less memory than non-hybrid models while also achieving superior length generalization and out-of-distribution robustness.

John Cooper, Ilias Diakonikolas, Mingchen Ma, Frederic Sala | 2026-03-11 | cs.LG

APPLV: Adaptive Planner Parameter Learning from Vision-Language-Action Model

This paper proposes APPLV, a novel framework that leverages Vision-Language-Action models to dynamically predict and adapt classical planner parameters, thereby achieving superior navigation performance and generalization in highly constrained environments compared to existing methods.

Yuanjie Lu, Beichen Wang, Zhengqi Wu, Yang Li, Xiaomin Lin, Chengzhi Mao, Xuesu Xiao | 2026-03-11 | cs.LG

Why Channel-Centric Models are not Enough to Predict End-to-End Performance in Private 5G: A Measurement Campaign and Case Study

This paper demonstrates that channel-centric models, including ray-tracing simulators, fail to accurately predict end-to-end throughput in private 5G networks due to systematic over-estimation of MIMO spatial layers, whereas data-driven Gaussian process models trained on direct measurements provide significantly more reliable predictions for communication-aware robot planning.

Nils Jörgensen | 2026-03-11 | cs.LG

A New Modeling to Feature Selection Based on the Fuzzy Rough Set Theory in Normal and Optimistic States on Hybrid Information Systems

This paper introduces FSbuHD, a novel feature selection model for hybrid information systems that addresses the computational and noise limitations of traditional fuzzy rough set theory by reformulating the problem as an optimization task based on combined object distances, demonstrating superior efficiency and effectiveness in both normal and optimistic states across UCI datasets.

Mohammad Hossein Safarpour, Seyed Mohammad Alavi, Mohammad Izadikhah, Hossein Dibachi | 2026-03-11 | cs.AI

Cross-Domain Uncertainty Quantification for Selective Prediction: A Comprehensive Bound Ablation with Transfer-Informed Betting

This paper introduces Transfer-Informed Betting (TIB), a novel method that combines betting-based confidence sequences with cross-domain transfer learning to achieve tighter, data-efficient risk guarantees for selective prediction, demonstrating significant coverage improvements over existing bounds across multiple benchmarks and applications.

Abhinaba Basu | 2026-03-11 | cs.AI

FedLECC: Cluster- and Loss-Guided Client Selection for Federated Learning under Non-IID Data

FedLECC is a lightweight client selection strategy for federated learning under non-IID data that groups clients by label-distribution similarity and prioritizes those with higher local loss, thereby significantly improving test accuracy while reducing communication rounds and overhead.

Daniel M. Jimenez-Gutierrez, Giovanni Giunta, Mehrdad Hassanzadeh, Aris Anagnostopoulos, Ioannis Chatzigiannakis, Andrea Vitaletti | 2026-03-11 | cs.AI
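
The two ingredients named in the FedLECC summary above, grouping clients by label-distribution similarity and favoring clients with higher local loss, can be sketched as follows. The greedy L1-distance clustering, the `threshold` and `per_cluster` parameters, and all function names are hypothetical simplifications; the paper's actual clustering and selection rules may differ.

```python
def l1_distance(p, q):
    """L1 distance between two label distributions of equal length."""
    return sum(abs(a - b) for a, b in zip(p, q))


def cluster_clients(label_dists, threshold=0.5):
    """Greedy clustering: join the first cluster whose first member is close enough."""
    clusters = []  # each cluster is a list of client ids
    for cid, dist in label_dists.items():
        for members in clusters:
            if l1_distance(dist, label_dists[members[0]]) <= threshold:
                members.append(cid)
                break
        else:
            # No existing cluster is within the threshold: start a new one.
            clusters.append([cid])
    return clusters


def select_clients(label_dists, local_loss, per_cluster=1, threshold=0.5):
    """Pick the highest-loss clients from each label-distribution cluster."""
    selected = []
    for members in cluster_clients(label_dists, threshold):
        ranked = sorted(members, key=lambda c: local_loss[c], reverse=True)
        selected.extend(ranked[:per_cluster])
    return selected


# Four clients: "a" and "b" are skewed toward class 0, "c" and "d" toward class 1.
label_dists = {"a": [0.9, 0.1], "b": [0.8, 0.2], "c": [0.1, 0.9], "d": [0.2, 0.8]}
local_loss = {"a": 0.3, "b": 0.7, "c": 1.2, "d": 0.4}
chosen = select_clients(label_dists, local_loss)
```

Selecting per cluster keeps each label regime represented in a round, while ranking by loss within a cluster directs updates toward the clients the current model fits worst.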

Quantifying Memorization and Privacy Risks in Genomic Language Models

This paper introduces a comprehensive multi-vector privacy evaluation framework that quantifies memorization risks in Genomic Language Models by integrating perplexity-based detection, canary sequence extraction, and membership inference, revealing that these models exhibit measurable data leakage dependent on architecture and training dynamics.

Alexander Nemecek, Wenbiao Li, Xiaoqian Jiang, Jaideep Vaidya, Erman Ayday | 2026-03-11 | cs.LG

Uncovering a Winning Lottery Ticket with Continuously Relaxed Bernoulli Gates

This paper introduces a fully differentiable approach to discovering Strong Lottery Tickets by employing continuously relaxed Bernoulli gates to optimize sparsity via gradient descent on frozen weights, achieving significantly higher sparsity with minimal accuracy loss compared to existing non-differentiable methods like edge-popup.

Itamar Tsayag, Ofir Lindenbaum | 2026-03-11 | cs.AI

Vision-Language Models Encode Clinical Guidelines for Concept-Based Medical Reasoning

The paper introduces MedCBR, a novel framework that integrates clinical guidelines with vision-language models to enhance the interpretability and accuracy of medical image diagnosis by transforming visual features into guideline-conformant concepts and structured clinical narratives.

Mohamed Harmanani, Bining Long, Zhuoxin Guo, Paul F. R. Wilson, Amirhossein Sabour, Minh Nguyen Nhat To, Gabor Fichtinger, Purang Abolmaesumi, Parvin Mousavi | 2026-03-11 | cs.LG

Optimizing Reinforcement Learning Training over Digital Twin Enabled Multi-fidelity Networks

This paper proposes a hierarchical reinforcement learning framework that jointly optimizes antenna tilt angles and the data collection ratio between a physical network and its digital twin to maximize user data rates while minimizing communication overhead and delay.

Hanzhi Yu, Hasan Farooq, Julien Forgeat, Shruti Bothe, Kristijonas Cyras, Md Moin Uddin Chowdhury, Mingzhe Chen | 2026-03-11 | cs.LG
