Sparse Offline Reinforcement Learning with Corruption Robustness
This paper proposes actor-critic methods equipped with sparse robust estimation oracles, achieving the first non-vacuous guarantees for learning near-optimal policies in high-dimensional sparse offline reinforcement learning under strong data corruption and single-policy concentrability. These methods overcome the limitations that traditional Least-Squares Value Iteration (LSVI) approaches face in this regime.
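To make the algorithmic template concrete, below is a minimal sketch of the general pattern the abstract describes: an offline actor-critic loop whose critic is fit by a robust sparse regression oracle. This is an illustrative sketch, not the paper's algorithm. The oracle shown is a generic trimmed iterative-hard-thresholding estimator standing in for whatever sparse robust estimator the paper uses, and all function names, signatures, and feature-input conventions (`robust_sparse_oracle`, `phi_sa`, `phi_next_all`, etc.) are assumptions made for illustration.

```python
import numpy as np

def robust_sparse_oracle(X, y, sparsity, trim_frac=0.1, n_iters=50):
    """Stand-in oracle: trimmed iterative hard thresholding.

    Fits a k-sparse linear model while dropping the rows with the
    largest residuals each round, so a small fraction of corrupted
    samples cannot dominate the fit.
    """
    n, d = X.shape
    keep = n - int(trim_frac * n)
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-8)  # 1 / spectral norm^2
    w = np.zeros(d)
    for _ in range(n_iters):
        r = X @ w - y
        idx = np.argsort(np.abs(r))[:keep]           # trim suspected corruptions
        w -= step * (X[idx].T @ r[idx])              # gradient step on kept rows
        w[np.argsort(np.abs(w))[:-sparsity]] = 0.0   # keep top-k coordinates
    return w

def softmax_policy(theta, phi_all):
    """Linear softmax policy: pi(a|s) proportional to exp(theta . phi(s, a))."""
    logits = phi_all @ theta                         # (n, A)
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def actor_critic(phi_sa, phi_next_all, rewards, sparsity,
                 gamma=0.99, lr=0.1, n_rounds=50):
    """Skeleton offline actor-critic with a robust sparse critic.

    Assumed inputs:
      phi_sa:       (n, d) features of the logged (state, action) pairs
      phi_next_all: (n, A, d) features of every action at the next state
      rewards:      (n,) logged rewards
    """
    n, A, d = phi_next_all.shape
    theta = np.zeros(d)   # actor (policy) parameters
    w = np.zeros(d)       # critic (Q-function) parameters
    for _ in range(n_rounds):
        # Critic: robust sparse fit of Q on one-step TD targets.
        pi_next = softmax_policy(theta, phi_next_all)          # (n, A)
        v_next = ((phi_next_all @ w) * pi_next).sum(axis=1)    # (n,)
        targets = rewards + gamma * v_next
        w = robust_sparse_oracle(phi_sa, targets, sparsity)
        # Actor: natural-policy-gradient-style update; for a linear
        # softmax policy with a linear critic this reduces to theta += lr * w.
        theta += lr * w
    return theta, w
```

The design point the sketch illustrates is the division of labor implied by the abstract: corruption robustness is isolated inside the regression oracle (here via residual trimming plus hard thresholding), while the actor-critic outer loop only ever consumes the oracle's sparse estimate, in contrast to LSVI-style methods that regress without such a robust oracle.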