cs.LG papers | Gist.Science

Mem-T: Densifying Rewards for Long-Horizon Memory Agents

Mem-T introduces an autonomous memory agent trained via the MoT-GRPO framework, which densifies sparse long-horizon rewards through tree-guided backpropagation to achieve superior performance and efficiency compared to existing memory management systems.

Yanwei Yue, Boci Peng, Xuanbo Fan, Jiaxin Guo, Qiankun Li, Yan Zhang | 2026-03-10 | cs.LG

Bitcoin Price Prediction using Machine Learning and Combinatorial Fusion Analysis

This paper proposes a Bitcoin price prediction model that uses Combinatorial Fusion Analysis (CFA) to integrate diverse machine learning models via rank-score characteristics and weighted combinations, achieving a Mean Absolute Percentage Error (MAPE) of 0.19%, lower than that of the individual models and existing prediction methods.

Yuanhong Wu, Wei Ye, Jingyan Xu, D. Frank Hsu | 2026-03-10 | cs.LG

In-Run Data Shapley for Adam Optimizer

This paper introduces Adam-Aware In-Run Data Shapley, a novel method that overcomes the limitations of SGD-based attribution in adaptive optimizers by deriving a closed-form approximation and a Linearized Ghost Approximation to achieve near-perfect fidelity in data contribution estimation while maintaining high training efficiency.

Meng Ding, Zeqing Zhang, Di Wang, Lijie Hu | 2026-03-10 | cs.LG

Do Schwartz Higher-Order Values Help Sentence-Level Human Value Detection? A Study of Hierarchical Gating and Calibration

This paper investigates whether Schwartz higher-order values improve sentence-level human value detection, finding that while hierarchical gating offers limited benefits, calibration techniques and hybrid ensembles significantly boost performance, suggesting the value hierarchy is more effective as an inductive bias than a rigid routing mechanism.

Víctor Yeste, Paolo Rosso | 2026-03-10 | cs.LG

LatentMem: Customizing Latent Memory for Multi-Agent Systems

This paper introduces LatentMem, a learnable multi-agent memory framework that addresses memory homogenization and information overload by using an experience bank and a memory composer to generate customized, token-efficient latent memories, further optimized via Latent Memory Policy Optimization (LMPO) to significantly enhance multi-agent system performance.

Muxin Fu, Xiangyuan Xue, Yafu Li, Zefeng He, Siyuan Huang, Xiaoye Qu, Yu Cheng, Yang Yang | 2026-03-10 | cs.LG

Thickening-to-Thinning: Reward Shaping via Human-Inspired Learning Dynamics for LLM Reasoning

This paper introduces T2T (Thickening-to-Thinning), a dynamic reward shaping framework inspired by human learning dynamics that enhances LLM reasoning by encouraging longer, exploratory trajectories on incorrect attempts and penalizing length upon success, thereby outperforming standard baselines on mathematical benchmarks.

Wenze Lin, Zhen Yang, Xitai Jiang, Pony Ma, Gao Huang | 2026-03-10 | cs.LG

Inference-Time Backdoors via Hidden Instructions in LLM Chat Templates

This paper introduces a novel inference-time backdoor attack that exploits maliciously modified chat templates to compromise open-weight language models without altering weights or training data, demonstrating high success rates in degrading factual accuracy and inducing harmful outputs while evading current security scans.

Ariel Fogel, Omer Hofman, Eilon Cohen, Roman Vainshtein | 2026-03-10 | cs.LG

Hinge Regression Tree: A Newton Method for Oblique Regression Tree Splitting

This paper introduces the Hinge Regression Tree (HRT), a novel oblique decision tree method that reframes split learning as a non-linear least-squares problem solvable via a damped Newton method, offering provable convergence, universal approximation capabilities, and superior performance with compact structures compared to existing baselines.

Hongyi Li, Han Lin, Jun Xu | 2026-03-10 | cs.LG

Radial Müntz-Szász Networks: Neural Architectures with Learnable Power Bases for Multidimensional Singularities

This paper introduces Radial Müntz-Szász Networks (RMN), a highly parameter-efficient neural architecture that utilizes learnable radial power bases and a log-primitive to accurately model multidimensional singular fields like $1/r$ and $\log r$, achieving significantly lower error rates than standard MLPs and SIREN on benchmark tasks while providing closed-form gradients for physics-informed learning.

Gnankan Landry Regis N'guessan, Bum Jun Kim | 2026-03-10 | cs.LG

SDFed: Bridging Local Global Discrepancy via Subspace Refinement and Divergence Control in Federated Prompt Learning

SDFed is a heterogeneous federated prompt learning framework that addresses local-global discrepancies by combining a fixed-length global prompt with variable-length local prompts, enhanced by subspace refinement and divergence control strategies to improve performance and robustness in privacy-sensitive, resource-constrained multi-party settings.

Yicheng Di, Wei Yuan, Tieke He, Yuan Liu, Hongzhi Yin | 2026-03-10 | cs.LG

Retrieval Pivot Attacks in Hybrid RAG: Measuring and Mitigating Amplified Leakage from Vector Seeds to Graph Expansion

This paper identifies and formalizes "Retrieval Pivot Attacks" in Hybrid RAG systems, demonstrating how vector-retrieved seeds can inadvertently pivot through knowledge graph links to cause cross-tenant data leakage, and proves that enforcing authorization specifically at the graph expansion boundary effectively mitigates this risk with minimal overhead.

Scott Thornton | 2026-03-10 | cs.LG

Diffusion-Guided Pretraining for Brain Graph Foundation Models

This paper proposes a unified diffusion-guided pretraining framework for brain graph foundation models that overcomes the limitations of existing methods by using diffusion to preserve semantic connectivity patterns during augmentation and to enable topology-aware global reconstruction, thereby achieving robust and transferable representations across diverse neuroimaging datasets.

Xinxu Wei, Rong Zhou, Lifang He, Yu Zhang | 2026-03-10 | cs.LG

Learning Page Order in Shuffled WOO Releases

This paper investigates page reordering in heterogeneous Dutch freedom of information releases. It finds that specialized models achieve high accuracy on short documents, whereas seq2seq transformers fail to generalize to longer ones because short and long documents require fundamentally different ordering strategies, a challenge effectively addressed through model specialization rather than curriculum learning.

Efe Kahraman, Giulio Tosato | 2026-03-10 | cs.LG

Discovering Semantic Latent Structures in Psychological Scales: A Response-Free Pathway to Efficient Simplification

This paper introduces a response-free framework that leverages natural language processing and topic modeling to automatically simplify psychological scales by identifying semantic latent structures, achieving an average 60.5% reduction in item count while preserving psychometric validity and construct alignment.

Bo Wang, Yuxuan Zhang, Yueqin Hu, Hanchao Hou, Kaiping Peng, Shiguang Ni | 2026-03-10 | cs.LG

TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers

TrasMuon is a novel optimizer that enhances the stability and convergence of Muon-style methods by integrating global RMS calibration and energy-based trust-region clipping to preserve near-isometric geometry while mitigating sensitivity to step-size hyperparameters and high-energy outliers.

Peng Cheng, Jiucheng Zang, Qingnan Li, Liheng Ma, Yufei Cui, Yingxue Zhang, Boxing Chen, Ming Jian, Wen Tong | 2026-03-10 | cs.LG

Benchmark Leakage Trap: Can We Trust LLM-based Recommendation?

This paper identifies and validates the critical issue of benchmark data leakage in LLM-based recommendation systems, demonstrating that exposure to evaluation data during training can artificially inflate performance metrics for domain-relevant leaks while degrading accuracy for irrelevant ones, thereby undermining the reliability of current evaluation practices.

Mingqiao Zhang, Qiyao Peng, Yumeng Wang, Chunyuan Liu, Hongtao Liu | 2026-03-10 | cs.LG

Mean Flow Policy with Instantaneous Velocity Constraint for One-step Action Generation

This paper introduces the Mean Velocity Policy (MVP), a novel one-step generative policy that employs an Instantaneous Velocity Constraint (IVC) to theoretically guarantee high expressiveness while achieving state-of-the-art performance and significantly faster training and inference speeds on challenging robotic manipulation tasks compared to existing flow-based baselines.

Guojian Zhan, Letian Tao, Pengcheng Wang, Yixiao Wang, Yiheng Li, Yuxin Chen, Hongyang Li, Masayoshi Tomizuka, Shengbo Eben Li | 2026-03-10 | cs.LG

Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference

The paper introduces Pawsterior, a variational flow-matching framework that enhances simulation-based inference by incorporating geometric confinement for structured domains and enabling the handling of discrete latent structures, thereby improving posterior fidelity and expanding applicability to complex physical systems.

Jorge Carrasco-Pollo, Floor Eijkelboom, Jan-Willem van de Meent | 2026-03-10 | cs.LG

Why Code, Why Now: Learnability, Computability, and the Real Limits of Machine Learning

This paper argues that the superior progress of code generation over reinforcement learning stems from code's dense, verifiable feedback structure, proposing a five-level hierarchy of learnability to demonstrate that the fundamental ceiling of machine learning progress depends more on a task's inherent learnability and information structure than on model scaling alone.

Zhimin Zhao | 2026-03-10 | cs.LG

LongAudio-RAG: Event-Grounded Question Answering over Multi-Hour Long Audio

LongAudio-RAG is a hybrid edge-cloud framework that enables precise, low-hallucination question answering over multi-hour audio streams by converting recordings into timestamped event records for SQL-based retrieval, which then grounds Large Language Model responses in structured evidence rather than raw audio.

Naveen Vakada, Kartik Hegde, Arvind Krishna Sridhar, Yinyi Guo, Erik Visser | 2026-03-10 | cs.LG
