cs.LG papers | Gist.Science

Thickening-to-Thinning: Reward Shaping via Human-Inspired Learning Dynamics for LLM Reasoning

This paper introduces T2T (Thickening-to-Thinning), a dynamic reward shaping framework inspired by human learning dynamics that enhances LLM reasoning by encouraging longer, exploratory trajectories on incorrect attempts and penalizing length upon success, thereby outperforming standard baselines on mathematical benchmarks.

Wenze Lin, Zhen Yang, Xitai Jiang, Pony Ma, Gao Huang | 2026-03-10 | cs.LG

Inference-Time Backdoors via Hidden Instructions in LLM Chat Templates

This paper introduces a novel inference-time backdoor attack that exploits maliciously modified chat templates to compromise open-weight language models without altering weights or training data, demonstrating high success rates in degrading factual accuracy and inducing harmful outputs while evading current security scans.

Ariel Fogel, Omer Hofman, Eilon Cohen, Roman Vainshtein | 2026-03-10 | cs.LG

Hinge Regression Tree: A Newton Method for Oblique Regression Tree Splitting

This paper introduces the Hinge Regression Tree (HRT), a novel oblique decision tree method that reframes split learning as a non-linear least-squares problem solvable via a damped Newton method, offering provable convergence, universal approximation capabilities, and superior performance with compact structures compared to existing baselines.

Hongyi Li, Han Lin, Jun Xu | 2026-03-10 | cs.LG

Radial Müntz-Szász Networks: Neural Architectures with Learnable Power Bases for Multidimensional Singularities

This paper introduces Radial Müntz-Szász Networks (RMN), a highly parameter-efficient neural architecture that utilizes learnable radial power bases and a log-primitive to accurately model multidimensional singular fields like $1/r$ and $\log r$, achieving significantly lower error rates than standard MLPs and SIREN on benchmark tasks while providing closed-form gradients for physics-informed learning.

Gnankan Landry Regis N'guessan, Bum Jun Kim | 2026-03-10 | cs.LG

SDFed: Bridging Local Global Discrepancy via Subspace Refinement and Divergence Control in Federated Prompt Learning

SDFed is a heterogeneous federated prompt learning framework that addresses local-global discrepancies by combining a fixed-length global prompt with variable-length local prompts, enhanced by subspace refinement and divergence control strategies to improve performance and robustness in privacy-sensitive, resource-constrained multi-party settings.

Yicheng Di, Wei Yuan, Tieke He, Yuan Liu, Hongzhi Yin | 2026-03-10 | cs.LG

Retrieval Pivot Attacks in Hybrid RAG: Measuring and Mitigating Amplified Leakage from Vector Seeds to Graph Expansion

This paper identifies and formalizes "Retrieval Pivot Attacks" in Hybrid RAG systems, demonstrating how vector-retrieved seeds can inadvertently pivot through knowledge graph links to cause cross-tenant data leakage, and proves that enforcing authorization specifically at the graph expansion boundary effectively mitigates this risk with minimal overhead.

Scott Thornton | 2026-03-10 | cs.LG

Diffusion-Guided Pretraining for Brain Graph Foundation Models

This paper proposes a unified diffusion-guided pretraining framework for brain graph foundation models that overcomes the limitations of existing methods by using diffusion to preserve semantic connectivity patterns during augmentation and to enable topology-aware global reconstruction, thereby achieving robust and transferable representations across diverse neuroimaging datasets.

Xinxu Wei, Rong Zhou, Lifang He, Yu Zhang | 2026-03-10 | cs.LG

Learning Page Order in Shuffled WOO Releases

This paper investigates document page reordering in heterogeneous Dutch freedom of information releases, identifying that while specialized models achieve high accuracy on short documents, seq2seq transformers fail to generalize to longer texts due to fundamental differences in required ordering strategies, a challenge effectively addressed through model specialization rather than curriculum learning.

Efe Kahraman, Giulio Tosato | 2026-03-10 | cs.LG

Discovering Semantic Latent Structures in Psychological Scales: A Response-Free Pathway to Efficient Simplification

This paper introduces a response-free framework that leverages natural language processing and topic modeling to automatically simplify psychological scales by identifying semantic latent structures, achieving an average 60.5% reduction in item count while preserving psychometric validity and construct alignment.

Bo Wang, Yuxuan Zhang, Yueqin Hu, Hanchao Hou, Kaiping Peng, Shiguang Ni | 2026-03-10 | cs.LG

TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers

TrasMuon is a novel optimizer that enhances the stability and convergence of Muon-style methods by integrating global RMS calibration and energy-based trust-region clipping to preserve near-isometric geometry while mitigating sensitivity to step-size hyperparameters and high-energy outliers.

Peng Cheng, Jiucheng Zang, Qingnan Li, Liheng Ma, Yufei Cui, Yingxue Zhang, Boxing Chen, Ming Jian, Wen Tong | 2026-03-10 | cs.LG

Benchmark Leakage Trap: Can We Trust LLM-based Recommendation?

This paper identifies and validates the critical issue of benchmark data leakage in LLM-based recommendation systems, demonstrating that exposure to evaluation data during training can artificially inflate performance metrics for domain-relevant leaks while degrading accuracy for irrelevant ones, thereby undermining the reliability of current evaluation practices.

Mingqiao Zhang, Qiyao Peng, Yumeng Wang, Chunyuan Liu, Hongtao Liu | 2026-03-10 | cs.LG

Mean Flow Policy with Instantaneous Velocity Constraint for One-step Action Generation

This paper introduces the Mean Velocity Policy (MVP), a novel one-step generative policy that employs an Instantaneous Velocity Constraint (IVC) to theoretically guarantee high expressiveness while achieving state-of-the-art performance and significantly faster training and inference speeds on challenging robotic manipulation tasks compared to existing flow-based baselines.

Guojian Zhan, Letian Tao, Pengcheng Wang, Yixiao Wang, Yiheng Li, Yuxin Chen, Hongyang Li, Masayoshi Tomizuka, Shengbo Eben Li | 2026-03-10 | cs.LG

Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference

The paper introduces Pawsterior, a variational flow-matching framework that enhances simulation-based inference by incorporating geometric confinement for structured domains and enabling the handling of discrete latent structures, thereby improving posterior fidelity and expanding applicability to complex physical systems.

Jorge Carrasco-Pollo, Floor Eijkelboom, Jan-Willem van de Meent | 2026-03-10 | cs.LG

Why Code, Why Now: Learnability, Computability, and the Real Limits of Machine Learning

This paper argues that the superior progress of code generation over reinforcement learning stems from code's dense, verifiable feedback structure, proposing a five-level hierarchy of learnability to demonstrate that the fundamental ceiling of machine learning progress depends more on a task's inherent learnability and information structure than on model scaling alone.

Zhimin Zhao | 2026-03-10 | cs.LG

LongAudio-RAG: Event-Grounded Question Answering over Multi-Hour Long Audio

LongAudio-RAG is a hybrid edge-cloud framework that enables precise, low-hallucination question answering over multi-hour audio streams by converting recordings into timestamped event records for SQL-based retrieval, which then grounds Large Language Model responses in structured evidence rather than raw audio.

Naveen Vakada, Kartik Hegde, Arvind Krishna Sridhar, Yinyi Guo, Erik Visser | 2026-03-10 | cs.LG

Accelerated Predictive Coding Networks via Direct Kolen-Pollack Feedback Alignment

This paper introduces Direct Kolen-Pollack Predictive Coding (DKP-PC), a novel algorithm that enhances the efficiency and scalability of biologically inspired predictive coding by establishing direct learnable feedback connections from the output to all hidden layers, thereby reducing error propagation time complexity from $O(L)$ to $O(1)$ while mitigating vanishing updates and maintaining local learning.

Davide Casnici, Martin Lefebvre, Justin Dauwels, Charlotte Frenkel | 2026-03-10 | cs.LG

On the Power of Source Screening for Learning Shared Feature Extractors

This paper demonstrates that strategically screening and training on a carefully selected subset of high-quality, relevant data sources is sufficient to achieve statistically optimal shared feature extraction, even when discarding a substantial portion of available data.

Leo Muxing Wang, Connor Mclaughlin, Lili Su | 2026-03-10 | cs.LG

Emotion Collider: Dual Hyperbolic Mirror Manifolds for Sentiment Recovery via Anti Emotion Reflection

The paper introduces Emotion Collider (EC-Net), a hyperbolic hypergraph framework that leverages Poincaré-ball embeddings, bidirectional message passing, and contrastive learning to achieve robust and noise-resilient multimodal sentiment analysis by preserving high-order semantic relations and enhancing class separation.

Rong Fu, Ziming Wang, Shuo Yin, Haiyun Wei, Kun Liu, Xianda Li, Zeli Su, Simon Fong | 2026-03-10 | cs.LG

ModalImmune: Immunity Driven Unlearning via Self Destructive Training

ModalImmune is a training framework that enhances the robustness of multimodal systems against input channel loss by intentionally collapsing selected modality information during training through a combination of adaptive regularization, targeted intervention, and certified meta-parameter adaptation.

Rong Fu, Jia Yee Tan, Zijian Zhang, Ziming Wang, Zhaolu Kang, Muge Qi, Shuning Zhang, Simon Fong | 2026-03-10 | cs.LG

Whole-Brain Connectomic Graph Model Enables Whole-Body Locomotion Control in Fruit Fly

This paper introduces FlyGM, a whole-brain connectomic graph model that leverages the exact static neural architecture of an adult fruit fly to achieve stable, sample-efficient whole-body locomotion control in embodied reinforcement learning without task-specific tuning.

Zehao Jin, Yaoye Zhu, Chen Zhang, Yanan Sui | 2026-03-10 | cs.LG
