cs.LG papers | Gist.Science

XInsight: Integrative Stage-Consistent Psychological Counseling Support Agents for Digital Well-Being

This paper introduces XInsight, a multi-agent framework that aligns psychological support with the Exploration-Insight-Action paradigm through a structured Reason-Intervene-Reflect cycle to enhance interpretability and therapeutic effectiveness in digital well-being applications, accompanied by the XInsight-Bench evaluation protocol.

Fei Wang, Jiangnan Yang, Junjie Chen, Yuxin Liu, Kun Li, Yanyan Wei, Dan Guo, Meng Wang | 2026-03-10 | cs.LG

vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM

This paper introduces vLLM Hook, an open-source plug-in that enables programmable access to, and manipulation of, internal model states within the vLLM inference engine to support advanced test-time alignment techniques such as adversarial prompt detection, enhanced RAG, and activation steering.

Ching-Yun Ko, Pin-Yu Chen | 2026-03-10 | cs.LG

Isotonic Layer: A Universal Framework for Generic Recommendation Debiasing

This paper introduces the Isotonic Layer, a novel differentiable framework that integrates piecewise linear fitting and learnable embeddings into neural architectures to enforce global monotonicity, thereby enabling granular, context-aware debiasing and improved calibration for large-scale recommendation systems.

Hailing Cheng, Yafang Yang, Hemeng Tao, Fengyu Zhang | 2026-03-10 | cs.LG

How Attention Sinks Emerge in Large Language Models: An Interpretability Perspective

This paper identifies a simple, semantics-free "P0 Sink Circuit" that emerges early in training to explain how Large Language Models develop attention sinks on the first token, suggesting this mechanism could serve as a signal for tracking pre-training convergence.

Runyu Peng, Ruixiao Li, Mingshu Chen, Yunhua Zhou, Qipeng Guo, Xipeng Qiu | 2026-03-10 | cs.LG

Hierarchical Latent Structures in Data Generation Process Unify Mechanistic Phenomena across Scale

This paper demonstrates that hierarchical structures within the data generation process, modeled via probabilistic context-free grammars, serve as a unifying explanation for the emergence of diverse mechanistic phenomena like induction heads, function vectors, and the Hydra effect in Transformer-based language models.

Jonas Rohweder, Subhabrata Dutta, Iryna Gurevych | 2026-03-10 | cs.LG

Hierarchical Embedding Fusion for Retrieval-Augmented Code Generation

This paper introduces Hierarchical Embedding Fusion (HEF), a two-stage framework that compresses repository code into a reusable hierarchy of dense vectors and maps them to learned pseudo-tokens, enabling low-latency, repository-aware code generation with accuracy comparable to traditional retrieval methods while significantly reducing inference costs.

Nikita Sorokin, Ivan Sedykh, Valentin Malykh | 2026-03-10 | cs.LG

FuzzingRL: Reinforcement Fuzz-Testing for Revealing VLM Failures

The paper introduces FuzzingRL, a framework that combines vision-language fuzzing with adversarial reinforcement fine-tuning to automatically generate diverse, challenging queries that systematically expose and degrade the performance of Vision Language Models.

Jiajun Xu, Jiageng Mao, Ang Qi, Weiduo Yuan, Alexander Romanus, Helen Xia, Vitor Campagnolo Guizilini, Yue Wang | 2026-03-10 | cs.LG

Switchable Activation Networks

This paper introduces Switchable Activation Networks (SWAN), a framework that equips neural units with input-dependent binary gates to dynamically allocate computation and learn structured activation patterns, thereby unifying sparsity, pruning, and adaptive inference to achieve efficient, accurate, and context-aware deep learning models.

Laha Ale, Ning Zhang, Scott A. King, Pingzhi Fan | 2026-03-10 | cs.LG

Khatri-Rao Clustering for Data Summarization

This paper introduces the Khatri-Rao clustering paradigm, which extends traditional centroid-based methods like k-Means and deep clustering by modeling centroids as interactions of multiple succinct protocentroids, thereby achieving more compact and accurate data summaries with reduced redundancy.

Martino Ciaperoni, Collin Leiber, Aristides Gionis, Heikki Mannila | 2026-03-10 | cs.LG

Scale Dependent Data Duplication

This paper demonstrates that data duplication is scale-dependent: as model capability and corpus size increase, semantically equivalent documents behave like exact duplicates, producing aligned gradients and accelerating semantic collisions. The result is rapidly increasing training loss for larger models, which necessitates new scaling laws to predict performance accurately.

Joshua Kazdan, Noam Levi, Rylan Schaeffer, Jessica Chudnovsky, Abhay Puri, Bo He, Mehmet Donmez, Sanmi Koyejo, David Donoho | 2026-03-10 | cs.LG

Know When You're Wrong: Aligning Confidence with Correctness for LLM Error Detection

This paper introduces a normalized confidence scoring framework based on output anchor tokens to detect LLM errors without external validation, revealing that while supervised fine-tuning yields well-calibrated confidence, reinforcement learning methods induce overconfidence, and proposing post-RL self-distillation to restore reliability for applications like adaptive retrieval-augmented generation.

Xie Xiaohu, Liu Xiaohu, Yao Benjamin | 2026-03-10 | cs.LG

Structure-Aware Set Transformers: Temporal and Variable-Type Attention Biases for Asynchronous Clinical Time Series

The paper introduces Structure-Aware Set Transformers (STAR), a novel architecture that enhances asynchronous clinical time series modeling by integrating parameter-efficient soft attention biases for temporal locality and variable-type affinity, thereby outperforming existing grid-based and set-based baselines on ICU prediction tasks while providing interpretable insights into temporal and variable interactions.

Joohyung Lee, Kwanhyung Lee, Changhun Kim, Eunho Yang | 2026-03-10 | cs.LG

LegoNet: Memory Footprint Reduction Through Block Weight Clustering

LegoNet is a post-training compression technique that clusters 4x4 weight blocks across entire neural network architectures to achieve memory footprint reductions of up to 128x with negligible accuracy loss, without requiring any retraining or architectural modifications.

Joseph Bingham, Noah Green, Saman Zonouz | 2026-03-10 | cs.LG

Multi-Agent DRL for V2X Resource Allocation: Disentangling Challenges and Benchmarking Solutions

This paper addresses the lack of systematic evaluation in Multi-Agent Deep Reinforcement Learning for C-V2X resource allocation by introducing a disentangled benchmark suite of interference games and diverse datasets to isolate specific challenges, ultimately identifying policy robustness and generalization across vehicular topologies as the primary hurdle and demonstrating the superiority of actor-critic methods over value-based approaches.

Siyuan Wang, Lei Lei, Pranav Maheshwari, Sam Bellefeuille, Kan Zheng, Dusit Niyato | 2026-03-10 | cs.LG

Scaling Strategy, Not Compute: A Stand-Alone, Open-Source StarCraft II Benchmark for Accessible Reinforcement Learning Research

To address the complexity gap between StarCraft II's full game and its mini-games, this paper introduces the Two-Bridge Map Suite, an open-source, lightweight benchmark that isolates tactical navigation and combat skills to enable accessible reinforcement learning research under realistic compute budgets.

Sourav Panda, Shreyash Kale, Tanmay Ambadkar, Abhinav Verma, Jonathan Dodge | 2026-03-10 | cs.LG

Valid Feature-Level Inference for Tabular Foundation Models via the Conditional Randomization Test

This paper introduces a practical method for obtaining finite-sample valid p-values for feature-level hypothesis testing in tabular data by combining the Conditional Randomization Test with the TabPFN foundation model, enabling statistical inference without model retraining or parametric assumptions.

Mohamed Salem | 2026-03-10 | cs.LG
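
For readers unfamiliar with the Conditional Randomization Test named in the summary above, the generic procedure can be sketched as follows. This is a minimal illustration and not the paper's method: the test statistic is a plain absolute covariance rather than anything derived from TabPFN, the conditional distribution of the feature is assumed to be known (here, independent standard normal), and the names `crt_p_value`, `abs_covariance`, and `sample_conditional` are hypothetical.

```python
import random


def crt_p_value(x_j, y, sample_conditional, statistic, num_draws=200, seed=0):
    """Conditional Randomization Test p-value for a single feature.

    x_j: observed values of the feature under test
    y: response values
    sample_conditional(rng): returns a fresh copy of the feature drawn from its
        conditional distribution given the other features (assumed known here)
    statistic(x, y): larger values mean stronger evidence against the null
    """
    rng = random.Random(seed)
    observed = statistic(x_j, y)
    # Count resampled copies whose statistic is at least as extreme as observed.
    exceed = sum(
        statistic(sample_conditional(rng), y) >= observed
        for _ in range(num_draws)
    )
    # The +1 terms are what make the p-value valid in finite samples.
    return (1 + exceed) / (1 + num_draws)


def abs_covariance(x, y):
    """Toy test statistic: absolute sample covariance (stand-in, not TabPFN)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return abs(sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))) / len(x)


# Toy data: y depends strongly on x, so the p-value should be small.
data_rng = random.Random(1)
x = [data_rng.gauss(0, 1) for _ in range(100)]
y = [2 * a + data_rng.gauss(0, 0.1) for a in x]

p = crt_p_value(
    x,
    y,
    sample_conditional=lambda rng: [rng.gauss(0, 1) for _ in range(len(x))],
    statistic=abs_covariance,
)
print(p)
```

Because the resampled copies are drawn without looking at `y`, no model is refit inside the loop; swapping in a different statistic only requires changing the `statistic` argument.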

CapTrack: Multifaceted Evaluation of Forgetting in LLM Post-Training

This paper introduces CapTrack, a capability-centric framework that redefines LLM forgetting as systematic behavioral drift rather than mere knowledge loss, revealing through a large-scale study that post-training significantly degrades robustness and default behaviors, with instruction fine-tuning causing the most pronounced effects.

Lukas Thede, Stefan Winzeck, Zeynep Akata, Jonathan Richard Schwarz | 2026-03-10 | cs.LG

A Novel Approach for Testing Water Safety Using Deep Learning Inference of Microscopic Images of Unincubated Water Samples

This paper introduces DeepScope, a deep learning-based system that analyzes microscopic images of unincubated water samples to detect fecal contamination in seconds with 93% accuracy and a cost of $0.44 per test, significantly outperforming traditional incubation methods in speed and affordability.

Sanjay Srinivasan | 2026-03-10 | cs.LG

Consensus is Not Verification: Why Crowd Wisdom Strategies Fail for LLM Truthfulness

The paper demonstrates that, unlike in domains with external verifiers, scaling inference compute through crowd wisdom strategies fails to improve LLM truthfulness in unverified settings: correlated model errors and the inability to distinguish social prediction from truth verification cause aggregation to reinforce shared misconceptions rather than identify correct answers.

Yegor Denisov-Blanch, Joshua Kazdan, Jessica Chudnovsky, Rylan Schaeffer, Sheng Guan, Soji Adeshina, Sanmi Koyejo | 2026-03-10 | cs.LG

OptiRoulette Optimizer: A New Stochastic Meta-Optimizer for up to 5.3x Faster Convergence

The paper introduces OptiRoulette, a stochastic meta-optimizer that dynamically selects update rules from a pool during training, demonstrating significantly faster convergence and higher test accuracy across multiple image-classification benchmarks compared to a standard AdamW baseline.

Stamatis Mastromichalakis | 2026-03-10 | cs.LG
