cs.LG papers | Gist.Science

Revealing Behavioral Plasticity in Large Language Models: A Token-Conditional Perspective

This paper introduces Token-Conditioned Reinforcement Learning (ToCoRL), a framework that leverages the intrinsic behavioral plasticity of Large Language Models to internalize and stabilize inference-time adaptations, enabling precise control over behavioral modes like switching from reasoning to direct answering without degrading overall capabilities.

Liyuan Mao, Le Yu, Jing Zhou, Chujie Zheng, Bowen Yu, Chang Gao, Shixuan Liu, An Yang, Weinan Zhang, JunYang Lin | 2026-03-10 | cs.LG

A Recipe for Stable Offline Multi-agent Reinforcement Learning

This paper identifies value-scale amplification as the primary cause of instability in non-linear value decomposition for offline multi-agent reinforcement learning and proposes a scale-invariant value normalization technique to stabilize training, ultimately providing a practical recipe to unlock the full potential of offline MARL.

Dongsu Lee, Daehee Lee, Amy Zhang | 2026-03-10 | cs.LG

Geometrically Constrained Outlier Synthesis

This paper introduces Geometrically Constrained Outlier Synthesis (GCOS), a training-time framework that generates virtual outliers in the feature space by respecting in-distribution manifold structures and using conformal shells to improve out-of-distribution detection robustness and provide formal error guarantees.

Daniil Karzanov, Marcin Detyniecki | 2026-03-10 | cs.LG

Meta-RL with Shared Representations Enables Fast Adaptation in Energy Systems

This paper introduces a novel Meta-RL framework featuring a hybrid actor-critic architecture with shared state representations and parameter-sharing mechanisms that significantly enhances sample efficiency and fast adaptation in non-stationary environments, as validated by superior performance on a decade-long real-world Building Energy Management Systems dataset.

Théo Zangato, Aomar Osmani, Pegah Alizadeh | 2026-03-10 | cs.LG

SYNAPSE: Framework for Neuron Analysis and Perturbation in Sequence Encoding

The paper introduces SYNAPSE, a systematic, training-free framework that analyzes and stress-tests Transformer models by extracting layer representations and applying forward-hook interventions to reveal domain-independent internal organization, functional stability through redundant neuron subsets, and specific vulnerabilities to small manipulations.

Jesús Sánchez Ochoa, Enrique Tomás Martínez Beltrán, Alberto Huertas Celdrán | 2026-03-10 | cs.LG

IronEngine: Towards General AI Assistant

This paper introduces IronEngine, a general AI assistant platform featuring a unified orchestration core and a three-phase pipeline that integrates diverse backends, adaptive memory, and extensive tooling to achieve high task completion rates while separating planning quality from execution capability.

Xi Mo | 2026-03-10 | cs.LG

Grow, Assess, Compress: Adaptive Backbone Scaling for Memory-Efficient Class Incremental Learning

This paper introduces GRACE, a novel dynamic scaling framework for Class Incremental Learning that adaptively balances model capacity through a cyclic "Grow, Assess, Compress" strategy to achieve state-of-the-art performance while significantly reducing memory overhead compared to purely expansion-based methods.

Adrian Garcia-Castañeda, Jon Irureta, Jon Imaz, Aizea Lojo | 2026-03-10 | cs.LG

A prospective clinical feasibility study of a conversational diagnostic AI in an ambulatory primary care clinic

This prospective feasibility study demonstrates that a conversational AI system (AMIE) can safely and effectively conduct clinical history-taking and generate diagnostic suggestions in a real-world urgent care setting, achieving high patient satisfaction and diagnostic accuracy comparable to primary care providers while requiring no real-time human intervention.

Peter Brodeur, Jacob M. Koshy, Anil Palepu, Khaled Saab, Ava Homiar, Roma Ruparel, Charles Wu, Ryutaro Tanno, Joseph Xu, Amy Wang, David Stutz, Hannah M. Ferrera, David Barrett, Lindsey Crowley, Jihyeon Lee, Spencer E. Rittner, Ellery Wulczyn, Selena K. Zhang, Elahe Vedadi, Christine G. Kohn, Kavita Kulkarni, Vinay Kadiyala, Sara Mahdavi, Wendy Du, Jessica Williams, David Feinbloom, Renee Wong, Tao Tu, Petar Sirkovic, Alessio Orlandi, Christopher Semturs, Yun Liu, Juraj Gottweis, Dale R. Webster, Joëlle Barral, Katherine Chou, Pushmeet Kohli, Avinatan Hassidim, Yossi Matias, James Manyika, Rob Fields, Jonathan X. Li, Marc L. Cohen, Vivek Natarajan, Mike Schaekermann, Alan Karthikesalingam, Adam Rodman | 2026-03-10 | cs.LG

LycheeCluster: Efficient Long-Context Inference with Structure-Aware Chunking and Hierarchical KV Indexing

LycheeCluster is a novel KV cache management method that employs structure-aware chunking and hierarchical indexing to transform cache retrieval into a logarithmic-time process, achieving up to a 3.6x inference speedup with minimal performance degradation for long-context LLMs.

Dongfang Li, Zixuan Liu, Gang Lin, Baotian Hu, Min Zhang | 2026-03-10 | cs.LG

The Boiling Frog Threshold: Criticality and Blindness in World Model-Based Anomaly Detection Under Gradual Drift

This paper investigates world model-based anomaly detection under gradual observation drift, revealing a universal sharp detection threshold that depends on the interaction between detector sensitivity, noise floor, and environment-specific dynamics, while identifying critical failure modes such as the undetectability of sinusoidal drift and agent collapse prior to detection.

Zhe Hong | 2026-03-10 | cs.LG

Adaptive Entropy-Driven Sensor Selection in a Camera-LiDAR Particle Filter for Single-Vessel Tracking

This paper presents an adaptive entropy-driven sensor selection policy within a camera-LiDAR particle filter that dynamically switches between modalities to optimize tracking accuracy and continuity for single-vessel surveillance, validated through real-world maritime deployment.

Andrei Starodubov, Yaqub Aris Prabowo, Andreas Hadjipieris, Ioannis Kyriakides, Roberto Galeazzi | 2026-03-10 | cs.LG

Data-Driven Priors for Uncertainty-Aware Deterioration Risk Prediction with Multimodal Data

This paper introduces $\texttt{MedCertAIn}$, a novel predictive uncertainty framework that leverages data-driven priors derived from cross-modal similarities and modality-specific corruptions to significantly enhance both the performance and reliability of multimodal in-hospital risk prediction on the MIMIC-IV and MIMIC-CXR datasets.

L. Julián Lechuga López, Tim G. J. Rudner, Farah E. Shamout | 2026-03-10 | cs.LG

Reasoning as Compression: Unifying Budget Forcing via the Conditional Information Bottleneck

This paper proposes a Conditional Information Bottleneck framework that reframes efficient Chain-of-Thought reasoning as a lossy compression problem, introducing a semantic prior and a reinforcement learning objective to prune redundant tokens while preserving essential logic and improving accuracy.

Fabio Valerio Massoli, Andrey Kuzmin, Arash Behboodi | 2026-03-10 | cs.LG

MUSA-PINN: Multi-scale Weak-form Physics-Informed Neural Networks for Fluid Flow in Complex Geometries

The paper introduces MUSA-PINN, a multi-scale weak-form Physics-Informed Neural Network that reformulates PDE constraints as integral conservation laws over hierarchical control volumes to overcome convergence pathologies and significantly improve accuracy and mass conservation in fluid flow simulations within complex geometries like Triply Periodic Minimal Surfaces.

Weizheng Zhang, Xunjie Xie, Hao Pan, Xiaowei Duan, Bingteng Sun, Qiang Du, Lin lu | 2026-03-10 | cs.LG

Integrating Lagrangian Neural Networks into the Dyna Framework for Reinforcement Learning

This paper proposes a model-based reinforcement learning framework that integrates Lagrangian neural networks into the Dyna architecture to enforce physical laws and improve prediction accuracy, demonstrating that state-estimation-based optimization converges faster than stochastic gradient-based methods during training.

Shreya Das, Kundan Kumar, Muhammad Iqbal, Outi Savolainen, Dominik Baumann, Laura Ruotsalainen, Simo Särkkä | 2026-03-10 | cs.LG

STRIDE: Structured Lagrangian and Stochastic Residual Dynamics via Flow Matching

The paper proposes STRIDE, a hybrid dynamics learning framework that combines a Lagrangian Neural Network for energy-consistent rigid-body mechanics with Conditional Flow Matching for stochastic residual interaction forces, achieving significant improvements in long-horizon prediction and contact force accuracy for robotic systems in unstructured environments.

Prakrut Kotecha, Ganga Nair B, Shishir Kolathaya | 2026-03-10 | cs.LG

X-AVDT: Audio-Visual Cross-Attention for Robust Deepfake Detection

This paper proposes X-AVDT, a robust deepfake detector that leverages internal audio-visual cross-attention cues accessed via DDIM inversion to achieve superior generalization across diverse and evolving synthesis paradigms, supported by the introduction of the new MMDF dataset.

Youngseo Kim, Kwan Yun, Seokhyeon Hong, Sihun Cha, Colette Suhjung Koo, Junyong Noh | 2026-03-10 | cs.LG

NN-OpInf: an operator inference approach using structure-preserving composable neural networks

The paper introduces NN-OpInf, a structure-preserving, composable neural network framework for non-intrusive reduced-order modeling that outperforms traditional polynomial methods in accuracy and stability for systems with non-polynomial nonlinearities, albeit at the cost of higher computational training requirements.

Eric Parish, Anthony Gruber, Patrick Blonigan, Irina Tezaur | 2026-03-10 | cs.LG

Pareto-Optimal Anytime Algorithms via Bayesian Racing

This paper introduces PolarBear, a Bayesian racing framework that identifies Pareto-optimal anytime algorithms by adaptively sampling temporal Plackett-Luce rankings to eliminate dominated candidates without requiring objective bounds, normalization, or known optima.

Jonathan Wurth, Helena Stegherr, Neele Kemper, Michael Heider, Jörg Hähner | 2026-03-10 | cs.LG

Efficient Credal Prediction through Decalibration

This paper introduces "decalibration," an efficient method that generates credal sets as probability intervals for complex foundation models without requiring computationally expensive retraining, thereby enabling robust uncertainty representation in safety-critical applications.

Paul Hofman, Timo Löhr, Maximilian Muschalik, Yusuf Sale, Eyke Hüllermeier | 2026-03-10 | cs.LG
