cs.AI papers | Gist.Science

Adaptive Rollout Allocation for Online Reinforcement Learning with Verifiable Rewards

This paper introduces VIP, a Variance-Informed Predictive allocation strategy that dynamically optimizes rollout distribution across training prompts using Gaussian process-based variance estimation to minimize gradient variance and significantly improve sampling efficiency in online reinforcement learning with verifiable rewards.

Hieu Trung Nguyen, Bao Nguyen, Wenao Ma + 3 more | 2026-03-06 | cs

Towards Exploratory and Focused Manipulation with Bimanual Active Perception: A New Problem, Benchmark and Strategy

This paper introduces the Exploratory and Focused Manipulation (EFM) problem to address visual occlusion in robot manipulation, proposing the EFM-10 benchmark and a Bimanual Active Perception (BAP) strategy that effectively leverages dual-arm coordination for active vision and force sensing.

Yuxin He, Ruihao Zhang, Tianao Shen + 2 more | 2026-03-06 | cs

On the Non-Identifiability of Steering Vectors in Large Language Models

This paper demonstrates that steering vectors in large language models are fundamentally non-identifiable, as numerous distinct interventions—including orthogonal perturbations—produce behaviorally indistinguishable results, thereby revealing inherent limits in interpreting these vectors as unique internal representations without additional structural constraints.

Sohan Venkatesh, Ashish Mahendran Kurapath | 2026-03-06 | cs

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

LatentChem introduces a latent reasoning interface that decouples chemical computation from textual generation, letting models perform multi-step reasoning in continuous latent space. This latent reasoning emerges spontaneously as a more efficient and accurate alternative to explicit Chain-of-Thought, achieving a 59.88% win rate and a 10.84× speedup over baselines.

Xinwu Ye, Yicheng Mao, Jia Zhang + 16 more | 2026-03-06 | physics

Supervised Metric Regularization Through Alternating Optimization for Multi-Regime Physics-Informed Neural Networks

This paper introduces Topology-Aware PINNs (TAPINN), a novel framework that employs supervised metric regularization and alternating optimization to effectively resolve spectral bias and mode collapse in multi-regime physics-informed neural networks, achieving superior convergence stability and accuracy compared to standard and hypernetwork-based baselines.

Enzo Nicolas Spotorno, Josafat Ribeiro Leal, Antonio Augusto Frohlich | 2026-03-06 | physics

Empirical Stability Analysis of Kolmogorov-Arnold Networks in Hard-Constrained Recurrent Physics-Informed Discovery

This paper empirically demonstrates that while Kolmogorov-Arnold Networks (KANs) can compete with MLPs on simple univariate residuals in hard-constrained recurrent physics-informed architectures, they suffer from severe hyperparameter fragility, instability in deeper configurations, and consistent failure on multiplicative terms, ultimately revealing limitations in their additive inductive bias for modeling state coupling in oscillatory systems.

Enzo Nicolas Spotorno, Josafat Leal Filho, Antonio Augusto Medeiros Frohlich | 2026-03-06 | physics

Learning to Select Like Humans: Explainable Active Learning for Medical Imaging

This paper proposes an explainability-guided active learning framework that improves medical image analysis by strategically selecting samples based on both predictive uncertainty and attention misalignment with expert-defined regions, thereby achieving superior data efficiency and clinical interpretability compared to traditional methods.

Ifrat Ikhtear Uddin, Longwei Wang, Xiao Qin + 2 more | 2026-03-06 | cs

Pailitao-VL: Unified Embedding and Reranker for Real-Time Multi-Modal Industrial Search

Pailitao-VL is a unified multi-modal retrieval system that achieves state-of-the-art, real-time industrial search performance by replacing traditional contrastive embeddings with an absolute ID-recognition paradigm and evolving reranking into a compare-and-calibrate listwise policy, thereby overcoming granularity, noise, and latency challenges in large-scale production environments.

Lei Chen, Chen Ju, Xu Chen + 13 more | 2026-03-06 | cs

Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections

This paper introduces "Zombie Agents," a persistent black-box attack on self-evolving LLM agents that covertly implants payloads into long-term memory during benign sessions to survive across interactions and trigger unauthorized actions in future sessions, demonstrating that current per-session defenses are insufficient against such memory-based compromises.

Xianglin Yang, Yufei He, Shuo Ji, Bryan Hooi, Jin Song Dong | 2026-03-06 | cs.CR

SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework

SubQuad is an end-to-end pipeline that overcomes the computational and data imbalance bottlenecks in large-scale adaptive immune repertoire analysis by integrating near-subquadratic MinHash retrieval, GPU-accelerated affinity kernels, and fairness-constrained clustering to enable scalable, bias-aware discovery of clinically relevant clonotypes.

Rong Fu, Zijian Zhang, Kun Liu + 3 more | 2026-03-06 | cs

Curriculum Learning for Efficient Chain-of-Thought Distillation via Structure-Aware Masking and GRPO

This paper proposes a three-stage curriculum learning framework that leverages structure-aware masking and Group Relative Policy Optimization (GRPO) to efficiently distill Chain-of-Thought reasoning into compact student models, achieving significant accuracy gains and output length reduction on GSM8K by progressively guiding the model from structural understanding to self-optimized brevity and targeted knowledge internalization.

Bowen Yu, Maolin Wang, Sheng Zhang + 7 more | 2026-03-06 | cs

The Convergence of Schema-Guided Dialogue Systems and the Model Context Protocol

This paper argues that Schema-Guided Dialogue and the Model Context Protocol converge into a unified paradigm for deterministic LLM-agent interaction, proposing five foundational schema design principles to address critical gaps in failure handling and tool relationships while enabling scalable AI governance.

Andreas Schlapbach | 2026-03-06 | cs

Give Users the Wheel: Towards Promptable Recommendation Paradigm

This paper proposes Decoupled Promptable Sequential Recommendation (DPR), a model-agnostic framework that enables conventional sequential recommenders to dynamically steer retrieval using natural language prompts by modulating latent user representations through a specialized fusion module, Mixture-of-Experts architecture, and a three-stage training strategy, thereby achieving superior performance in intent-driven tasks without sacrificing collaborative filtering efficiency.

Fuyuan Lyu, Chenglin Luo, Qiyuan Zhang + 6 more | 2026-03-06 | cs

Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

This paper introduces a simulation-based clinical red teaming framework that pairs AI psychotherapists with dynamic patient agents to evaluate mental health support systems, revealing critical safety gaps such as the validation of delusions and failure to de-escalate suicide risk in AI agents tested against Alcohol Use Disorder scenarios.

Ian Steenstra, Paola Pedrelli, Weiyan Shi + 2 more | 2026-03-06 | cs

On Imbalanced Regression with Hoeffding Trees

This paper extends kernel density estimation and hierarchical shrinkage to Hoeffding trees for imbalanced regression in data streams, demonstrating that kernel density estimation significantly improves early-stream performance while hierarchical shrinkage offers limited gains.

Pantia-Marina Alchirch, Dimitrios I. Diochnos | 2026-03-06 | cs

Zatom-1: A Multimodal Flow Foundation Model for 3D Molecules and Materials

Zatom-1 is the first open-source, end-to-end foundation model that unifies generative and predictive learning for 3D molecules and materials using a multimodal flow matching objective, achieving state-of-the-art performance across domains while significantly reducing inference time and enabling positive transfer between chemical systems.

Alex Morehead, Miruna Cretu, Antonia Panescu + 14 more | 2026-03-06 | cond-mat.mtrl-sci

Interpretable Multimodal Gesture Recognition for Drone and Mobile Robot Teleoperation via Log-Likelihood Ratio Fusion

This paper proposes an interpretable, multimodal gesture recognition framework that fuses inertial and capacitive sensor data via log-likelihood ratio to enable robust, real-time, hands-free teleoperation of drones and mobile robots, supported by a new dataset and demonstrating performance comparable to vision-based methods with significantly lower computational costs.

Seungyeol Baek, Jaspreet Singh, Lala Shakti Swarup Ray + 3 more | 2026-03-06 | cs

Pessimistic Auxiliary Policy for Offline Reinforcement Learning

This paper proposes a pessimistic auxiliary policy that samples reliable actions by maximizing the lower confidence bound of the Q-function, thereby mitigating out-of-distribution errors and improving the performance of offline reinforcement learning algorithms.

Fan Zhang, Baoru Huang, Xin Zhang | 2026-03-06 | cs

Jailbreak Foundry: From Papers to Runnable Attacks for Reproducible Benchmarking

This paper introduces Jailbreak Foundry (JBF), a multi-agent system that automatically translates jailbreak research papers into executable modules within a unified harness, enabling fast, reproducible, and standardized benchmarking of large language model security against rapidly evolving attack techniques.

Zhicheng Fang, Jingjie Zheng, Chenxu Fu, Wei Xu | 2026-03-06 | cs.CR

DiffusionHarmonizer: Bridging Neural Reconstruction and Photorealistic Simulation with Online Diffusion Enhancer

DiffusionHarmonizer is an online, single-step generative framework that leverages a custom data curation pipeline to transform imperfect neural reconstruction renderings into temporally consistent, photorealistic simulations, effectively resolving artifacts and harmonizing inserted dynamic objects for autonomous robot development.

Yuxuan Zhang, Katarína Tóthová, Zian Wang + 7 more | 2026-03-06 | cs
