cs.AI papers | Gist.Science

Rescaling Confidence: What Scale Design Reveals About LLM Metacognition

This paper demonstrates that the design of confidence scales significantly impacts LLM metacognition, revealing that standard 0–100 formats induce heavy discretization while 0–20 scales consistently improve metacognitive efficiency.

Yuyang Dai | 2026-03-11 | 🤖 cs.AI

Curveball Steering: The Right Direction To Steer Isn't Always Linear

This paper challenges the Linear Representation Hypothesis by demonstrating that LLM activation spaces exhibit significant geometric distortion, leading to the proposal of "Curveball steering," a nonlinear intervention method using polynomial kernel PCA that outperforms traditional linear approaches by better respecting the intrinsic geometry of the model's feature space.

Shivam Raval, Hae Jin Song, Linlin Wu, Abir Harrasse, Jeff Phillips, Amirali Abdullah | 2026-03-11 | 🤖 cs.AI

CLoE: Expert Consistency Learning for Missing Modality Segmentation

The paper proposes CLoE, a consistency-driven framework that enhances missing-modality medical image segmentation by enforcing decision-level agreement among modality experts on both global and clinically critical foreground regions, thereby improving robustness and generalization compared to state-of-the-art methods.

Xinyu Tong, Meihua Zhou, Bowu Fan, Haitao Li | 2026-03-11 | 🤖 cs.AI

SpaceSense-Bench: A Large-Scale Multi-Modal Benchmark for Spacecraft Perception and Pose Estimation

This paper introduces SpaceSense-Bench, a large-scale, multi-modal benchmark generated with high-fidelity Unreal Engine 5 simulations to address the scarcity of real-world space data. It provides 136 diverse satellite models with synchronized RGB, depth, and LiDAR data alongside dense semantic and pose annotations, and is used to demonstrate that dataset scale and diversity are critical for advancing spacecraft perception and pose estimation.

Aodi Wu, Jianhong Zuo, Zeyuan Zhao, Xubo Luo, Ruisuo Wang, Xue Wan | 2026-03-11 | 🤖 cs.AI

Reading the Mood Behind Words: Integrating Prosody-Derived Emotional Context into Socially Responsive VR Agents

This paper proposes and validates an emotion-context-aware VR interaction pipeline that integrates real-time prosody-derived emotional states into LLM-based agent dialogue, significantly improving user engagement, naturalness, and rapport compared to traditional text-only approaches.

SangYeop Jeong, Yeongseo Na, Seung Gyu Jeong, Jin-Woo Jeong, Seong-Eun Kim | 2026-03-11 | 🤖 cs.AI

TimberAgent: Gram-Guided Retrieval for Executable Music Effect Control

This paper introduces TimberAgent, a Gram-guided retrieval system using Texture Resonance Retrieval (TRR) to bridge the semantic gap between user intent and low-level audio effect parameters, demonstrating superior performance in predicting editable plugin configurations through rigorous benchmarking and perceptual evaluation.

Shihao He, Yihan Xia, Fang Liu, Taotao Wang, Shengli Zhang | 2026-03-11 | 🤖 cs.AI

Beyond Scaling: Assessing Strategic Reasoning and Rapid Decision-Making Capability of LLMs in Zero-sum Environments

This paper introduces the Strategic Tactical Agent Reasoning (STAR) benchmark, a multi-agent framework for evaluating LLMs in zero-sum environments, which reveals a critical trade-off where reasoning-intensive models excel in turn-based settings but often underperform in real-time scenarios due to latency, highlighting the need to balance strategic depth with rapid execution.

Yang Li, Xing Chen, Yutao Liu, Gege Qi, Yanxian BI, Zizhe Wang, Yunjian Zhang, Yao Zhu | 2026-03-11 | 🤖 cs.AI

TaSR-RAG: Taxonomy-guided Structured Reasoning for Retrieval-Augmented Generation

TaSR-RAG is a taxonomy-guided framework that enhances Retrieval-Augmented Generation for multi-hop reasoning by decomposing complex queries into structured triple sub-queries and performing step-wise evidence selection through hybrid matching, thereby achieving superior accuracy and clearer reasoning traces without relying on costly graph construction.

Jiashuo Sun, Yixuan Xie, Jimeng Shi, Shaowen Wang, Jiawei Han | 2026-03-11 | 🤖 cs.AI

Robust Regularized Policy Iteration under Transition Uncertainty

This paper introduces Robust Regularized Policy Iteration (RRPI), a novel offline reinforcement learning framework that unifies policy-induced extrapolation and transition uncertainty by formulating robust policy optimization with a tractable KL-regularized surrogate, offering theoretical convergence guarantees and demonstrating superior performance and robustness on D4RL benchmarks.

Hongqiang Lin, Zhenghui Fu, Weihao Tang, Pengfei Wang, Yiding Sun, Qixian Huang, Dongxu Zhang | 2026-03-11 | 🤖 cs.AI

TA-GGAD: Testing-time Adaptive Graph Model for Generalist Graph Anomaly Detection

This paper introduces TA-GGAD, a testing-time adaptive graph foundation model that addresses the cross-domain generalization challenge in anomaly detection by identifying and modeling the "Anomaly Disassortativity" issue, thereby achieving state-of-the-art performance across diverse real-world graphs with a single training phase.

Xiong Zhang, Hong Peng, Changlong Fu, Xin Jin, Yun Yang, Cheng Xie | 2026-03-11 | 🤖 cs.AI

Democratising Clinical AI through Dataset Condensation for Classical Clinical Models

This paper introduces a differentially private, zero-order optimization framework that extends dataset condensation to non-differentiable clinical models, enabling the creation of compact, privacy-preserving synthetic datasets that facilitate the democratization of clinical data sharing without compromising model utility.

Anshul Thakur, Soheila Molaei, Pafue Christy Nganjimi, Joshua Fieggen, Andrew A. S. Soltan, Danielle Belgrave, Lei Clifton, David A. Clifton | 2026-03-11 | 🤖 cs.AI

M3GCLR: Multi-View Mini-Max Infinite Skeleton-Data Game Contrastive Learning For Skeleton-Based Action Recognition

This paper proposes M3GCLR, a game-theoretic contrastive learning framework that addresses limitations in existing skeleton-based action recognition methods. It establishes an Infinite Skeleton-data Game model with a mini-max optimization strategy and a dual-loss equilibrium optimizer to handle view discrepancies, adversarial mechanisms, and augmentation perturbations, achieving state-of-the-art performance on multiple benchmarks.

Yanshan Li, Ke Ma, Miaomiao Wei, Linhui Dai | 2026-03-11 | 🤖 cs.AI

MIL-PF: Multiple Instance Learning on Precomputed Features for Mammography Classification

The paper proposes MIL-PF, a scalable framework that combines frozen foundation model encoders with a lightweight attention-based Multiple Instance Learning head to achieve state-of-the-art mammography classification while significantly reducing computational costs and training complexity.

Nikola Jovišic, Milica Škipina, Nicola Dall'Asen, Dubravko Culibrk | 2026-03-11 | 🤖 cs.AI

SPAARS: Safer RL Policy Alignment through Abstract Exploration and Refined Exploitation of Action Space

SPAARS is a curriculum learning framework for offline-to-online reinforcement learning that safely improves policies in two stages: it first explores a low-dimensional latent space to ensure sample efficiency and stability, then transitions to the raw action space to bypass decoder-induced performance ceilings. It achieves superior results over state-of-the-art baselines on both robotic manipulation and locomotion tasks.

Swaminathan S K, Aritra Hazra | 2026-03-11 | 🤖 cs.AI

Physics-Informed Neural Engine Sound Modeling with Differentiable Pulse-Train Synthesis

This paper introduces the Pulse-Train-Resonator (PTR), a differentiable synthesis model that improves engine sound generation by directly modeling sequential exhaust pressure pulses and physical resonances rather than approximating spectral characteristics, achieving superior reconstruction accuracy and interpretability across diverse engine types.

Robin Doerfler, Lonce Wyse | 2026-03-11 | 🤖 cs.AI

ICDAR 2025 Competition on End-to-End Document Image Machine Translation Towards Complex Layouts

This paper presents the ICDAR 2025 competition on end-to-end document image machine translation, detailing its dual-track structure for small and large models, participation statistics, and findings that highlight large-model approaches as a promising paradigm for handling complex document layouts.

Yaping Zhang, Yupu Liang, Zhiyang Zhang, Zhiyuan Chen, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong | 2026-03-11 | 🤖 cs.AI

Reviving ConvNeXt for Efficient Convolutional Diffusion Models

This paper introduces the Fully Convolutional Diffusion Model (FCDM), a ConvNeXt-based architecture that achieves competitive generative performance with significantly fewer computational resources and training steps than Transformer-based counterparts, demonstrating that modern convolutional designs remain a highly efficient alternative for scaling diffusion models.

Taesung Kwon, Lorenzo Bianchi, Lennart Wittke, Felix Watine, Fabio Carrara, Jong Chul Ye, Romann Weber, Vinicius Azevedo | 2026-03-11 | 🤖 cs.AI

PromptDLA: A Domain-aware Prompt Document Layout Analysis Framework with Descriptive Knowledge as a Cue

This paper introduces PromptDLA, a domain-aware framework that leverages descriptive knowledge as cues to customize prompts for integrating domain priors, thereby overcoming the limitations of directly merging diverse datasets and achieving state-of-the-art performance in Document Layout Analysis across multiple benchmarks.

Zirui Zhang, Yaping Zhang, Lu Xiang, Yang Zhao, Feifei Zhai, Yu Zhou, Chengqing Zong | 2026-03-11 | 🤖 cs.AI

From Flow to One Step: Real-Time Multi-Modal Trajectory Policies via Implicit Maximum Likelihood Estimation-based Distribution Distillation

This paper proposes a real-time multi-modal trajectory policy framework that distills a Conditional Flow Matching expert into a single-step student using Implicit Maximum Likelihood Estimation and a bi-directional Chamfer distance, thereby eliminating the latency of iterative ODE integration while preserving multi-modal action diversity for high-frequency robotic control.

Ju Dong, Liding Zhang, Lei Zhang, Yu Fu, Kaixin Bai, Zoltan-Csaba Marton, Zhenshan Bing, Zhaopeng Chen, Alois Christian Knoll, Jianwei Zhang | 2026-03-11 | 🤖 cs.AI

Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health

This study investigates gender bias in Large Language Models within French healthcare contexts, demonstrating that these models rely on embedded stereotypes when processing interactions between gender and other social determinants of health, thereby highlighting the need for context-specific assessments that go beyond evaluating individual factors in isolation.

Trung Hieu Ngo, Adrien Bazoge, Solen Quiniou, Pierre-Antoine Gourraud, Emmanuel Morin | 2026-03-11 | 🤖 cs.AI
