cs papers | Gist.Science

Decomposing Physician Disagreement in HealthBench

This paper analyzes physician disagreement in the HealthBench dataset, revealing that while the majority of variance is structural and irreducible, a small but actionable portion stems from reducible uncertainties like missing context, suggesting that improving evaluation design to close information gaps could meaningfully reduce disagreement on borderline medical AI cases.

Satya Borgohain, Roy Mariathas | 2026-03-10 | cs

WISER: Wider Search, Deeper Thinking, and Adaptive Fusion for Training-Free Zero-Shot Composed Image Retrieval

WISER is a training-free framework for Zero-Shot Composed Image Retrieval that unifies Text-to-Image and Image-to-Image paradigms through a "retrieve-verify-refine" pipeline, leveraging wider search, adaptive fusion, and self-reflection to significantly outperform existing methods across diverse benchmarks.

Tianyue Wang, Leigang Qu, Tianyu Yang, Xiangzhao Hao, Yifan Xu, Haiyun Guo, Jinqiao Wang | 2026-03-10 | cs

PackUV: Packed Gaussian UV Maps for 4D Volumetric Video

The paper introduces PackUV, a novel 4D Gaussian representation and fitting method that maps volumetric video attributes into structured UV atlases for efficient, codec-compatible storage and streaming, while demonstrating superior temporal consistency and rendering fidelity on the newly proposed large-scale PackUV-2B dataset.

Aashish Rai, Angela Xing, Anushka Agarwal, Xiaoyan Cong, Zekun Li, Tao Lu, Aayush Prakash, Srinath Sridhar | 2026-03-10 | cs

On Sample-Efficient Generalized Planning via Learned Transition Models

This paper proposes a sample-efficient approach to generalized planning that learns explicit neural transition models to predict intermediate world states, demonstrating superior out-of-distribution performance and data efficiency compared to direct action-sequence prediction methods.

Nitin Gupta, Vishal Pallagani, John A. Aydin, Biplav Srivastava | 2026-03-10 | cs

Annotation-Free Visual Reasoning for High-Resolution Large Multimodal Models via Reinforcement Learning

This paper proposes HART, an annotation-free framework that leverages a novel Advantage Preference Group Relative Policy Optimization (AP-GRPO) algorithm to enable Large Multimodal Models to autonomously identify and verify key high-resolution image regions, thereby improving reasoning performance without requiring costly human grounding labels.

Jiacheng Yang, Anqi Chen, Yunkai Dang, Qi Fan, Cong Wang, Wenbin Li, Feng Miao, Yang Gao | 2026-03-10 | cs

PEPA: a Persistently Autonomous Embodied Agent with Personalities

This paper introduces PEPA, a three-layer cognitive architecture that leverages personality traits to enable embodied agents to autonomously generate goals and sustain long-term operation in dynamic environments without relying on external task specifications.

Kaige Liu, Yang Li, Lijun Zhu, Weinan Zhang | 2026-03-10 | cs

Self-Attention And Beyond the Infinite: Towards Linear Transformers with Infinite Self-Attention

This paper introduces Infinite Self-Attention (InfSA), a spectral reformulation of self-attention as a diffusion process on token graphs, together with its linear-time variant, Linear-InfSA. By replacing the quadratic softmax cost with a Neumann series approximation, the approach achieves state-of-the-art ImageNet accuracy and enables efficient, memory-free inference at ultra-high resolutions (up to 9216×9216).

Giorgio Roffo, Luke Palmer | 2026-03-10 | cs

WildActor: Unconstrained Identity-Preserving Video Generation

This paper introduces WildActor, a framework for unconstrained identity-preserving human video generation that leverages the large-scale Actor-18M dataset and novel attention mechanisms to overcome existing limitations in maintaining consistent full-body identities across dynamic shots, viewpoints, and motions.

Qin Guo, Tianyu Yang, Xuanhua He, Fei Shen, Yong Zhang, Zhuoliang Kang, Xiaoming Wei, Dan Xu | 2026-03-10 | cs

Position: Evaluation of Visual Processing Should Be Human-Centered, Not Metric-Centered

This position paper argues that the evaluation of modern visual processing systems must shift from a reliance on single-metric benchmarks toward a human-centered, context-aware paradigm to better align with human perception and foster genuine innovation.

Jinfan Hu, Fanghua Yu, Zhiyuan You, Xiang Yin, Hongyu An, Xinqi Lin, Chao Dong, Jinjin Gu | 2026-03-10 | cs

Sustainable Care: Designing Technologies That Support Children's Long-Term Engagement with Social Issues

This workshop paper proposes "sustainable care" as a design framework to help researchers and practitioners create digital technologies that foster children's long-term, meaningful engagement with social issues while preventing empathic distress and burnout.

JaeWon Kim, Aayushi Dangol, Rotem Landesman, Alexis Hiniker, McKenna F. Parnes | 2026-03-10 | cs

DeAR: Fine-Grained VLM Adaptation by Decomposing Attention Head Roles

The paper proposes DeAR, a fine-grained adaptation framework for Vision-Language Models that decomposes attention heads into functional roles (Attribute, Generalization, and Mixed) using a Concept Entropy metric to selectively isolate task-specific learning from generalization capabilities, thereby achieving superior performance across diverse tasks while preserving zero-shot robustness.

Yiming Ma, Hongkun Yang, Lionel Z. Wang, Bin Chen, Weizhi Xian, Jianzhi Teng | 2026-03-10 | cs

Digital Twin-Based Cooling System Optimization for Data Center

This paper presents a validated digital twin of the Frontier supercomputer's liquid cooling system to demonstrate that a ramp-constrained, joint optimization of flow rate and supply temperature can achieve 27.8% energy savings, significantly outperforming flow-only strategies by addressing the gap between theoretical optima and operational deployability.

Shrenik Jadhav, Zheng Liu | 2026-03-10 | cs

Extended Empirical Validation of the Explainability Solution Space

This technical report extends the empirical validation of the Explainability Solution Space (ESS) framework by demonstrating its domain-independent applicability and systematic adaptability to diverse governance roles and stakeholder configurations through a cross-domain evaluation involving both employee attrition and urban resource allocation systems.

Antoni Mestre, Manoli Albert, Miriam Gil, Vicente Pelechano | 2026-03-10 | cs

Energy Efficient Traffic Scheduling For Optical LEO Satellite Downlinks

This paper proposes and evaluates static and adaptive traffic scheduling schemes—including threshold, heuristic, and reinforcement learning-based approaches—to optimize energy efficiency and delivery ratios for energy-constrained optical LEO satellite downlinks facing weather-related disruptions.

Ethan Fettes, Pablo G. Madoery, Halim Yanikomeroglu, Gunes Karabulut Kurt, Abhishek Naik, Stéphane Martel | 2026-03-10 | cs

HarmonyCell: Automating Single-Cell Perturbation Modeling under Semantic and Distribution Shifts

HarmonyCell is an end-to-end agent framework that automates single-cell perturbation modeling by combining an LLM-driven semantic unifier to resolve metadata incompatibilities and an adaptive Monte Carlo Tree Search engine to synthesize architectures that handle distribution shifts, thereby achieving high execution success and outperforming expert baselines without manual engineering.

Wenxuan Huang, Mingyu Tsoi, Yanhao Huang, Xinjie Mao, Xue Xia, Hao Wu, Jiaqi Wei, Yuejin Yang, Lang Yu, Cheng Tan, Xiang Zhang, Zhangyang Gao, Siqi Sun | 2026-03-10 | cs

LLM-assisted Semantic Option Discovery for Facilitating Adaptive Deep Reinforcement Learning

This paper proposes a novel LLM-driven closed-loop framework that maps natural language instructions to executable rules and semantically annotates options to enhance the data efficiency, interpretability, and cross-environment transferability of Deep Reinforcement Learning, with experimental validation showing superior performance in constraint compliance and skill reuse.

Chang Yao, Jinghui Qin, Kebing Jin, Hankz Hankui Zhuo | 2026-03-10 | cs

MSP-ReID: Hairstyle-Robust Cloth-Changing Person Re-Identification

The paper proposes the MSP framework, which mitigates hairstyle distraction and preserves structural information through Hairstyle-Oriented Augmentation, Cloth-Preserved Random Erasing, and Region-based Parsing Attention to achieve state-of-the-art performance in cloth-changing person re-identification.

Xiangyang He, Lin Wan | 2026-03-10 | cs

DINOv3 Visual Representations for Blueberry Perception Toward Robotic Harvesting

This paper evaluates DINOv3 as a frozen backbone for blueberry robotic harvesting tasks, finding that while it excels in segmentation through stable patch-level representations, its detection performance is limited by scale variation and spatial aggregation challenges, suggesting it functions best as a semantic backbone requiring downstream spatial modeling tailored to fruit structures.

Rui-Feng Wang, Daniel Petti, Yue Chen, Changying Li | 2026-03-10 | cs

Event-Driven Safe and Resilient Control of Automated and Human-Driven Vehicles under EU-FDI Attacks

This paper proposes an event-driven safe and resilient control framework that integrates adaptive attack-resilient strategies with data-driven human driver estimation to ensure collision-free and stable lane-changing maneuvers for automated vehicles in mixed traffic under exponentially unbounded false data injection attacks.

Yi Zhang, Yichao Wang, Wei Xiao, Mohamadamin Rajabinezhad, Shan Zuo | 2026-03-10 | cs

Generalized Per-Agent Advantage Estimation for Multi-Agent Policy Optimization

This paper proposes Generalized Per-Agent Advantage Estimation (GPAE), a novel multi-agent reinforcement learning framework that enhances sample efficiency and coordination by utilizing a per-agent value iteration operator and a double-truncated importance sampling scheme to enable stable off-policy learning without direct Q-function estimation.

Seongmin Kim, Giseung Park, Woojun Kim, Jiwon Jeon, Seungyul Han, Youngchul Sung | 2026-03-10 | cs
