Egocentric Co-Pilot: Web-Native Smart-Glasses Agents for Assistive Egocentric AI

This paper introduces Egocentric Co-Pilot, a web-native neuro-symbolic framework for smart glasses that combines an LLM-orchestrated toolset with temporal reasoning and multimodal intent mapping to deliver always-on assistive AI for navigation and daily tasks. In both cloud and local deployment evaluations, the authors report state-of-the-art performance and higher user satisfaction than commercial baselines.

Sicheng Yang, Yukai Huang, Weitong Cai + 8 more · 2026-03-03 · 🤖 cs.AI

GroundedSurg: A Multi-Procedure Benchmark for Language-Conditioned Surgical Tool Segmentation

This paper introduces GroundedSurg, the first multi-procedure benchmark for evaluating language-conditioned, instance-level surgical tool segmentation. It pairs surgical images with natural language descriptions and precise spatial annotations, addressing the limitations of the category-level evaluation paradigms currently used in clinical AI.

Tajamul Ashraf, Abrar Ul Riyaz, Wasif Tak + 4 more · 2026-03-03 · 💻 cs

Teacher-Guided Causal Interventions for Image Denoising: Orthogonal Content-Noise Disentanglement in Vision Transformers

The paper proposes TCD-Net, a Vision Transformer-based image denoising framework that uses teacher-guided causal interventions, including environmental bias adjustment and orthogonal content-noise disentanglement, to eliminate spurious correlations. The authors report state-of-the-art fidelity and real-time performance.

Kuai Jiang, Zhaoyan Ding, Guijuan Zhang + 2 more · 2026-03-03 · 💻 cs

TC-SSA: Token Compression via Semantic Slot Aggregation for Gigapixel Pathology Reasoning

This paper proposes TC-SSA, a learnable token compression framework that uses gated semantic slot aggregation to process gigapixel whole slide images efficiently. It reduces the visual tokens to 1.7% of the original sequence while preserving diagnostically critical information, and it outperforms existing sampling-based methods on both reasoning and classification tasks.

Zhuo Chen, Shawn Young, Lijian Xu · 2026-03-03 · 🤖 cs.AI

GRAD-Former: Gated Robust Attention-based Differential Transformer for Change Detection

GRAD-Former is a parameter-efficient framework for remote sensing change detection. It uses a gated robust attention mechanism with Adaptive Feature Relevance and Refinement to overcome the difficulty existing models have with high-resolution imagery and limited training data, and the authors report state-of-the-art performance across multiple datasets.

Durgesh Ameta, Ujjwal Mishra, Praful Hambarde + 1 more · 2026-03-03 · 🤖 cs.AI