GP-Tree: An in-memory spatial index combining adaptive grid cells with a prefix tree for efficient spatial querying

The paper proposes GP-Tree, a novel in-memory spatial index that combines adaptive grid cells with a prefix tree structure to replace coarse minimum bounding rectangles with fine-grained approximations, thereby significantly improving filtering accuracy and query performance for complex spatial objects compared to traditional indexes.

Xiangyang Yang, Xuefeng Guan, Lanxue Dang, Yi Xie, Qingyang Xu, Huayi Wu, Jiayao Wang · 2026-03-10 · 💻 cs

On the Effectiveness of Code Representation in Deep Learning-Based Automated Patch Correctness Assessment

This paper presents the first extensive study of code representations for patch correctness assessment, evaluating over 500 models and demonstrating that graph-based code representations consistently outperform other methods in predicting patch correctness, thereby significantly improving the effectiveness of automated program repair tools.

Quanjun Zhang, Chunrong Fang, Haichuan Hu, Yuan Zhao, Weisong Sun, Yun Yang, Tao Zheng, Zhenyu Chen · 2026-03-10 · 💻 cs

SketchGraphNet: A Memory-Efficient Hybrid Graph Transformer for Large-Scale Sketch Corpora Recognition

This paper introduces SketchGraphNet, a memory-efficient hybrid graph transformer that models free-hand sketches as structured graphs to achieve state-of-the-art recognition accuracy on the newly constructed 3.44-million-sample SketchGraph benchmark while significantly reducing computational resource requirements.

Shilong Chen, Mingyuan Li, Zhaoyang Wang, Zhonglin Ye, Haixing Zhao · 2026-03-10 · 💻 cs

ICLR: In-Context Imitation Learning with Visual Reasoning

The paper presents ICLR, a novel framework that enhances in-context imitation learning for robots by augmenting demonstration prompts with structured visual reasoning traces and jointly training a unified autoregressive transformer to predict both future trajectories and actions, thereby improving success rates and generalization in complex manipulation tasks.

Toan Nguyen, Weiduo Yuan, Songlin Wei, Hui Li, Daniel Seita, Yue Wang · 2026-03-10 · 💻 cs

Scale-Aware UAV-to-Satellite Cross-View Geo-Localization: A Semantic Geometric Approach

This paper proposes a semantic geometric framework that leverages small vehicles as metric anchors within a decoupled stereoscopic projection model to recover absolute scale from monocular UAV images, thereby enabling scale-adaptive satellite image cropping and significantly improving cross-view geo-localization robustness under real-world scale ambiguity.

Yibin Ye, Shuo Chen, Kun Wang, Xiaokai Song, Jisheng Dang, Qifeng Yu, Xichao Teng, Zhang Li · 2026-03-10 · 💻 cs

How Long Can Unified Multimodal Models Generate Images Reliably? Taming Long-Horizon Interleaved Image Generation via Context Curation

This paper introduces UniLongGen, a training-free inference strategy that improves long-horizon interleaved image generation by dynamically curating context to discard accumulated visual noise, thereby overcoming the reliability collapse caused by dense visual token interference in unified multimodal models.

Haoyu Chen, Qing Liu, Yuqian Zhou, He Zhang, Zhaowen Wang, Mengwei Ren, Jingjing Ren, Xiang Wang, Zhe Lin, Lei Zhu · 2026-03-10 · 💻 cs

CONSTANT: Towards High-Quality One-Shot Handwriting Generation with Patch Contrastive Enhancement and Style-Aware Quantization

The paper introduces CONSTANT, a novel one-shot handwriting generation framework that leverages Style-Aware Quantization and a latent patch-based contrastive objective within a diffusion model to overcome existing limitations in capturing diverse writer styles and generating high-quality, realistic handwritten images across multiple languages.

Anh-Duy Le, Van-Linh Pham, Thanh-Nam Vo, Xuan Toan Mai, Tuan-Anh Tran · 2026-03-10 · 💻 cs

Evaluating Parkinson's Disease Detection in Anonymized Speech: A Performance and Acoustic Analysis

This paper evaluates the trade-off between privacy and Parkinson's disease detection in anonymized speech, demonstrating that while STT-TTS anonymization severely degrades diagnostic performance by erasing prosodic cues, kNN-VC effectively preserves macro-prosodic features to maintain high detection accuracy with only a minor performance drop.

Carlos Franzreb, Francisco Teixeira, Ben Luks, Sebastian Möller, Alberto Abad · 2026-03-10 · 💻 cs

Targeted Speaker Poisoning Framework in Zero-Shot Text-to-Speech

This paper introduces a novel Speech Generation Speaker Poisoning (SGSP) framework to address privacy risks in zero-shot text-to-speech by modifying trained models to prevent the generation of specific speaker identities while maintaining utility for others. It demonstrates effective protection for up to 15 speakers but reveals scalability challenges with larger sets due to identity overlap.

Thanapat Trachu, Thanathai Lertpetchpun, Sai Praneeth Karimireddy, Shrikanth Narayanan · 2026-03-10 · 💻 cs

ReconDrive: Fast Feed-Forward 4D Gaussian Splatting for Autonomous Driving Scene Reconstruction

ReconDrive is a fast, feed-forward framework that adapts the VGGT foundation model with hybrid prediction heads and static-dynamic composition to achieve high-fidelity, scalable 4D Gaussian Splatting for autonomous driving scenes, outperforming existing feed-forward methods while matching the quality of slower optimization-based approaches.

Haibao Yu, Kuntao Xiao, Jiahang Wang, Ruiyang Hao, Yuxin Huang, Guoran Hu, Haifang Qin, Bowen Jing, Yuntian Bo, Ping Luo · 2026-03-10 · 💻 cs

AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents

This paper introduces AgentRaft, an automated framework that combines program analysis and semantic reasoning to detect and quantify the systemic risk of Data Over-Exposure in LLM agents, demonstrating high accuracy and efficiency across thousands of real-world tools.

Yixi Lin (Sun Yat-sen University, Zhuhai, Guangdong, China), Jiangrong Wu (Sun Yat-sen University, Zhuhai, Guangdong, China), Yuhong Nan (Sun Yat-sen University, Zhuhai, Guangdong, China), Xueqiang Wang (University of Central Florida, Orlando, Florida, USA), Xinyuan Zhang (Sun Yat-sen University, Zhuhai, Guangdong, China), Zibin Zheng (Sun Yat-sen University, Zhuhai, Guangdong, China) · 2026-03-10 · 💻 cs

Active Inference for Micro-Gesture Recognition: EFE-Guided Temporal Sampling and Adaptive Learning

This paper proposes an active inference-based framework for micro-gesture recognition that utilizes Expected Free Energy-guided temporal sampling and uncertainty-aware adaptive learning to overcome challenges like low amplitude, noise, and inter-subject variability, demonstrating significant performance improvements on the SMG dataset.

Weijia Feng, Jingyu Yang, Ruojia Zhang, Fengtao Sun, Qian Gao, Chenyang Wang, Tongtong Su, Jia Guo, Xiaobai Li, Minglai Shao · 2026-03-10 · 💻 cs

SiamGM: Siamese Geometry-Aware and Motion-Guided Network for Real-Time Satellite Video Object Tracking

The paper proposes SiamGM, a real-time Siamese network for satellite video object tracking that integrates a geometry-aware Inter-Frame Graph Attention module and a motion-guided optimization strategy to address challenges such as small targets and occlusions while achieving 130 FPS without additional computational overhead.

Zixiao Wen, Zhen Yang, Jiawei Li, Xiantai Xiang, Guangyao Zhou, Yuxin Hu, Yuhan Liu · 2026-03-10 · 💻 cs

Efficient RGB-D Scene Understanding via Multi-task Adaptive Learning and Cross-dimensional Feature Guidance

This paper proposes an efficient multi-task RGB-D scene understanding model that integrates an enhanced fusion encoder, specialized feature interaction layers, and a dynamic adaptive loss function to simultaneously perform semantic, instance, and panoptic segmentation together with orientation estimation and scene classification, with improved accuracy and speed across multiple datasets.

Guodong Sun, Junjie Liu, Gaoyang Zhang, Bo Wu, Yang Zhang · 2026-03-10 · 💻 cs