The Texture-Shape Dilemma: Boundary-Safe Synthetic Generation for 3D Medical Transformers

This paper addresses the limitations of existing formula-driven synthetic data by proposing a Physics-inspired Spatially-Decoupled Synthesis framework that resolves the texture-shape conflict through a gradient-shielded buffer zone and spectral texture injection, thereby significantly enhancing the performance of 3D medical Vision Transformers on BTCV and MSD datasets without relying on real patient data.

Jiaqi Tang, Weixuan Xu, Shu Zhang + 2 more · 2026-03-03 · 💻 cs

RaUF: Learning the Spatial Uncertainty Field of Radar

This paper proposes RaUF, a spatial uncertainty field learning framework that addresses the low fidelity and ambiguity of millimeter-wave radar by modeling anisotropic probabilistic uncertainty and employing a bidirectional domain attention mechanism to suppress spurious returns, thereby delivering highly reliable spatial detections with well-calibrated uncertainty for downstream perception tasks.

Shengpeng Wang, Kuangyu Wang, Wei Wang · 2026-03-03 · 💻 cs

SMR-Net: Robot Snap Detection Based on Multi-Scale Features and Self-Attention Network

To address the limitations of traditional visual methods in robot automated assembly, this paper proposes SMR-Net, a self-attention-based multi-scale detection algorithm paired with a dedicated sensor, which significantly improves snap localization precision and robustness in complex scenarios by integrating attention-enhanced feature extraction, parallel multi-scale processing, and adaptive reweighting.

Kuanxu Hou · 2026-03-03 · 💻 cs

SHIELD8-UAV: Sequential 8-bit Hardware Implementation of a Precision-Aware 1D-F-CNN for Low-Energy UAV Acoustic Detection and Temporal Tracking

This paper presents SHIELD8-UAV, a low-energy, sequential 8-bit hardware accelerator for UAV acoustic detection that achieves real-time, precision-aware inference on resource-constrained edge devices through a shared multi-precision datapath, layer-sensitivity quantization, and structured channel pruning.

Susmita Ghanta, Karan Nathwani, Rohit Chaurasiya · 2026-03-03 · ⚡ eess

Unified Vision-Language Modeling via Concept Space Alignment

This paper introduces V-SONAR, a unified vision-language embedding space aligned with the multilingual SONAR text space, and leverages it to develop V-LCM, a model that achieves state-of-the-art performance in video captioning and significantly outperforms existing vision-language models across 61 diverse languages through concept space alignment and latent diffusion training.

Yifu Qiu, Paul-Ambroise Duquenne, Holger Schwenk · 2026-03-03 · 💬 cs.CL