VideoPulse: Neonatal heart rate and peripheral capillary oxygen saturation (SpO2) estimation from contact free video

The paper introduces VideoPulse, a comprehensive dataset and end-to-end deep learning pipeline that enables accurate, contact-free estimation of neonatal heart rate and SpO2 from facial video, offering a low-cost, non-invasive alternative to traditional adhesive monitoring methods in intensive care settings.

Deependra Dewagiri, Kamesh Anuradha, Pabadhi Liyanage + 6 more | 2026-03-02 | ⚡ eess

Breaking the Data Barrier: Robust Few-Shot 3D Vessel Segmentation using Foundation Models

This paper proposes a novel few-shot 3D vessel segmentation framework that adapts the pre-trained DINOv3 foundation model with specialized 3D components to achieve superior performance and robustness in data-scarce and out-of-distribution clinical scenarios, significantly outperforming state-of-the-art methods like nnU-Net with only five training samples.

Kirato Yoshihara, Yohei Sugawara, Yuta Tokuoka + 1 more | 2026-03-02 | ⚡ eess

See, Act, Adapt: Active Perception for Unsupervised Cross-Domain Visual Adaptation via Personalized VLM-Guided Agent

The paper proposes Sea^2, an unsupervised cross-domain adaptation framework that employs a VLM-guided agent to actively navigate and select optimal viewpoints for frozen perception models, thereby significantly improving performance on tasks like visual grounding, segmentation, and 3D box estimation without requiring downstream labels or model retraining.

Tianci Tang, Tielong Cai, Hongwei Wang + 1 more | 2026-03-02 | 🤖 cs.AI

Revisiting Integration of Image and Metadata for DICOM Series Classification: Cross-Attention and Dictionary Learning

This paper proposes a robust end-to-end multimodal framework for DICOM series classification that leverages bi-directional cross-attention and a sparse, missingness-aware dictionary learning encoder to effectively handle heterogeneous image content, variable series lengths, and incomplete metadata without requiring imputation, thereby outperforming existing baselines in both in-domain and out-of-domain settings.

Tuan Truong, Melanie Dohmen, Sara Lorio + 1 more | 2026-03-02 | ⚡ eess

Polarization Uncertainty-Guided Diffusion Model for Color Polarization Image Demosaicking

This paper proposes a Polarization Uncertainty-Guided Diffusion Model that leverages image diffusion priors and explicitly models polarization uncertainty to reconstruct high-fidelity color polarization images, addressing the difficulty that existing network-based methods have in recovering polarization characteristics when training data is scarce.

Chenggong Li, Yidong Luo, Junchao Zhang + 1 more | 2026-03-02 | ⚡ eess

Open-Vocabulary Semantic Segmentation in Remote Sensing via Hierarchical Attention Masking and Model Composition

This paper introduces ReSeg-CLIP, a training-free open-vocabulary semantic segmentation method for remote sensing that achieves state-of-the-art performance by combining hierarchical attention masking with SAM-generated masks and a novel model composition strategy that averages multiple RS-specific CLIP variants.

Mohammadreza Heidarianbaei, Mareike Dorozynski, Hubert Kanyamahanga + 2 more | 2026-03-02 | 💻 cs

Bandwidth-adaptive Cloud-Assisted 360-Degree 3D Perception for Autonomous Vehicles

This paper proposes a bandwidth-adaptive, cloud-assisted framework for autonomous vehicles that dynamically splits transformer-based 360-degree 3D perception tasks between the vehicle and the cloud using feature compression and quantization, achieving a 72% latency reduction and up to 20% accuracy improvement over static methods under fluctuating network conditions.

Faisal Hawladera, Rui Meireles, Gamal Elghazaly + 2 more | 2026-03-02 | 🤖 cs.LG