UnSCAR: Universal, Scalable, Controllable, and Adaptable Image Restoration

The paper introduces UnSCAR, a scalable and controllable universal image restoration framework. It uses a multi-branch mixture-of-experts architecture to counter the catastrophic forgetting and performance degradation that existing all-in-one models suffer when handling multiple real-world degradations.

Debabrata Mandal, Soumitri Chattopadhyay, Yujie Wang, Marc Niethammer, Praneeth Chakravarthula | 2026-03-10 | cs

Generalization in Online Reinforcement Learning for Mobile Agents

This paper addresses the underexplored challenge of generalization in online reinforcement learning for mobile GUI agents by introducing the AndroidWorld-Generalization benchmark and a scalable GRPO-based training system. It shows that RL significantly improves zero-shot performance on unseen task instances, but generalization to new templates and applications remains difficult and benefits from test-time few-shot adaptation.

Li Gu, Zihuan Jiang, Zhixiang Chi, Huan Liu, Ziqiang Wang, Yuanhao Yu, Glen Berseth, Yang Wang | 2026-03-10 | cs.LG

DogWeave: High-Fidelity 3D Canine Reconstruction from a Single Image via Normal Fusion and Conditional Inpainting

DogWeave is a framework that reconstructs high-fidelity 3D canine models from a single RGB image. It refines parametric meshes into detailed SDF representations via diffusion-enhanced normal optimization and generates view-consistent textures through conditional inpainting, overcoming challenges such as self-occlusion and fur detail to outperform existing state-of-the-art methods.

Shufan Sun, Chenchen Wang, Zongfu Yu | 2026-03-10 | cs

Med-Evo: Test-time Self-evolution for Medical Multimodal Large Language Models

Med-Evo is a self-evolution framework for medical multimodal large language models. It applies label-free reinforcement learning, built on Feature-driven Pseudo Labeling and Hard-Soft Reward mechanisms, to significantly improve model performance on unlabeled test data without requiring additional annotated medical datasets.

Dunyuan Xu, Xikai Yang, Juzheng Miao, Yaoqian Li, Jinpeng Li, Pheng-Ann Heng | 2026-03-10 | cs

SLNet: A Super-Lightweight Geometry-Adaptive Network for 3D Point Cloud Recognition

The paper introduces SLNet, a super-lightweight 3D point cloud recognition network that uses Nonparametric Adaptive Point Embedding (NAPE) and Geometric Modulation Units (GMU) to achieve state-of-the-art accuracy on benchmarks such as ModelNet40 and ScanObjectNN with significantly fewer parameters and lower computational cost than existing models.

Mohammad Saeid, Amir Salarpour, Pedram MohajerAnsari, Mert D. Pesé | 2026-03-10 | cs.LG

Selective Transfer Learning of Cross-Modality Distillation for Monocular 3D Object Detection

This paper introduces MonoSTL, a selective transfer learning framework for monocular 3D object detection that addresses the negative transfer caused by modality gaps in cross-modality distillation. It employs similar architectures and novel depth-aware selective distillation modules to transfer LiDAR depth information to image-based networks, achieving state-of-the-art performance on the KITTI and nuScenes benchmarks.

Rui Ding, Meng Yang, Nanning Zheng | 2026-03-10 | cs

Classifying Novel 3D-Printed Objects without Retraining: Towards Post-Production Automation in Additive Manufacturing

This paper introduces the ThingiPrint dataset and a contrastive fine-tuning approach that enables the classification of novel 3D-printed objects using their CAD models without requiring model retraining, thereby addressing a critical bottleneck in automating industrial post-production workflows.

Fanis Mathioulakis, Gorjan Radevski, Silke GC Cleuren, Michel Janssens, Brecht Das, Koen Schauwaert, Tinne Tuytelaars | 2026-03-10 | cs

FedEU: Evidential Uncertainty-Driven Federated Fine-Tuning of Vision Foundation Models for Remote Sensing Image Segmentation

FedEU is a federated learning framework for remote sensing image segmentation. It integrates evidential uncertainty quantification and client-specific feature embeddings to guide adaptive global aggregation, improving model robustness and reliability across heterogeneous distributed datasets.

Xiaokang Zhang, Xuran Xiong, Jianzhong Huang, Lefei Zhang | 2026-03-10 | cs

RobustSCI: Beyond Reconstruction to Restoration for Snapshot Compressive Imaging under Real-World Degradations

This paper introduces RobustSCI, a framework that shifts snapshot compressive imaging from simple reconstruction to robust restoration. It proposes a novel network architecture and a large-scale benchmark for recovering pristine scenes from measurements degraded by real-world motion blur and low light.

Hao Wang, Yuanfan Li, Qi Zhou, Zhankuo Xu, Jiong Ni, Xin Yuan | 2026-03-10 | cs

RayD3D: Distilling Depth Knowledge Along the Ray for Robust Multi-View 3D Object Detection

The paper proposes RayD3D, a cross-modal distillation framework that transfers depth knowledge specifically along the camera-to-object ray, filtering out irrelevant LiDAR information. This significantly improves the robustness of multi-view 3D object detection models against real-world data corruptions without increasing inference cost.

Rui Ding, Zhaonian Kuang, Zongwei Zhou, Meng Yang, Xinhu Zheng, Gang Hua | 2026-03-10 | cs

DocCogito: Aligning Layout Cognition and Step-Level Grounded Reasoning for Document Understanding

DocCogito is a unified framework for document understanding that aligns global layout perception with structured, region-grounded reasoning through a lightweight layout tower and a deterministic Visual-Semantic Chain. By enforcing systematic coupling between layout priors and evidence-based reasoning, it achieves state-of-the-art performance on multiple benchmarks.

Yuchuan Wu, Minghan Zhuo, Teng Fu, Mengyang Zhao, Bin Li, Xiangyang Xue | 2026-03-10 | cs