Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery

This paper proposes a reparameterized Tensor Ring functional decomposition that leverages Implicit Neural Representations and a structured basis combination to overcome the high-frequency modeling limitations of traditional methods, achieving superior performance in multi-dimensional data recovery tasks such as image inpainting and point cloud reconstruction.

Yangyang Xu, Junbo Ke, You-Wei Wen, Chao Wang | 2026-03-09 | 🤖 cs.AI

Think-as-You-See: Streaming Chain-of-Thought Reasoning for Large Vision-Language Models

This paper introduces Think-as-You-See (TaYS), a unified framework that enables concurrent, streaming Chain-of-Thought reasoning for Large Vision-Language Models by decoupling visual encoding from textual reasoning, thereby outperforming traditional batch and interleaved approaches in both accuracy and latency for real-time video understanding.

Jialiang Zhang, Junlong Tong, Junyan Lin, Hao Wu, Yirong Sun, Yunpu Ma, Xiaoyu Shen | 2026-03-09 | 💻 cs

Omni-C: Compressing Heterogeneous Modalities into a Single Dense Encoder

The paper introduces Omni-C, a single dense Transformer encoder that compresses heterogeneous modalities (text, audio, and image) into shared representations via unimodal contrastive pretraining, thereby eliminating the parameter overhead and routing complexity of Mixture-of-Expert architectures while achieving comparable performance with significantly reduced memory usage.

Kin Wai Lau, Yasar Abbas Ur Rehman, Lai-Man Po, Pedro Porto Buarque de Gusmão | 2026-03-09 | 🤖 cs.AI

Clinical-Injection Transformer with Domain-Adapted MAE for Lupus Nephritis Prognosis Prediction

This paper proposes a novel multimodal framework, the Clinical-Injection Transformer with a domain-adapted MAE, which integrates routine PAS-stained histopathology images and clinical data to achieve high-accuracy three-class prognosis prediction for pediatric lupus nephritis, addressing previous limitations in data availability and modality integration.

Yuewen Huang, Zhitao Ye, Guangnan Feng, Fudan Zheng, Xia Gao, Yutong Lu | 2026-03-09 | 🤖 cs.LG

Digital-Twin Losses for Lane-Compliant Trajectory Prediction at Urban Intersections

This paper presents a digital-twin-driven V2X trajectory prediction framework for urban intersections that combines a novel twin loss function with standard MSE to enforce traffic rules, collision avoidance, and motion diversity, thereby significantly reducing safety violations while maintaining high prediction accuracy and real-time performance.

Kuo-Yi Chao, Erik Leo Haß, Melina Gegg, Jiajie Zhang, Ralph Raßhofer, Alois Christian Knoll | 2026-03-09 | 💻 cs

DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces

DreamCAD is a multi-modal generative framework that enables scalable, high-fidelity CAD generation by representing editable BReps as differentiable parametric surfaces for training on unannotated 3D meshes, while also introducing the large-scale CADCap-1M dataset to advance text-to-CAD research.

Mohammad Sadil Khan, Muhammad Usama, Rolandos Alexandros Potamias, Didier Stricker, Muhammad Zeshan Afzal, Jiankang Deng, Ismail Elezi | 2026-03-09 | 🤖 cs.AI

Adversarial Batch Representation Augmentation for Batch Correction in High-Content Cellular Screening

This paper proposes Adversarial Batch Representation Augmentation (ABRA), a domain generalization framework that synthesizes worst-case bio-batch perturbations via structured uncertainty modeling and angular geometric margins to achieve state-of-the-art batch correction and generalization in high-content cellular screening without relying on additional prior knowledge.

Lei Tong, Xujing Yao, Adam Corrigan, Long Chen, Navin Rathna Kumar, Kerry Hallbrook, Jonathan Orme, Yinhai Wang, Huiyu Zhou | 2026-03-09 | 🤖 cs.AI

Post Fusion Bird's Eye View Feature Stabilization for Robust Multimodal 3D Detection

This paper introduces the Post Fusion Stabilizer (PFS), a lightweight, plug-and-play module that makes existing camera-LiDAR fusion 3D detectors more robust to domain shifts and sensor failures by stabilizing bird's-eye-view feature statistics and adaptively correcting degraded cues, without requiring architectural changes or retraining.

Trung Tien Dong, Dev Thakkar, Arman Sargolzaei, Xiaomin Lin | 2026-03-09 | 🤖 cs.AI

Making Reconstruction FID Predictive of Diffusion Generation FID

This paper introduces interpolated FID (iFID), a novel metric that achieves a strong correlation with diffusion generation FID by interpolating latent representations between dataset samples and their nearest neighbors, thereby overcoming the limitations of traditional reconstruction FID in predicting generative model quality.

Tongda Xu, Mingwei He, Shady Abu-Hussein, Jose Miguel Hernandez-Lobato, Haotian Zhang, Kai Zhao, Chao Zhou, Ya-Qin Zhang, Yan Wang | 2026-03-09 | 🤖 cs.LG

When Rubrics Fail: Error Enumeration as Reward in Reference-Free RL Post-Training for Virtual Try-On

This paper introduces Implicit Error Counting (IEC), a reference-free reinforcement learning post-training method that enumerates and weights errors to generate rewards, demonstrating superior performance over Rubrics as Rewards (RaR) in virtual try-on tasks where multiple valid outputs exist and ideal reference answers are unavailable.

Wisdom Ikezogwo, Mehmet Saygin Seyfioglu, Ranjay Krishna, Karim Bouyarmane | 2026-03-09 | 🤖 cs.AI