ColaVLA: Leveraging Cognitive Latent Reasoning for Hierarchical Parallel Trajectory Planning in Autonomous Driving

ColaVLA is a unified vision-language-action framework that addresses the latency and modality mismatch of existing VLM-based planners. It transfers cognitive reasoning into a compact latent space and uses a hierarchical parallel decoder, achieving state-of-the-art, efficient, and safe trajectory planning on the nuScenes benchmark.

Qihang Peng, Xuesong Chen, Chenye Yang + 2 more | 2026-03-02 | 💻 cs

CPiRi: Channel Permutation-Invariant Relational Interaction for Multivariate Time Series Forecasting

CPiRi is a framework for multivariate time series forecasting that combines a spatio-temporal decoupling architecture with permutation-invariant regularization to overcome the limitations of existing channel-dependent and channel-independent models. It achieves state-of-the-art performance, robustness to channel reordering, and strong inductive generalization to unseen channels.

Jiyuan Xu, Wenyu Zhang, Xin Jing + 3 more | 2026-03-02 | 💻 cs

One2Scene: Geometric Consistent Explorable 3D Scene Generation from a Single Image

One2Scene is a framework that generates geometrically consistent, explorable 3D scenes from a single image. It decomposes the task into three stages: panorama generation, 3D scaffold construction via multi-view stereo matching on sparse anchor views, and novel view synthesis. This avoids the severe distortions and artifacts that existing methods commonly produce under large camera motions.

Pengfei Wang, Liyi Chen, Zhiyuan Ma + 3 more | 2026-03-02 | 💻 cs