Streaming Drag-Oriented Interactive Video Manipulation: Drag Anything, Anytime!

The paper introduces REVEL, a new task for streaming, fine-grained interactive video manipulation on any object at any time, and proposes DragStream, a training-free method that resolves latent distribution drift and context interference in autoregressive video diffusion models through adaptive distribution self-rectification and spatial-frequency selective optimization.

Junbao Zhou, Yuan Zhou, Kesen Zhao, Qingshan Xu, Beier Zhu, Richang Hong, Hanwang Zhang | 2026-03-10 | 💻 cs

PAD-TRO: Projection-Augmented Diffusion for Direct Trajectory Optimization

This paper introduces PAD-TRO, a novel direct trajectory optimization framework that integrates a gradient-free projection mechanism into the reverse diffusion process to generate dynamically feasible state sequences, achieving zero dynamic feasibility errors and a significantly higher success rate in complex quadrotor navigation compared to existing single-shooting approaches.

Jushan Chen, Santiago Paternain | 2026-03-10 | 💻 cs

EB-MBD: Emerging-Barrier Model-Based Diffusion for Safe Trajectory Optimization in Highly Constrained Environments

This paper introduces Emerging-Barrier Model-Based Diffusion (EB-MBD), a novel approach that integrates progressively introduced barrier functions inspired by interior point methods to overcome the sample inefficiency and catastrophic performance degradation of standard Model-Based Diffusion in highly constrained environments, achieving superior solution quality and computational efficiency without expensive projection operations.

Raghav Mishra, Ian R. Manchester | 2026-03-10 | 💻 cs

CDE: Concept-Driven Exploration for Reinforcement Learning

This paper proposes Concept-Driven Exploration (CDE), a reinforcement learning framework that leverages a pre-trained vision-language model to generate object-centric concepts as noisy supervisory signals, using concept reconstruction accuracy as an intrinsic reward to guide efficient, targeted exploration in visual control tasks and achieve robust real-world transfer.

Le Mao, Andrew H. Liu, Renos Zabounidis, Yanan Niu, Zachary Kingston, Joseph Campbell | 2026-03-10 | 💻 cs

Deliberative Dynamics and Value Alignment in LLM Debates

This paper investigates how different deliberation protocols (synchronous vs. round-robin) and model architectures influence value alignment and verdict revision in multi-turn LLM debates, revealing significant behavioral disparities where GPT-4.1 exhibits strong inertia and autonomy-focused reasoning while Claude 3.7 Sonnet and Gemini 2.0 Flash demonstrate greater flexibility, empathy, and susceptibility to order effects.

Pratik S. Sachdeva, Tom van Nuenen | 2026-03-10 | 💻 cs

Reallocating Attention Across Layers to Reduce Multimodal Hallucination

This paper proposes a lightweight, training-free plugin called Functional Head Identification and Class-Conditioned Rescaling that mitigates multimodal hallucinations in large reasoning models by adaptively rebalancing perception and reasoning contributions across layers, achieving significant performance gains with minimal computational overhead.

Haolang Lu, Bolun Chu, WeiYe Fu, Guoshun Nan, Junning Liu, Minghui Pan, Qiankun Li, Yi Yu, Hua Wang, Kun Wang | 2026-03-10 | 💻 cs

Preference-Conditioned Multi-Objective RL for Integrated Command Tracking and Force Compliance in Humanoid Locomotion

This paper proposes a preference-conditioned multi-objective reinforcement learning framework that enables a single humanoid locomotion policy to dynamically balance accurate command tracking with compliant responses to external forces, validated through stable training and successful deployment in both simulation and real-world experiments.

Tingxuan Leng, Yushi Wang, Tinglong Zheng, Changsheng Luo, Mingguo Zhao | 2026-03-10 | 💻 cs

Unsupervised Deep Generative Models for Anomaly Detection in Neuroimaging: A Systematic Scoping Review

This systematic scoping review synthesizes thirty-three studies on unsupervised deep generative models for neuroimaging anomaly detection, highlighting their potential for pathology-agnostic localization in data-scarce settings while identifying key challenges such as methodological heterogeneity and limited external validation.

Youwan Mahé, Elise Bannier, Stéphanie Leplaideur, Elisa Fromont, Francesca Galassi | 2026-03-10 | 💻 cs

Taming Modality Entanglement in Continual Audio-Visual Segmentation

This paper introduces the Continual Audio-Visual Segmentation (CAVS) task and proposes a Collision-based Multi-modal Rehearsal (CMR) framework that effectively addresses multi-modal semantic drift and co-occurrence confusion through novel sample selection and frequency adjustment strategies, significantly outperforming existing single-modal continual learning methods.

Yuyang Hong, Qi Yang, Tao Zhang, Zili Wang, Zhaojin Fu, Kun Ding, Bin Fan, Shiming Xiang | 2026-03-10 | 💻 cs

PolyJailbreak: Cross-Modal Jailbreaking Attacks on Black-Box Multimodal LLMs

This paper introduces PolyJailbreak, a novel black-box framework that exploits multimodal safety asymmetries through a structured library of atomic strategies and reinforcement learning-based multi-agent optimization to achieve significantly higher jailbreak success rates on state-of-the-art multimodal large language models compared to existing methods.

Xinkai Wang, Beibei Li, Zerui Shao, Ao Liu, Guangquan Xu, Shouling Ji | 2026-03-10 | 💻 cs