DVD-Quant: Data-free Video Diffusion Transformers Quantization

This paper introduces DVD-Quant, a novel data-free post-training quantization framework for Video Diffusion Transformers that utilizes Bounded-init Grid Refinement, Auto-scaling Rotated Quantization, and δ-Guided Bit Switching to achieve a 2× speedup and enable W4A4 quantization without compromising visual fidelity.

Zhiteng Li, Hanxuan Li, Junyi Wu, Kai Liu, Haotong Qin, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang · 2026-03-09 · cs
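W4A4 means both weights and activations are stored in 4 bits. As a point of reference only, here is a minimal sketch of generic symmetric per-tensor 4-bit quantization; it is not DVD-Quant's actual scheme (which adds grid refinement, rotation, and bit switching), and all names are illustrative.

```python
import numpy as np

def quantize_symmetric(x, bits=4):
    """Generic symmetric uniform quantization to `bits` signed bits."""
    qmax = 2 ** (bits - 1) - 1            # 7 for 4-bit signed
    scale = np.abs(x).max() / qmax        # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate float values."""
    return q.astype(np.float32) * scale

x = np.random.randn(64).astype(np.float32)
q, s = quantize_symmetric(x, bits=4)
x_hat = dequantize(q, s)
# Rounding error of each element is bounded by scale / 2.
```

Data-free PTQ methods like the one summarized above must pick `scale` (and any rotation or bit-width decisions) without access to calibration data, which is what makes the W4A4 regime difficult.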

Instance Data Condensation for Image Super-Resolution

This paper introduces Instance Data Condensation (IDC), a novel framework utilizing Random Local Fourier Feature Extraction and Multi-level Feature Distribution Matching to synthesize a highly compact (10% volume) dataset for Image Super-Resolution that achieves performance comparable to the original full dataset while significantly reducing computational and storage requirements.

Tianhao Peng, Ho Man Kwan, Yuxuan Jiang, Ge Gao, Fan Zhang, Xiaozhong Xu, Shan Liu, David Bull · 2026-03-09 · cs
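For context on the "Random Local Fourier Feature Extraction" named above, a minimal sketch of the generic random Fourier feature projection follows; the paper's local, multi-level variant is more involved, and the function and parameter names here are hypothetical.

```python
import numpy as np

def random_fourier_features(patch, num_feats=64, sigma=1.0, seed=0):
    """Project a flattened image patch onto random cosine bases
    (textbook RFF approximation of a Gaussian kernel)."""
    rng = np.random.default_rng(seed)
    d = patch.size
    W = rng.normal(0.0, 1.0 / sigma, size=(num_feats, d))   # random frequencies
    b = rng.uniform(0.0, 2 * np.pi, size=num_feats)         # random phases
    return np.sqrt(2.0 / num_feats) * np.cos(W @ patch.ravel() + b)

z = random_fourier_features(np.ones((4, 4), dtype=np.float32))
```

Matching the distribution of such features between a small synthetic set and the full dataset is one standard way to condense data while preserving training signal.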

A Multi-Agent System Enables Versatile Information Extraction from the Chemical Literature

This paper presents a multimodal large language model-based multi-agent system that significantly outperforms existing state-of-the-art methods in automatically extracting structured chemical information from diverse and complex literature graphics, thereby advancing AI-driven chemical research.

Yufan Chen, Ching Ting Leung, Bowen Yu, Jianwei Sun, Yong Huang, Linyan Li, Hao Chen, Hanyu Gao · 2026-03-09 · cs.AI

MAP: Mitigating Hallucinations in Large Vision-Language Models with Map-Level Attention Processing

This paper introduces MAP, a training-free decoding method that mitigates hallucinations in Large Vision-Language Models by interpreting hidden states as a 2D semantic map and employing layer-wise criss-cross attention and global-local logit fusion to aggregate widely distributed factual information for improved factual consistency.

Chenxi Li, Yichen Guo, Benfang Qian, Jinhao You, Kai Tang, Yaosong Du, Zonghao Zhang, Xiande Huang · 2026-03-09 · cs.AI

SGDFuse: SAM-Guided Diffusion Model for High-Fidelity Infrared and Visible Image Fusion

The paper proposes SGDFuse, a novel two-stage conditional diffusion model guided by Segment Anything Model (SAM) semantic masks, which achieves high-fidelity infrared and visible image fusion by leveraging explicit semantic priors to preserve key targets and minimize artifacts for superior downstream task performance.

Xiaoyang Zhang, Jinjiang Li, Guodong Fan, Yakun Ju, Linwei Fan, Jun Liu, Alex C. Kot · 2026-03-09 · cs.AI