LinearSR: Unlocking Linear Attention for Stable and Efficient Image Super-Resolution

This paper introduces LinearSR, a holistic framework for stable and efficient photorealistic image super-resolution. It overcomes linear attention's historical training instability and perception-distortion trade-off through strategies such as ESGF, SNR-based MoE, and TAG, achieving state-of-the-art quality with exceptional computational efficiency.

Xiaohui Li, Shaobin Zhuang, Shuo Cao + 6 more · 2026-03-03 · 💻 cs

See the Speaker: Crafting High-Resolution Talking Faces from Speech with Prior Guidance and Region Refinement

This paper presents a method for generating high-resolution, high-quality talking face videos from a single speech input alone. It uses a speech-conditioned diffusion model with statistical facial priors, region-enhanced lip synchronization, and a Transformer-based discrete codebook for end-to-end detail refinement.

Jinting Wang, Jun Wang, Hei Victor Cheng + 1 more · 2026-03-03 · ⚡ eess

ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning

The paper introduces ThinkMorph, a unified model fine-tuned on high-quality interleaved reasoning traces that treats text and image thoughts as complementary modalities, achieving significant performance gains on vision-centric benchmarks and demonstrating emergent multimodal intelligence such as adaptive reasoning and unseen visual manipulation skills.

Jiawei Gu, Yunzhuo Hao, Huichen Will Wang + 5 more · 2026-03-03 · 💻 cs