ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
ReFusion is a masked diffusion model that combines sequence reorganization with a hybrid parallel-autoregressive decoding strategy. This design allows full KV cache reuse and reduces learning complexity. ReFusion significantly outperforms existing diffusion models and narrows the performance gap with autoregressive models.