Diffusion Alignment as Variational Expectation-Maximization
The paper introduces Diffusion Alignment as Variational Expectation-Maximization (DAV), an iterative framework for optimizing diffusion models toward downstream objectives. DAV alternates between two steps: a test-time search that finds diverse, reward-aligned samples, and a model refinement step that trains on them. The alternation is intended to mitigate reward over-optimization and mode collapse.
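To make the alternation described above concrete, the following is a toy sketch and not the paper's algorithm. It assumes a 1-D Gaussian as a stand-in for the diffusion model, a hypothetical quadratic `reward`, exponential reward weighting with a hypothetical temperature `beta` as the "search", and a weighted maximum-likelihood fit as the "refinement". Mapping search to the E-step and refinement to the M-step is an inference from the title. The floor `min_sigma` is only a crude illustrative guard against collapse; all names and parameters here are assumptions.

```python
import math
import random


def reward(x):
    # Hypothetical reward for illustration: highest at x = 2.0.
    return -(x - 2.0) ** 2


def e_step(mu, sigma, rng, n=512, beta=1.0):
    """Search: sample candidates from the current model and weight
    each one proportionally to exp(reward / beta)."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    scaled = [reward(x) / beta for x in xs]
    top = max(scaled)  # subtract the max for numerical stability
    ws = [math.exp(s - top) for s in scaled]
    total = sum(ws)
    return xs, [w / total for w in ws]


def m_step(xs, ws, min_sigma=0.3):
    """Refinement: weighted maximum-likelihood fit of the Gaussian.
    The floor on sigma keeps the toy model from collapsing to a point."""
    mu = sum(w * x for w, x in zip(ws, xs))
    var = sum(w * (x - mu) ** 2 for w, x in zip(ws, xs))
    return mu, max(math.sqrt(var), min_sigma)


def align(mu=0.0, sigma=1.0, iters=20, seed=0):
    """Alternate search and refinement for a fixed number of rounds."""
    rng = random.Random(seed)
    for _ in range(iters):
        xs, ws = e_step(mu, sigma, rng)
        mu, sigma = m_step(xs, ws)
    return mu, sigma


if __name__ == "__main__":
    mu, sigma = align()
    print(f"mu={mu:.3f} sigma={sigma:.3f}")
```

In this toy setting the mean drifts toward the reward peak while the variance floor preserves some spread. A real diffusion model would replace the Gaussian sampler and the closed-form fit with a sampling procedure and a gradient-based training step.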