Accelerating Text-to-Video Generation with Calibrated Sparse Attention
The paper introduces CalibAtt, a training-free method that accelerates text-to-video generation. An offline calibration process identifies attention connections that are both stable and negligible, and these connections are skipped at inference time. The method achieves up to 1.58x speedup while maintaining generation quality across various models.
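
The calibrate-then-skip idea can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the block-level granularity, the attention-mass threshold, the rule of always keeping diagonal blocks, and every name here (`calibrate_block_mask`, `sparse_attention`, `block`, `threshold`) are assumptions made for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def dense_attention(q, k, v):
    # Reference full attention for one head: softmax(Q K^T / sqrt(d)) V.
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v


def calibrate_block_mask(q_samples, k_samples, block=4, threshold=0.01):
    """Offline pass (hypothetical criterion): keep a (query-block, key-block)
    pair if its attention mass exceeds `threshold` in ANY calibration sample.
    Pairs that stay below the threshold in every sample are treated as
    stable and negligible, and are marked for skipping.
    Assumes the sequence length is divisible by `block`."""
    n = q_samples[0].shape[0]
    nb = n // block
    # Assumption: always keep diagonal blocks so no query block is left empty.
    keep = np.eye(nb, dtype=bool)
    for q, k in zip(q_samples, k_samples):
        probs = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        # Sum over keys inside each key block, then average over the
        # queries inside each query block -> (nb, nb) mass matrix.
        mass = probs.reshape(nb, block, nb, block).sum(axis=3).mean(axis=1)
        keep |= mass > threshold
    return keep


def sparse_attention(q, k, v, keep, block=4):
    """Inference pass: each query block attends only to its kept key blocks."""
    n, d = q.shape
    nb = n // block
    out = np.zeros_like(v)
    for qb in range(nb):
        rows = slice(qb * block, (qb + 1) * block)
        kb_idx = np.flatnonzero(keep[qb])
        # Expand kept block indices into token indices.
        cols = (kb_idx[:, None] * block + np.arange(block)).ravel()
        scores = q[rows] @ k[cols].T / np.sqrt(d)
        out[rows] = softmax(scores) @ v[cols]
    return out
```

In this sketch the mask is computed once from calibration samples and then reused for every later call, which is what makes the approach training-free. The Python loop itself would not run faster than dense attention; any real speedup would depend on a kernel that actually skips the masked blocks.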