AlphaFlowTSE: One-Step Generative Target Speaker Extraction via Conditional AlphaFlow
AlphaFlowTSE is a one-step conditional generative model for target speaker extraction. It combines an AlphaFlow objective that requires no Jacobian-vector products (JVP-free) with interval-consistency training to achieve high-fidelity speech recovery at low latency and improved generalization to downstream ASR tasks.
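
To make the "one-step" idea concrete, here is a toy sketch, not the AlphaFlowTSE model itself. It assumes that a one-step flow model predicts an average velocity over a time interval [r, t], a detail the summary above does not spell out. The ODE dz/dt = -z and all function names are hypothetical stand-ins chosen because the average velocity has a closed form. With the exact average velocity, a single update lands on the true endpoint, whereas Euler integration of the instantaneous velocity needs many steps to get close.

```python
import math


def instantaneous_velocity(z, t):
    # Toy velocity field dz/dt = -z (assumption: stands in for a learned network).
    return -z


def average_velocity(z_t, r, t):
    # Exact average velocity over [r, t] for the toy ODE:
    # (z_t - z_r) / (t - r), where z_r = z_t * exp(t - r).
    return z_t * (1.0 - math.exp(t - r)) / (t - r)


def one_step(z_t, r, t):
    # Single update from time t back to time r using the average velocity.
    return z_t - (t - r) * average_velocity(z_t, r, t)


def euler(z_t, r, t, n_steps):
    # Multi-step baseline: integrate the instantaneous velocity from t down to r.
    dt = (t - r) / n_steps
    z, s = z_t, t
    for _ in range(n_steps):
        z = z - dt * instantaneous_velocity(z, s)
        s -= dt
    return z


if __name__ == "__main__":
    exact = math.exp(1.0)  # true value of z at time 0 when z at time 1 equals 1
    print("exact      :", exact)
    print("one step   :", one_step(1.0, 0.0, 1.0))
    print("euler x1   :", euler(1.0, 0.0, 1.0, 1))
    print("euler x1000:", euler(1.0, 0.0, 1.0, 1000))
```

In a real system the average velocity would be a conditioned neural network rather than a closed form, and a JVP-free objective would presumably be chosen to avoid the cost of differentiating through that network during training.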