Multi-Stream Perturbation Attack: Breaking Safety Alignment of Thinking LLMs Through Concurrent Task Interference
This paper proposes a "multi-stream perturbation attack" that exploits vulnerabilities in the step-by-step reasoning of thinking-mode LLMs. By interweaving multiple concurrent task streams into a single prompt, the attack disrupts safety alignment, achieving high attack success rates and inducing reasoning collapse or repetitive outputs across a range of models.