vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM
This paper introduces vLLM Hook, an open-source plug-in that enables programmable access to, and manipulation of, internal model states within the vLLM inference engine. It supports advanced test-time alignment techniques such as adversarial prompt detection, enhanced retrieval-augmented generation (RAG), and activation steering.
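To make "access and manipulation of internal model states" concrete, the general idea can be sketched with plain PyTorch forward hooks. This is an illustration only and is not the vLLM Hook API: the toy model, `capture_hook`, `steering_hook`, `captured`, and `steering_vector` are all hypothetical names chosen for this example, and the only library call relied on is `torch.nn.Module.register_forward_hook`.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a model's layer stack (assumption: not a vLLM model).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

captured = {}  # hypothetical store for inspected activations


def capture_hook(module, inputs, output):
    # Read-only access: keep a detached copy of the intermediate activation.
    # Returning None leaves the module's output unchanged.
    captured["hidden"] = output.detach().clone()


steering_vector = torch.ones(8)  # hypothetical steering direction


def steering_hook(module, inputs, output):
    # A tensor returned from a forward hook replaces the module's output,
    # which is the basic mechanism behind activation steering.
    return output + 0.5 * steering_vector


x = torch.randn(1, 4)

with torch.no_grad():
    baseline = model(x)

    # 1) Inspect the hidden activation without changing the result.
    handle = model[1].register_forward_hook(capture_hook)
    observed = model(x)
    handle.remove()

    # 2) Shift the hidden activation and let the change propagate.
    handle = model[1].register_forward_hook(steering_hook)
    steered = model(x)
    handle.remove()

    # 3) With the hook removed, the model behaves as before.
    restored = model(x)
```

The captured activation is the kind of signal a detector (for example, for adversarial prompts) could consume, while the steering hook shows how an intervention at one layer changes downstream outputs. Removing the handle restores the original behavior, so such interventions can be scoped to a single request.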