Proact-VL: A Proactive VideoLLM for Real-Time AI Companions
This paper introduces Proact-VL, a general framework for turning multimodal language models into proactive, real-time AI companions. The framework addresses the latency and decision-making challenges of real-time interaction, and it is validated on the new Live Gaming Benchmark, which covers commentary and guidance scenarios.