LE-NeuS: Latency-Efficient Neuro-Symbolic Video Understanding via Adaptive Temporal Verification
LE-NeuS is a latency-efficient neuro-symbolic framework for long-form video question answering. Using CLIP-guided adaptive frame sampling and batched proposition detection, it cuts inference latency from 90x to ~10x that of base VLMs while preserving accuracy gains.
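The summary names the two latency levers but does not spell out their mechanics. The sketch below is one plausible reading, not the authors' implementation: frames are ranked by cosine similarity between precomputed CLIP-style frame embeddings and a query embedding, a small budget of well-separated frames is kept, and the kept frames are grouped into batches so a detector can be called once per batch. The names `select_frames`, `batched`, `budget`, `min_gap`, and `batch_size` are hypothetical, and the embeddings here are toy vectors rather than real CLIP outputs.

```python
import numpy as np


def select_frames(frame_emb, text_emb, budget, min_gap=1):
    """Return (sorted frame indices, similarity scores).

    Keeps at most `budget` frames, taken in order of decreasing cosine
    similarity to the query, skipping any frame closer than `min_gap`
    positions to a frame that was already kept.
    """
    # Normalize so the dot product equals cosine similarity.
    f = frame_emb / np.linalg.norm(frame_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb)
    scores = f @ t

    chosen = []
    for idx in np.argsort(-scores):  # most similar first
        idx = int(idx)
        if all(abs(idx - c) >= min_gap for c in chosen):
            chosen.append(idx)
        if len(chosen) == budget:
            break
    return sorted(chosen), scores


def batched(items, batch_size):
    """Yield consecutive chunks so each chunk can go through one detector call."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


# Toy 2-D embeddings standing in for CLIP features (assumption).
frames = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [0.0, 1.0],
    [1.0, 0.05],
    [0.0, 1.0],
])
query = np.array([1.0, 0.0])

picked, sims = select_frames(frames, query, budget=2, min_gap=2)
batches = list(batched(picked, batch_size=2))
```

With these toy inputs, frame 1 is skipped despite its high score because it sits next to frame 0, which illustrates how a spacing constraint avoids spending the budget on near-duplicate frames.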