When Token Pruning is Worse than Random: Understanding Visual Token Information in VLLMs
This paper shows that visual token information in Vision Large Language Models (VLLMs) progressively vanishes with depth: past a depth-dependent "information horizon," existing pruning methods underperform random token selection. Building on this finding, the authors propose a strategy that integrates random pruning in deep layers, achieving state-of-the-art efficiency without sacrificing accuracy.
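To make the idea concrete, here is a minimal PyTorch sketch of depth-aware pruning, not the paper's actual implementation: before a hypothetical `horizon_layer`, tokens are kept by a top-k importance score (e.g., attention received from text tokens); beyond it, a uniform random subset is kept instead. All names, shapes, and the scoring signal are assumptions for illustration.

```python
import torch

def prune_visual_tokens(
    hidden_states: torch.Tensor,  # (batch, num_visual_tokens, dim)
    importance: torch.Tensor,     # (batch, num_visual_tokens), e.g. attention scores
    keep_ratio: float,            # fraction of visual tokens to keep
    layer_idx: int,               # index of the current decoder layer
    horizon_layer: int,           # assumed depth of the information horizon
) -> torch.Tensor:
    batch, n, dim = hidden_states.shape
    k = max(1, int(n * keep_ratio))
    if layer_idx < horizon_layer:
        # Before the horizon: importance scores still track token
        # information, so keep the top-k tokens by score.
        idx = importance.topk(k, dim=1).indices
    else:
        # Beyond the horizon: scores no longer reflect information
        # content, so a uniform random subset avoids their bias.
        idx = torch.stack(
            [torch.randperm(n, device=hidden_states.device)[:k]
             for _ in range(batch)]
        )
    idx = idx.sort(dim=1).values  # preserve original token order
    return hidden_states.gather(1, idx.unsqueeze(-1).expand(-1, -1, dim))
```

The single branch on `layer_idx` captures the paper's core claim: the useful pruning criterion changes with depth, so a fixed score-based rule applied at every layer can do worse than random selection in deep layers.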