Information Capacity: Evaluating the Efficiency of Large Language Models via Text Compression
This paper introduces "information capacity," a metric that evaluates the efficiency of large language models by measuring text compression performance relative to computational cost, while accounting for tokenizer efficiency. The metric enables accurate performance prediction across diverse models and reveals linguistic biases among them.
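The core idea of measuring a model's compression performance can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's actual formula: it uses the standard arithmetic-coding bound, where encoding a token with predicted probability p costs -log2(p) bits, and then divides the resulting compression ratio by a compute budget. The function names (`compressed_bits`, `information_capacity`) and the normalization by `flops` are hypothetical stand-ins for whatever the paper defines.

```python
import math


def compressed_bits(token_logprobs):
    """Bits needed to encode a sequence under a model's predictions.

    `token_logprobs` are natural-log probabilities the model assigned
    to each observed token; the arithmetic-coding bound gives a cost
    of -log2(p) bits per token.
    """
    return sum(-lp / math.log(2) for lp in token_logprobs)


def information_capacity(text, token_logprobs, flops):
    """Hypothetical efficiency score: compression ratio per unit compute.

    The true metric in the paper also accounts for tokenizer
    efficiency; this sketch only captures the compression-vs-compute
    trade-off.
    """
    original_bits = 8 * len(text.encode("utf-8"))
    compression_ratio = original_bits / compressed_bits(token_logprobs)
    return compression_ratio / flops


# Toy example: a model that assigns p = 0.5 to each of 4 tokens
# compresses "abcd" (32 bits raw) into 4 bits, a ratio of 8.
logprobs = [math.log(0.5)] * 4
score = information_capacity("abcd", logprobs, flops=2.0)  # -> 4.0
```

A stronger compressor (higher token probabilities) or a cheaper model (fewer FLOPs) both raise the score, which matches the intuition of efficiency as "prediction quality per unit of compute."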