The Concept Allocation Zone: Tracking How Concepts Form Across Transformer Depth
This paper introduces the Concept Allocation Zone (CAZ), a framework that reframes concept formation in transformer models as a depth-extended process: a concept forms across a contiguous region of the residual stream rather than at a single "best" layer. The paper proposes new metrics for identifying these zones and finds that many concepts reside in subtle, multimodal allocation regions that are causally active yet invisible to standard peak detection methods.
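The contrast between a single "best" layer and a contiguous zone can be illustrated with a minimal sketch. It assumes a per-layer score for a concept is already available (for example, some measure of how separable the concept is at each depth). The function names `peak_layer` and `find_zones`, the fixed threshold rule, and the toy scores are all hypothetical and chosen for illustration; they are not the metrics proposed in the paper.

```python
from typing import List, Tuple


def peak_layer(scores: List[float]) -> int:
    """Return the index of the single highest-scoring layer (argmax)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def find_zones(scores: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) layer ranges where scores stay >= threshold.

    Assumption: a simple fixed threshold stands in for whatever zone
    criterion the paper actually uses.
    """
    zones = []
    start = None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i  # a new zone opens at this layer
        elif s < threshold and start is not None:
            zones.append((start, i - 1))  # the zone closed at the previous layer
            start = None
    if start is not None:
        zones.append((start, len(scores) - 1))  # zone runs to the final layer
    return zones


# Synthetic per-layer scores with two bumps, invented for illustration only.
toy_scores = [0.1, 0.2, 0.6, 0.7, 0.6, 0.2, 0.1, 0.5, 0.55, 0.5, 0.1, 0.1]

print(peak_layer(toy_scores))                 # 3
print(find_zones(toy_scores, threshold=0.4))  # [(2, 4), (7, 9)]
```

On this toy profile, peak detection reports only layer 3, while the zone view recovers both the region around it and a second, weaker region at layers 7 to 9 that a single-peak summary would miss.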