
Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space
Authors
Abstract
Large Language Models (LLMs) apply uniform computation to every token, even though the information density of language is highly non-uniform. This token-uniform regime wastes capacity on locally predictable spans while under-allocating computation to semantically critical transitions.
We propose Dynamic Large Concept Models (DLCM), a hierarchical language modeling framework that learns semantic boundaries from latent representations and shifts computation from tokens to a compressed concept space where reasoning is more efficient. DLCM discovers variable-length concepts end-to-end without relying on predefined linguistic units.
Hierarchical compression fundamentally changes scaling behavior. We introduce the first compression-aware scaling law, which disentangles token-level capacity, concept-level reasoning capacity, and compression ratio, enabling principled compute allocation under a fixed FLOPs budget. To train this heterogeneous architecture stably, we further develop a decoupled μP parametrization that supports zero-shot hyperparameter transfer across widths and compression regimes. At a practical setting ($R=4$, corresponding to an average of four tokens per concept), DLCM reallocates roughly one-third of inference compute into a higher-capacity reasoning backbone, achieving a +2.69% average improvement across 12 zero-shot benchmarks under matched inference FLOPs.
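As a rough illustration of the matched-FLOPs accounting behind this reallocation (a minimal sketch, not the paper's actual scaling law or model sizes: the FLOPs-per-position approximation and all widths and layer counts below are hypothetical placeholders), the snippet compares a token-uniform baseline with a DLCM-style split at $R=4$, where the concept-level backbone runs on one position per four tokens and its cost per token is amortized by $1/R$.

```python
# Hedged sketch of per-token inference FLOPs for a two-level model:
# a token-level stack that runs on every token, plus a concept-level
# backbone that runs on a sequence compressed by a factor R.

def dense_flops_per_position(d_model: int, n_layers: int) -> float:
    """Rough estimate: ~12 * d_model^2 parameters per transformer layer
    (attention + 4x MLP), ~2 FLOPs per parameter per position; the
    sequence-length-dependent attention term is ignored."""
    return 2.0 * 12.0 * n_layers * d_model ** 2

def dlcm_flops_per_token(d_tok: int, n_tok_layers: int,
                         d_con: int, n_con_layers: int, R: float) -> float:
    """Token-level layers process every token; the concept-level backbone
    processes one position per R tokens, so its per-token cost is 1/R."""
    token_level = dense_flops_per_position(d_tok, n_tok_layers)
    concept_level = dense_flops_per_position(d_con, n_con_layers) / R
    return token_level + concept_level

# Token-uniform baseline: all compute spent at the token level.
baseline = dense_flops_per_position(d_model=2048, n_layers=24)

# DLCM-style split at R = 4: a thinner token level plus a wider, deeper
# concept backbone, sized so total per-token FLOPs roughly match the baseline
# while about one-third of the budget moves into the backbone.
dlcm = dlcm_flops_per_token(d_tok=2048, n_tok_layers=16,
                            d_con=2560, n_con_layers=20, R=4.0)

print(f"baseline FLOPs/token: {baseline:.3e}")
print(f"DLCM FLOPs/token:     {dlcm:.3e}  (ratio {dlcm / baseline:.2f})")
```

With these placeholder sizes the two configurations land within about 1% of each other in per-token FLOPs, yet each concept position receives more backbone compute than any single token does in the baseline, which is the sense in which capacity is shifted from tokens to concepts.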