Today’s large language models (LLMs) all share a structural bottleneck: they generate tokens sequentially. One. At. A. Time. Models such…