17 Memory and Scheduling
Continuous batching, paged attention, prefix caching, and prefill/decode disaggregation.
Note: Status
Outline. Source: new. See INTEGRATION.md.
17.1 Problem
17.2 Design
17.3 Evolution
17.4 Trade-offs
17.5 Implementation
17.6 Further reading
Note: TODO
Establish the seminal, frontier, and primary-source anchors for this chapter.