vLLM Spyre Roadmap: Q3 2025

Features

Feature	Priority	PRs
Continuous batching (homogeneous Tkv)	P0
FP8 model loading	P0	#316
Embedding model support (V1)	P0
LoRA support	P1
Continuous batching (heterogeneous Tkv)	P1
Prefix caching (full/majority matching)	P1

Feature	Priority	PRs
Deprecate V0 API	P0	#241, #344
Use BlockManager for batching	P1
Replace FMS model loading with vLLM	P2

Feature	Priority	PRs
Continuous batching (homogeneous Tkv)	P0
Precompiled model loading with continuous batching	P0
128K context length support	P0
FP8 model loading	P0	#350, #359

See vLLM's Q3-2025 roadmap for its incoming features.