pip install kvboost

KVBoost

Faster LLM Inference. Less VRAM. No Model Changes.

Chunk-level KV cache reuse · FlashAttention-2 · AWQ layer streaming · CPU paged decoding
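
To make "chunk-level KV cache reuse" concrete, here is a minimal conceptual sketch in plain Python. It does not use the kvboost package or claim to reflect its API or internals: `ChunkKVCache`, `chunk_key`, `CHUNK_SIZE`, and the `compute_kv` callback are all hypothetical names chosen for illustration. The idea shown is only the bookkeeping: split the prompt into fixed-size token chunks, key each chunk by a content hash, and recompute key/value state only for chunks not seen before.

```python
import hashlib
from typing import Callable, Dict, List, Sequence, Tuple

CHUNK_SIZE = 4  # hypothetical; chosen small so the example is easy to follow


def chunk_key(tokens: Sequence[int]) -> str:
    """Content hash of one chunk of token ids."""
    return hashlib.sha256(",".join(map(str, tokens)).encode()).hexdigest()


class ChunkKVCache:
    """Toy chunk-keyed cache. `compute_kv` stands in for the model's prefill step."""

    def __init__(self, compute_kv: Callable[[Tuple[int, ...]], object]) -> None:
        self._compute_kv = compute_kv
        self._store: Dict[str, object] = {}
        self.hits = 0
        self.misses = 0

    def get_kv(self, token_ids: Sequence[int]) -> List[object]:
        """Return one KV entry per chunk, reusing stored entries where possible."""
        out: List[object] = []
        full = len(token_ids) - len(token_ids) % CHUNK_SIZE
        for start in range(0, full, CHUNK_SIZE):
            chunk = tuple(token_ids[start:start + CHUNK_SIZE])
            key = chunk_key(chunk)
            if key in self._store:
                self.hits += 1
            else:
                self.misses += 1
                self._store[key] = self._compute_kv(chunk)
            out.append(self._store[key])
        # A trailing partial chunk is computed every time and never cached.
        tail = tuple(token_ids[full:])
        if tail:
            out.append(self._compute_kv(tail))
        return out


# Usage: the second prompt contains the same chunks in a different order.
cache = ChunkKVCache(compute_kv=lambda chunk: ("kv", chunk))
first = cache.get_kv([1, 2, 3, 4, 5, 6, 7, 8, 9])
second = cache.get_kv([5, 6, 7, 8, 1, 2, 3, 4])
```

This sketch ignores what makes the technique hard in a real model: cached keys and values depend on token positions and on the preceding context, so a real system has to correct or partially recompute reused chunks. How KVBoost handles that is not stated in the text above.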