Efficient LLMs at Scale: My NeurIPS Week in KV Caches, Spec Decoding, and FP4 · Various · The Kaitchup · Dec 14, 2025 · 24 min read
Choosing a GGUF Model: K-Quants, I-Quants, and Legacy Formats · Various · The Kaitchup · Oct 13, 2025 · 10 min read