Qwen3.5 quantization: Similar accuracy, more thinking — best models and recipes Various · The Kaitchup · Mar 12, 2026 · 13 min read · AI & Tech
Efficient LLMs at scale: My NeurIPS week in KV caches, spec decoding, and FP4 Various · The Kaitchup · Dec 14, 2025 · 24 min read · AI & Tech
Choosing a GGUF model: K-quants, i-quants, and legacy formats Various · The Kaitchup · Oct 13, 2025 · 10 min read · AI & Tech