The Kaitchup

By Various

Qwen3.5 quantization: Similar accuracy, more thinking — best models and recipes

Various · The Kaitchup · Mar 12, 2026 · 13 min read · AI & Tech

Efficient LLMs at scale: My NeurIPS week in KV caches, spec decoding, and FP4

Various · The Kaitchup · Dec 14, 2025 · 24 min read · AI & Tech

Choosing a GGUF model: K-quants, i-quants, and legacy formats

Various · The Kaitchup · Oct 13, 2025 · 10 min read · AI & Tech
