2-bit PolarQuant (4-centroid Lloyd-Max) provides maximum compression for VRAM-constrained scenarios
turbo2 = 2.5 bpv = 6.4x compression vs fp16. Uniform PPL +8% is too high for general use but acceptable for extreme memory pressure. Best mixed config: turbo2-K + turbo3-V at +3.9% (2.9 bpv average). Speed identical to turbo3 (both compute-bound). KV memory 20 MiB vs 28 MiB (turbo3) vs 128 MiB (f16) at 4K context. Validates "Physics of KV Compression" paper's 90% compression cliff warning. turbo2 implemented across 22 files. Useful as target for temporal decay (#36) where old tokens with negligible attention weight can tolerate higher error.