Higher-quality CUDA KV cache quantization reduces context rot, improving convergence at 10k effective context
The CUDA quality fix halved runtime (247s vs 492s) and rounds (12 vs 20), though the run is still slower than the 8k baseline. The model found the bug at round 3 but didn't edit until round 18; in between it re-read the same file regions three times with different line limits. Needs a prompt-side fix to suppress redundant re-reads; a sketch of one possible approach follows.
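A minimal sketch of how such a re-read suppressor could sit in front of the agent's file-read tool: track the line ranges already served per file, and when a new request mostly overlaps them, return a short notice instead of the content. The names (`ReadTracker`, `handle_read`), the 0.8 overlap threshold, and the notice wording are assumptions for illustration, not anything from this log.

```python
# Hypothetical re-read suppression wrapper for an agent's file-read tool.
# Assumptions: ReadTracker/handle_read names, 0.8 threshold, notice text.
from dataclasses import dataclass, field

@dataclass
class ReadTracker:
    # path -> list of (start_line, end_line) ranges already served this session
    seen: dict = field(default_factory=dict)

    def overlap(self, path: str, start: int, end: int) -> float:
        """Fraction of the requested range covered by earlier reads (capped at 1.0)."""
        covered = 0
        for s, e in self.seen.get(path, []):
            lo, hi = max(start, s), min(end, e)
            covered += max(0, hi - lo + 1)
        return min(1.0, covered / max(1, end - start + 1))

    def record(self, path: str, start: int, end: int) -> None:
        self.seen.setdefault(path, []).append((start, end))

def handle_read(tracker: ReadTracker, path: str, start: int, end: int,
                read_file, threshold: float = 0.8) -> str:
    """Serve a read, replacing near-duplicate requests with a short nudge."""
    if tracker.overlap(path, start, end) >= threshold:
        # Don't resend content the model has already seen; push it toward acting.
        return (f"[note] {path}:{start}-{end} largely overlaps earlier reads; "
                "content omitted. Make an edit or read a new region.")
    tracker.record(path, start, end)
    return read_file(path, start, end)
```

Returning a visible notice rather than silently caching keeps the transcript honest while cutting token burn; the threshold would need tuning against the 3x re-read pattern observed here, where each request used a different limit.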