Gemma 4 K=V quantization strategy

Status: proposed | Priority: high | ID: TODO-014
Description

Gemma 4's shared K=V projections make V quantization catastrophic (+70% perplexity, PPL). This needs either K-only quantization or a specialized correction.
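As a rough illustration of the K-only option, the sketch below quantizes K rows with simple symmetric per-row rounding and stores V rows untouched. Everything in it is an assumption for illustration: the function names (`quantize_row`, `store_kv`, `load_kv`), the 4-bit default, and the per-row scaling scheme are hypothetical and are not taken from Gemma 4 or from any existing backend. It does not model the K=V sharing itself and makes no attempt to reproduce the +70% PPL figure.

```python
from typing import Dict, List, Tuple


def quantize_row(row: List[float], bits: int = 4) -> Tuple[List[int], float]:
    """Symmetric per-row quantization: return integer codes and the scale."""
    qmax = (1 << (bits - 1)) - 1              # e.g. 7 for 4-bit signed codes
    amax = max((abs(x) for x in row), default=0.0)
    scale = amax / qmax if amax > 0 else 1.0  # avoid dividing by zero
    codes = [max(-qmax, min(qmax, round(x / scale))) for x in row]
    return codes, scale


def dequantize_row(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


def store_kv(k_row: List[float], v_row: List[float], k_bits: int = 4) -> Dict:
    """Hypothetical K-only policy: quantize K, keep V at full precision."""
    k_codes, k_scale = quantize_row(k_row, k_bits)
    return {"k_codes": k_codes, "k_scale": k_scale, "v": list(v_row)}


def load_kv(entry: Dict) -> Tuple[List[float], List[float]]:
    """Return (approximate K, exact V) from a stored cache entry."""
    return dequantize_row(entry["k_codes"], entry["k_scale"]), entry["v"]


if __name__ == "__main__":
    entry = store_kv([1.0, -0.5, 0.25, 0.0], [0.3, 0.7, -0.2, 0.1])
    k_approx, v_exact = load_kv(entry)
    print("K (dequantized):", k_approx)
    print("V (untouched):  ", v_exact)
```

Under this policy only K carries rounding error (at most half a quantization step per element), while V round-trips exactly. The "specialized correction" alternative would be a separate design and is not sketched here.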

Provenance
Proposed by @buun via cuda-rtx3090 claude-opus-4-6