Protecting k_proj at Q8_0 (instead of fp16) cuts the bpe overhead by 3× while preserving the PPL recovery: the GPTQ Hessian pass sees the actual Q8_0 values that will run at inference, so the system is internally self-consistent.
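A minimal sketch of the self-consistency point, assuming the llama.cpp Q8_0 convention (blocks of 32 int8 values sharing one scale, scale = max|x|/127). The idea is to round-trip the protected tensor through Q8_0 first and feed the *dequantized* values to calibration, so the Hessian statistics match inference:

```python
import numpy as np

def q8_0_roundtrip(x: np.ndarray, block: int = 32) -> np.ndarray:
    """Quantize to Q8_0-style (per-block int8 + one fp scale) and dequantize.

    Assumed layout (llama.cpp convention): blocks of 32 values,
    scale d = max|x| / 127, values rounded and clipped to int8.
    """
    xb = x.reshape(-1, block)
    d = np.abs(xb).max(axis=1, keepdims=True) / 127.0
    d[d == 0] = 1.0                          # all-zero block: avoid div-by-zero
    q = np.clip(np.round(xb / d), -127, 127).astype(np.int8)
    return (q * d).reshape(x.shape)

# Self-consistency: run calibration (e.g. GPTQ Hessian accumulation) on
# w_q8, not w_fp16, so downstream layers see the values that actually run.
rng = np.random.default_rng(0)
w_fp16 = rng.standard_normal(1024).astype(np.float32)
w_q8 = q8_0_roundtrip(w_fp16)
```

The per-block error is bounded by half the block scale, which is why Q8_0 protection recovers nearly as much as fp16 at a third of the bit cost.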
First strict Pareto win on BOTH axes vs Q4_K_M: 0.51 fewer bits AND 0.25 lower PPL. Protection ROI is 1.236 PPL/bpe at 4-bit and 6.85 at 3-bit (key errors are exponentiated by the attention softmax, so the noisier the rest of the model, the more headroom protection buys).
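The ROI figure above is just PPL recovered per extra bit-per-element spent on protection. A hypothetical helper, with illustrative numbers (not the measurements above), shows the arithmetic:

```python
def protection_roi(ppl_base: float, ppl_protected: float,
                   bpe_base: float, bpe_protected: float) -> float:
    """PPL recovered per bit-per-element spent protecting a tensor."""
    return (ppl_base - ppl_protected) / (bpe_protected - bpe_base)

# Illustrative only: a noisy 3-bit base recovers far more PPL per
# protection bit than a 4-bit base, since key errors compound through
# the attention softmax.
roi = protection_roi(ppl_base=8.00, ppl_protected=7.45,
                     bpe_base=3.50, bpe_protected=3.58)
```

At higher base precision the same bpe spend recovers less PPL, which is exactly the 1.236 vs 6.85 gap in the note.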