Per-group L2 norm + sign sandwich + FWHT + Lloyd-Max scalar centroids + norm correction (the TurboQuant recipe) ports cleanly from KV cache to weights
NEW 4-BIT WINNER. Picks up -0.428 PPL on top of scalar SmoothQuant (~2x the SmoothQuant gain alone). The compositional finding — SmoothQuant flattens H, FWHT Gaussianizes, E8 then exploits the now-white distribution — is the headline. KLD pending.
3-bit win is huge (-5.42 PPL ~ 25-sigma). 4-bit was reported flat in the original run — that turned out to be wrong, see EXP-0014. Plain E8 (no GPTQ) is WORSE than plain scalar — E8 only buys you anything paired with GPTQ's Hessian propagation.
First clean Pareto win for turbo on weights — beats Q6_K (6.56 bpe) at fewer bits and beats Q5_K_M (+2.96%) on quality. PolarQuant ≡ this recipe.