q_norm/k_norm sensitivity probe — q8 free, q4 too expensive

inconclusive
0.68
1/5
Consensus Metrics
fp16_ppl 19.21 (n=1, σ=0)
q8_ppl 19.15 (n=1, σ=0)
q4_neuqi_ppl 20.59 (n=1, σ=0)
Parameters
quant gptq_turbo_q4
group_size 256
protect_role k_proj
norm_quant_grid [fp16
norm_group_size 16
eval_seq_len 2048
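
To make the norm_group_size and bit-width parameters concrete, here is a minimal sketch of group-wise quantization applied to a norm weight vector. It assumes plain symmetric round-to-nearest with one scale per group of 16. The record does not describe the quantizer actually used, so this is an illustration and not the experiment's method. The function names, the 128-element vector, and its random values are hypothetical.

```python
import math
import random


def quantize_groupwise(weights, bits, group_size=16):
    """Symmetric round-to-nearest fake quantization with one scale per group.

    Assumption: this is a generic scheme, not the quantizer used in the run.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    out = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / qmax if max_abs > 0 else 1.0
        for w in group:
            q = max(-qmax, min(qmax, round(w / scale)))
            out.append(q * scale)  # dequantize back to float
    return out


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


random.seed(0)
# Hypothetical stand-in for a q_norm/k_norm weight vector (values near 1.0).
norm_weight = [1.0 + random.gauss(0.0, 0.2) for _ in range(128)]

err_q8 = rms_error(norm_weight, quantize_groupwise(norm_weight, bits=8))
err_q4 = rms_error(norm_weight, quantize_groupwise(norm_weight, bits=4))
print(f"q8 RMS error: {err_q8:.5f}")
print(f"q4 RMS error: {err_q4:.5f}")
```

With only 7 positive levels per group at 4 bits versus 127 at 8 bits, the 4-bit reconstruction error is much larger, which is the kind of gap the probe is measuring at the perplexity level.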
Hypothesis

The q_norm/k_norm RMSNorm tensors are tiny but sit directly in the attention path, so their quantization sensitivity should be disproportionate to their parameter count.
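
As a rough sense of scale for "tiny", the sketch below counts q_norm/k_norm parameters as a share of the whole model. The architecture numbers (28 layers, one q_norm and one k_norm vector of length 128 per layer, about 0.6 billion total parameters) are assumptions about qwen3-0.6b and are not taken from this record; check them against the model config before relying on the result.

```python
def norm_param_count(num_layers, head_dim, norms_per_layer=2):
    """Parameters held in per-layer q_norm and k_norm weight vectors."""
    return num_layers * norms_per_layer * head_dim


# Assumed values, not from this record: verify against the model config.
NUM_LAYERS = 28
HEAD_DIM = 128
TOTAL_PARAMS = 0.6e9

count = norm_param_count(NUM_LAYERS, HEAD_DIM)
fraction = count / TOTAL_PARAMS
print(f"q_norm/k_norm parameters: {count}")
print(f"share of model: {fraction:.2e}")
```

Under those assumptions the norm tensors hold a few thousand parameters, so any measurable perplexity change from quantizing them would be far out of proportion to their size.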

Tags
Subject
Model: qwen3-0.6b
Dataset: wikitext-2
Baseline Comparison
q8_delta_pct -0.30%
q4_delta_pct +7.20%
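
The delta figures appear to be relative perplexity changes against the fp16 baseline. A minimal sketch of that calculation, assuming the usual (variant - baseline) / baseline definition; the helper name relative_delta_pct is hypothetical and the inputs are the rounded values listed under Consensus Metrics:

```python
def relative_delta_pct(variant_ppl, baseline_ppl):
    """Relative change in perplexity versus the baseline, in percent."""
    return (variant_ppl - baseline_ppl) / baseline_ppl * 100.0


# Rounded perplexities as displayed in this record.
fp16_ppl = 19.21
q8_ppl = 19.15
q4_ppl = 20.59

q8_delta = relative_delta_pct(q8_ppl, fp16_ppl)
q4_delta = relative_delta_pct(q4_ppl, fp16_ppl)
print(f"q8_delta_pct {q8_delta:+.2f}%")
print(f"q4_delta_pct {q4_delta:+.2f}%")
```

From the rounded inputs this gives roughly -0.31% and +7.18%, close to the recorded values; the small gap presumably comes from the displayed perplexities being rounded. Note that every metric here is n=1, so the q8 difference in particular cannot be distinguished from run-to-run noise.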
Dependencies
Instances (0 reproductions)
No instances recorded.