Top Experiments
| Title | Result | Confidence | Repro | Metrics |
|---|---|---|---|---|
| q_norm/k_norm sensitivity probe — q8 free, q4 too expensive<br>q_norm/k_norm RMSNorm tensors are tiny but sit in the attention path — sensitivity should be asymmetric to parameter count | inconclusive | 1/5 | | |
| down_proj stacked protection sweep<br>down_proj is the most quant-sensitive role (38% of total error budget per EXP-0007); stacking it with k_proj protection should give strictly more recovery than k_proj alone | failure | 1/5 | | |
| gptq_calib + seq_len sweep — eval_seq_len decoupling<br>A larger calibration sample count and sequence length should give a better Hessian estimate and improve quantized PPL | inconclusive | 1/5 | | |
| Tensor-role sensitivity vs. context length<br>Softmax amplifies K-side errors more than V-side errors; the gap should grow with context length | success | 1/5 | | |
| turbo recipe (FWHT + Lloyd-Max + sign sandwich)<br>Per-group L2 norm + sign sandwich + FWHT + Lloyd-Max scalar centroids + norm correction (the TurboQuant recipe) ports cleanly from KV cache to weights | success | 1/5 | | |
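The turbo-recipe row combines an orthonormal Hadamard rotation (FWHT), per-group L2 normalization, Lloyd-Max scalar centroids, and a norm-correction step. A minimal NumPy sketch of that pipeline follows; the sign-sandwich step is omitted, and all function names, level counts, and iteration budgets here are illustrative assumptions, not the experiment's actual code:

```python
import numpy as np

def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform (its own inverse)."""
    x = x.copy()
    n = x.shape[-1]  # assumed power of two
    h = 1
    while h < n:
        for i in range(0, n, h * 2):
            a = x[i:i + h].copy()
            b = x[i + h:i + 2 * h].copy()
            x[i:i + h] = a + b
            x[i + h:i + 2 * h] = a - b
        h *= 2
    return x / np.sqrt(n)

def lloyd_max_1d(values, n_levels=16, iters=25):
    """1-D Lloyd-Max: alternate nearest-centroid assignment and centroid update."""
    centroids = np.quantile(values, np.linspace(0.0, 1.0, n_levels))
    for _ in range(iters):
        idx = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        for k in range(n_levels):
            mask = idx == k
            if mask.any():
                centroids[k] = values[mask].mean()
    return centroids

def quant_dequant_group(w, n_levels=16):
    # 1) rotate with FWHT to spread outliers across the group
    z = fwht(w)
    # 2) record the per-group L2 norm and normalize
    norm = np.linalg.norm(z)
    u = z / norm
    # 3) Lloyd-Max scalar quantization of the unit-norm coordinates
    centroids = lloyd_max_1d(u, n_levels)
    q = centroids[np.argmin(np.abs(u[:, None] - centroids[None, :]), axis=1)]
    # 4) norm correction: rescale so the dequantized group keeps the original norm
    q *= norm / max(np.linalg.norm(q), 1e-12)
    # 5) inverse rotation (the orthonormal FWHT is involutory)
    return fwht(q)
```

In a real weight-quantization pass the centroid indices and per-group norms would be stored instead of dequantizing immediately; the sketch round-trips in one call only to make the reconstruction error easy to inspect.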