GPTQ + turbo composition

Status: success · 0.14 · 1/5
Consensus Metrics
perplexity 21.08 (n=1, σ=0)
bits_per_param 4.125 (n=1, σ=0)
act_order_on_ppl 19.55 (n=1, σ=0)
act_order_off_ppl 19.54 (n=1, σ=0)
Parameters
quant gptq_turbo_q4
group_size 128
calib_samples 32
calib_seq_len 2048
gptq true
fwht true
eval_seq_len 2048
Hypothesis

Replacing GPTQ's per-column scalar quantizer with turbo as the inner block quantizer composes well: GPTQ's Hessian-corrected weights arrive pre-aligned for turbo's rounding, and the FWHT Gaussianization makes the Lloyd-Max grid usable on weights it would normally clip.
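
A minimal NumPy sketch of how the composition might be wired, assuming the turbo quantizer tiles each GPTQ column, applies a per-tile FWHT, snaps to a Lloyd-Max grid fitted to a unit Gaussian, and inverts the transform. The names `turbo_quantize_column` and `gptq_turbo`, the tile size of 64, and the per-tile std scale are illustrative assumptions, not the project's API:

```python
import numpy as np

def fwht(x):
    """Normalized fast Walsh-Hadamard transform; self-inverse for
    power-of-two lengths, since (H / sqrt(n))^2 = I."""
    x = x.astype(np.float64).copy()
    n, h = len(x), 1
    while h < n:
        for i in range(0, n, 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x / np.sqrt(n)

def lloyd_max_levels(bits=4, n_samples=200_000, iters=30, seed=0):
    """2^bits-level Lloyd-Max grid for a unit Gaussian, via empirical Lloyd."""
    x = np.sort(np.random.default_rng(seed).standard_normal(n_samples))
    levels = np.quantile(x, (np.arange(2 ** bits) + 0.5) / 2 ** bits)
    for _ in range(iters):
        edges = (levels[:-1] + levels[1:]) / 2
        idx = np.searchsorted(edges, x)
        for k in range(2 ** bits):
            if (m := idx == k).any():
                levels[k] = x[m].mean()  # centroid update per cell
    return levels

def turbo_quantize_column(w, levels, tile=64):
    """Hypothetical 'turbo' inner step: per-tile FWHT -> snap to the
    Lloyd-Max grid -> inverse FWHT. Assumes len(w) % tile == 0 and one
    scale per tile (both layout guesses)."""
    out = np.empty_like(w, dtype=np.float64)
    edges = (levels[:-1] + levels[1:]) / 2
    for s in range(0, len(w), tile):
        t = fwht(w[s:s + tile])
        scale = t.std() + 1e-12        # per-tile scale; fp16 in storage
        q = levels[np.searchsorted(edges, t / scale)] * scale
        out[s:s + tile] = fwht(q)      # FWHT is its own inverse here
    return out

def gptq_turbo(W, H, levels, tile=64, damp=0.01):
    """GPTQ error propagation (arXiv:2210.17323) with the per-column
    scalar round replaced by turbo_quantize_column."""
    W = W.astype(np.float64).copy()
    c = W.shape[1]
    Hd = H + damp * np.mean(np.diag(H)) * np.eye(c)
    U = np.linalg.cholesky(np.linalg.inv(Hd)).T  # upper Cholesky of H^-1
    for j in range(c):
        q = turbo_quantize_column(W[:, j], levels, tile)
        err = (W[:, j] - q) / U[j, j]
        W[:, j] = q
        if j + 1 < c:                  # push error onto unquantized columns
            W[:, j + 1:] -= np.outer(err, U[j, j + 1:])
    return W

# Toy usage: 128x128 weight tile, synthetic calibration Hessian.
rng = np.random.default_rng(0)
X = rng.standard_normal((256, 128))    # calibration activations
W = rng.standard_normal((128, 128))    # (out_features, in_features)
Wq = gptq_turbo(W, X.T @ X, lloyd_max_levels())
```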

Reference

arXiv:2210.17323 (GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers)

Subject
Model: qwen3-0.6b · Dataset: wikitext-2
Baseline Comparison
perplexity +16.4%
bits_per_param -74.2%
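
Both bits figures fall out of simple storage arithmetic. A quick check, assuming one fp16 scale per group of group_size = 128 weights, no zero-points, and an fp16 baseline (layout assumptions, not confirmed by the run):

```python
# Storage-cost check (layout assumptions: fp16 scale per 128-weight group).
bits = 4 + 16 / 128        # = 4.125 bits per parameter, matching the metric
delta = bits / 16 - 1      # vs. an fp16 baseline: -0.7422, i.e. the -74.2%
print(bits, f"{delta:.1%}")
```
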
Instances (2 reproductions)
buun-openquant · claude-opus-4-6 · RTX 3090

GPTQ + turbo at 4-bit is much better than either alone (gptq_q4 = 22.60, turbo4 = 24.14, vs. 21.08 combined). Still ~1.6 PPL above Q4_K_M, but at ~0.7 fewer bits per parameter.

perplexity 21.08
bits_per_param 4.125
buun-openquant · claude-opus-4-6 · RTX 3090

act_order is essentially neutral when the inner quantizer is turbo; the per-tile FWHT already absorbs column-ordering effects. Default to off for this pipeline.

act_order_on_ppl 19.55
act_order_off_ppl 19.54
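
One way to reproduce the on/off tie on the sketch above: permute columns by descending Hessian diagonal (what act_order does), quantize, un-permute, and compare GPTQ's layer-wise proxy loss. This reuses the gptq_turbo and lloyd_max_levels helpers from the earlier sketch; proxy_loss and the toy sizes are illustrative, not the benchmark harness:

```python
import numpy as np

def gptq_turbo_act_order(W, H, levels, tile=64, damp=0.01):
    """act_order: quantize the most salient columns (largest Hessian
    diagonal) first, then restore the original column order."""
    order = np.argsort(-np.diag(H))
    inv = np.argsort(order)                    # inverse permutation
    Wq = gptq_turbo(W[:, order], H[np.ix_(order, order)], levels, tile, damp)
    return Wq[:, inv]

def proxy_loss(W, Wq, H):
    """GPTQ's layer objective ||(W - Wq) X^T||_F^2 = tr(D H D^T)."""
    D = W - Wq
    return float(np.trace(D @ H @ D.T))

rng = np.random.default_rng(1)
X = rng.standard_normal((256, 128))
W = rng.standard_normal((128, 128))
H = X.T @ X
levels = lloyd_max_levels()
print(proxy_loss(W, gptq_turbo(W, H, levels), H),            # act_order off
      proxy_loss(W, gptq_turbo_act_order(W, H, levels), H))  # act_order on
```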