TCQ trellis-coded quantization for weights

proposed medium priority TODO-011

Description

TCQ (trellis-coded quantization, à la QTIP) gives a much denser effective codebook than scalar Lloyd-Max at the same nominal bitrate: the reconstruction value for each weight depends on the trellis state rather than only on that weight's own bits, and the encoder picks the lowest-distortion path with a Viterbi search. Already validated for KV cache (separate fork). Should compose with gptq_turbo + SmoothQuant the same way E8 does, with a larger expected gain, since TCQ's density gain exceeds E8's.
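A minimal sketch of the idea, not the QTIP or gptq_turbo implementation: a toy bit-shift trellis in pure Python. The state is a sliding window of the last `state_bits` emitted bits, each weight adds `k` new bits, and the reconstruction value is looked up by the full state. The function names, the random Gaussian codebook, and the free initial state are all assumptions made for illustration.

```python
import random


def make_codebook(state_bits, seed=0):
    """Hypothetical codebook: one Gaussian value per trellis state."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(1 << state_bits)]


def tcq_encode(xs, codebook, state_bits, k):
    """Viterbi search for the trellis path with minimum squared error.

    Transition: next_state = ((prev_state << k) | new_bits) & mask.
    The initial state is unconstrained (a simplification).
    Returns (state sequence, total squared error).
    """
    assert 0 < k <= state_bits and len(xs) > 0
    n_states = 1 << state_bits
    mask = n_states - 1

    # Cost of starting in each state and reconstructing the first sample.
    cost = [(xs[0] - codebook[s]) ** 2 for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1

    for x in xs[1:]:
        new_cost = [float("inf")] * n_states
        ptr = [0] * n_states
        for p in range(n_states):
            for b in range(1 << k):
                s = ((p << k) | b) & mask
                c = cost[p] + (x - codebook[s]) ** 2
                if c < new_cost[s]:
                    new_cost[s] = c
                    ptr[s] = p
        cost = new_cost
        back.append(ptr)

    # Backtrack from the cheapest final state.
    s = min(range(n_states), key=cost.__getitem__)
    total = cost[s]
    states = [s]
    for ptr in reversed(back):
        s = ptr[s]
        states.append(s)
    states.reverse()
    return states, total


def tcq_decode(states, codebook):
    """Reconstruct the values from the state sequence."""
    return [codebook[s] for s in states]


if __name__ == "__main__":
    cb = make_codebook(state_bits=4, seed=0)
    weights = [0.3, -1.2, 0.8, 0.1, -0.4, 1.5]
    path, err = tcq_encode(weights, cb, state_bits=4, k=2)
    print(path, err, tcq_decode(path, cb))
```

Only `k` new bits per weight need to be stored after the initial state, yet each weight is reconstructed from a table with `2**state_bits` entries. That gap is the "denser effective codebook" referred to above. A production encoder would work per group, use a trained or computed codebook, and handle the initial-state overhead explicitly.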

Reference

arXiv:2406.11235 (QTIP), EXP-0014

Suggested Parameters

quant gptq_turbo_tcq_q3

group_size 256

trellis_K 256

smooth_alpha 0.25

eval_seq_len 2048

Provenance

Proposed by @buun via buun-openquant claude-opus-4-6