TCQ: trellis-coded quantization for weights

proposed medium priority TODO-011
Description

TCQ (trellis-coded quantization, à la QTIP) gives a much denser effective codebook than scalar Lloyd-Max at the same nominal bitrate: each value's reconstruction level depends on the trellis state as well as on the bits spent on that value, and a Viterbi search picks the lowest-distortion path through the trellis. Already validated for KV cache (separate fork). Should compose with gptq_turbo + SmoothQuant the same way E8 does, with a larger expected gain because TCQ's density gain exceeds E8's.
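To make the "denser effective codebook" idea concrete, here is a minimal sketch of trellis quantization with a Viterbi search. It assumes a QTIP-style bitshift trellis and a random Gaussian codebook, neither of which is confirmed as what gptq_turbo_tcq_q3 would use. The names `tcq_viterbi`, `L_BITS` and `K_BITS` are hypothetical and are not tied to the `trellis_K` parameter of this proposal.

```python
import numpy as np

def tcq_viterbi(x, codebook, k):
    """Quantize a 1-D sequence on a bitshift trellis (illustrative sketch).

    Each step spends k bits, but the reconstruction value is looked up from
    all L state bits, so the effective codebook has 2**L entries.
    Returns (state path, reconstructed sequence).
    """
    n_states = codebook.shape[0]
    L = n_states.bit_length() - 1
    assert n_states == 1 << L and 0 < k <= L, "codebook size must be 2**L"
    T = x.shape[0]
    states = np.arange(n_states)
    # Transition rule: new = ((prev << k) | bits) & (2**L - 1), so the
    # predecessors of `new` share its top L-k bits and differ in their top k bits.
    preds = (states[:, None] >> k) | (np.arange(1 << k)[None, :] << (L - k))

    # Assumption: the initial state is unconstrained (no tail-biting),
    # so the first sample effectively costs L bits instead of k.
    cost = (x[0] - codebook) ** 2
    back = np.zeros((T, n_states), dtype=np.int64)
    for t in range(1, T):
        cand = cost[preds]                  # (n_states, 2**k) path costs so far
        best = cand.argmin(axis=1)          # cheapest predecessor per state
        back[t] = preds[states, best]
        cost = cand[states, best] + (x[t] - codebook) ** 2

    # Backtrack from the cheapest final state.
    path = np.empty(T, dtype=np.int64)
    path[-1] = cost.argmin()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path, codebook[path]

# Hypothetical demo values: 256 trellis states, 2 bits per weight.
L_BITS, K_BITS = 8, 2
rng = np.random.default_rng(0)
codebook = rng.standard_normal(1 << L_BITS)
x = rng.standard_normal(256)
path, recon = tcq_viterbi(x, codebook, K_BITS)
print("TCQ MSE:", float(np.mean((x - recon) ** 2)))
```

Only the `k` new bits per step need to be stored, because the remaining state bits are carried over from earlier steps. A real weight quantizer would also have to fit inside the GPTQ error-feedback loop, which this sketch leaves out.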

Reference

arXiv:2406.11235 (QTIP), EXP-0014

Suggested Parameters
quant gptq_turbo_tcq_q3
group_size 256
trellis_K 256
smooth_alpha 0.25
eval_seq_len 2048
Provenance
Proposed by @buun via buun-openquant claude-opus-4-6