Tensor-role sensitivity vs context length

success

0.68

1/5

Consensus Metrics

k_proj_roi_2k 0.259 (n=1, σ=0)

k_proj_roi_16k 0.468 (n=1, σ=0)

kv_ratio_2k 1.85 (n=1, σ=0)

kv_ratio_16k 2.49 (n=1, σ=0)
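
As a quick reading aid, the relative change of each reported metric between the 2k and 16k settings can be computed directly from the four values above. This is a minimal sketch: the dictionary layout and the helper name `relative_change` are illustrative, not part of the experiment code, and since every metric comes from a single run (n=1) the differences carry no variance estimate.

```python
# Values copied from the Consensus Metrics above: (2k setting, 16k setting).
metrics = {
    "k_proj_roi": (0.259, 0.468),
    "kv_ratio": (1.85, 2.49),
}

def relative_change(short_ctx, long_ctx):
    """Fractional change going from the short-context to the long-context value."""
    return (long_ctx - short_ctx) / short_ctx

for name, (at_2k, at_16k) in metrics.items():
    change = relative_change(at_2k, at_16k)
    print(f"{name}: {at_2k} -> {at_16k} ({change:+.1%})")
```

Both metrics are larger at 16k than at 2k, which is the direction the hypothesis predicts.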

Parameters

quant gptq_turbo_q4

group_size 256

protect_role each-of-7

protect_method fp16

eval_seq_len_grid [2048

Hypothesis

Softmax amplifies K-side errors more than V-side errors, so the gap between K-side and V-side sensitivity should grow with context length.
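
The mechanism in the hypothesis can be probed on a toy problem. The sketch below injects Gaussian noise into K or into V for a single-query attention head and compares the resulting output errors. It is an assumption-laden stand-in, not the experiment's method: the inputs are i.i.d. Gaussian rather than real activations, the noise is additive rather than GPTQ quantization error, and `attention` and `kv_error_ratio` are hypothetical helper names. Whether the ratio grows with length depends on how peaked the attention distribution is, so this toy need not reproduce the reported kv_ratio values.

```python
import numpy as np

def attention(q, K, V):
    """Single-query scaled dot-product attention."""
    logits = K @ q / np.sqrt(q.shape[0])
    logits = logits - logits.max()      # stabilise the softmax
    p = np.exp(logits)
    p /= p.sum()
    return p @ V

def kv_error_ratio(seq_len, d=64, noise=0.05, trials=20, seed=0):
    """RMS output error from noisy K divided by RMS output error from noisy V."""
    rng = np.random.default_rng(seed)
    k_err = 0.0
    v_err = 0.0
    for _ in range(trials):
        q = rng.standard_normal(d)
        K = rng.standard_normal((seq_len, d))
        V = rng.standard_normal((seq_len, d))
        ref = attention(q, K, V)
        # K-side noise perturbs the logits, so it passes through the softmax.
        out_k = attention(q, K + noise * rng.standard_normal(K.shape), V)
        # V-side noise is averaged linearly by the attention weights.
        out_v = attention(q, K, V + noise * rng.standard_normal(V.shape))
        k_err += np.sum((out_k - ref) ** 2)
        v_err += np.sum((out_v - ref) ** 2)
    return float(np.sqrt(k_err / v_err))
```

Calling `kv_error_ratio(2048)` and `kv_error_ratio(16384)` gives a toy counterpart to the two kv_ratio settings; replacing the Gaussian inputs with captured activations would be the natural next step.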