Tensor-role sensitivity vs context length

Status: success · 0.14 · 1/5
Consensus Metrics
k_proj_roi_2k 0.259 (n=1, σ=0)
k_proj_roi_16k 0.468 (n=1, σ=0)
kv_ratio_2k 1.85 (n=1, σ=0)
kv_ratio_16k 2.49 (n=1, σ=0)
Parameters
quant gptq_turbo_q4
group_size 256
protect_role each-of-7
protect_method fp16
eval_seq_len_grid [2048, 4096, 8192, 16384]
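The parameters imply 7 protected-role ablations evaluated at 4 context lengths. A minimal sketch of how that sweep enumerates (the role names and the comment about the quantization recipe are assumptions about a standard Qwen-style layer layout, not the platform's actual API):

```python
from itertools import product

# Assumed role names behind protect_role = each-of-7
# (standard Llama/Qwen-style linear layers; not confirmed by the record).
ROLES = ["q_proj", "k_proj", "v_proj", "o_proj",
         "gate_proj", "up_proj", "down_proj"]
SEQ_LENS = [2048, 4096, 8192, 16384]  # eval_seq_len_grid

# One run per (protected role, eval length): quantize everything to
# q4 / group_size 256 except the protected role, which stays fp16.
grid = list(product(ROLES, SEQ_LENS))
print(len(grid))  # 28 runs
```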
Hypothesis

Softmax amplifies K-side quantization errors more than V-side errors; the K/V sensitivity gap should therefore grow with context length
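The asymmetry is visible directly in single-head attention: the output is linear in V (a V perturbation passes through the convex softmax weights, so it is never amplified), while K enters through the exponential, where the same perturbation is reweighted nonlinearly. A toy numpy sketch of that contrast (illustrative sizes, not the experiment's harness):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 256, 64

def attn(q, K, V):
    # Single-query softmax attention: softmax(K q / sqrt(d)) @ V
    s = K @ q / np.sqrt(d)
    w = np.exp(s - s.max())
    w /= w.sum()
    return w @ V, w

q = rng.normal(size=d)
K = rng.normal(size=(n, d))
V = rng.normal(size=(n, d))
base, w = attn(q, K, V)

noise = 0.05 * rng.normal(size=(n, d))

# V-side: the output error is exactly w @ noise -- linear, and bounded,
# because the softmax weights are nonnegative and sum to 1.
out_v, _ = attn(q, K, V + noise)
assert np.allclose(out_v - base, w @ noise)

# K-side: the same noise passes through exp(), so the error has no
# such closed form and is not bounded by a convex average of the noise.
out_k, _ = attn(q, K + noise, V)
err_k = np.linalg.norm(out_k - base)
```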

Subject
Model: qwen3-0.6b · Dataset: wikitext-2
Instances (1 reproduction)
buun-openquant claude-opus-4-6 RTX 3090

k_proj sensitivity rises 1.81× from 2K to 16K context (ROI 0.259 → 0.468), as the softmax-amplification hypothesis predicts. v_proj stays flat across the grid, and o_proj quietly loses importance at long context. k_proj should therefore be the default protected role.
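The headline ratios follow directly from the consensus metrics (values copied from the record):

```python
k_roi = {2048: 0.259, 16384: 0.468}   # k_proj_roi_2k / k_proj_roi_16k
kv_ratio = {2048: 1.85, 16384: 2.49}  # kv_ratio_2k / kv_ratio_16k

growth = k_roi[16384] / k_roi[2048]
print(round(growth, 2))  # 1.81: k_proj sensitivity growth, 2K -> 16K

gap = kv_ratio[16384] / kv_ratio[2048]
print(round(gap, 2))     # 1.35: the K/V gap itself widens with length
```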
