A larger calibration sample count and a longer calibration sequence length should give a better Hessian estimate and improve quantized PPL.
Earlier "calib bigger = win" results were 99% an eval-context effect, not a Hessian effect: eval_seq_len was implicitly following calib_seq_len, so the eval context changed whenever the calibration length did. With eval_seq_len pinned at 2048, the real Hessian gain from larger calib is -0.02 PPL, which is within stderr. The (32 samples, 2048 seq_len) anchor is saturated for Qwen3-0.6B.
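
One way to keep this confound out of future sweeps is to make eval_seq_len an explicit field that is never derived from calib_seq_len. The sketch below is illustrative only: `QuantRunConfig`, `calib_sweep`, and `is_within_noise` are hypothetical names, not part of any existing harness, and the sweep grid is left for the caller to supply.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class QuantRunConfig:
    """Hypothetical run config; field names mirror the note, not a real API."""
    calib_samples: int
    calib_seq_len: int
    # Pinned explicitly so it can never silently follow calib_seq_len.
    eval_seq_len: int = 2048


def calib_sweep(sample_counts, seq_lens, eval_seq_len=2048):
    """Build one config per (samples, seq_len) pair, all with the same eval context."""
    return [
        QuantRunConfig(calib_samples=n, calib_seq_len=s, eval_seq_len=eval_seq_len)
        for n, s in product(sample_counts, seq_lens)
    ]


def is_within_noise(ppl_delta, stderr):
    """True when a PPL difference is no larger in magnitude than the standard error."""
    return abs(ppl_delta) <= stderr
```

Because every config in a sweep carries the same eval_seq_len, any PPL difference between runs can only come from the calibration setting, and `is_within_noise` gives a simple check for whether that difference exceeds the measured stderr.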