Tracking improvements across iterations on a single control task shows the impact of scaffold optimization.
Key interventions, ranked by impact: tool-aware prompt (-80.5%), tiered CR (-21.5%), emergency order flip (-5.7%). A 10k context adds overhead on easy tasks because of the model's re-reading behavior.
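The ranking above can be reproduced with a short sketch. It assumes each percentage is a relative change in some tracked cost metric (the source does not name the metric, nor say whether each figure is measured against a common baseline or against the previous iteration). The helper names `relative_change` and `rank_by_impact` are hypothetical and used only for illustration; the three percentages are the ones reported above.

```python
def relative_change(before: float, after: float) -> float:
    """Percent change from `before` to `after`; negative means a reduction."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (after - before) / before * 100.0


def rank_by_impact(changes: dict[str, float]) -> list[tuple[str, float]]:
    """Sort interventions by the magnitude of their percent change, largest first."""
    return sorted(changes.items(), key=lambda item: abs(item[1]), reverse=True)


# Percentages as reported in the summary; the underlying metric is not stated.
reported = {
    "emergency order flip": -5.7,
    "tool-aware prompt": -80.5,
    "tiered CR": -21.5,
}

for name, pct in rank_by_impact(reported):
    print(f"{name}: {pct:+.1f}%")
```

Sorting on the absolute value keeps the ranking meaningful if a later intervention turns out to increase the metric rather than reduce it.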