Top Experiments
| Title | Result | Confidence | Repro | Metrics |
|---|---|---|---|---|
| Initial CR/CT implementation — v7<br>CR/CT round compression reduces context pressure and improves solve time | inconclusive | 1/5 | | |
| Emergency compress order flip — v7b<br>Flipping the emergency compress order (CT collapse first, then tool result compression) preserves recent verbatim results longer | inconclusive | 1/5 | | |
| Pre-scaffold baseline — v6 generalization (5 tasks)<br>Establish the pre-CR/CT baseline resolve rate across diverse SWE-bench tasks | baseline | 1/5 | | |
| Tiered CR compression (S1/S2/S3) — v9<br>Different reasoning lengths need different compression levels: S1 (4-6 sentences, ≥800ch), S2 (2-3 sentences, ≥400ch), S3 (1 sentence, <400ch). All three are generated in one LLM call, and the code picks the appropriate tier | success | 1/5 | | |
| Baseline — Tiered CR compression + TurboQuant KV cache at 8k effective context<br>Establish baseline performance with tiered CR compression (S1/S2/S3), CT emergency collapse, and an 8k effective context window on Qwen3.5-27B | success | 1/5 |