Pre-scaffold baseline — v6 generalization (5 tasks)

baseline
0.38
1/5
Consensus Metrics
swebench_resolve_rate 0.2 (n=1, σ=0)
patches_generated 1 (n=1, σ=0)
tasks_attempted 5 (n=1, σ=0)
avg_time_seconds 206 (n=1, σ=0)
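As a sanity check on the numbers above, the resolve rate can be reproduced by hand. This is a minimal sketch that assumes the "1/5" shown under the title means one resolved task out of five attempted, and that swebench_resolve_rate is simply resolved divided by attempted; the function name `resolve_rate` is hypothetical and not part of any tracker API.

```python
def resolve_rate(resolved: int, attempted: int) -> float:
    """Fraction of attempted tasks that were resolved (assumed definition)."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return resolved / attempted


# Assumption: 1 resolved task out of tasks_attempted = 5, as read from "1/5".
rate = resolve_rate(resolved=1, attempted=5)
print(f"swebench_resolve_rate = {rate:.1f}")
```

With a single run (n=1) the reported σ=0 says nothing about run-to-run variance, so the 0.2 figure is best read as a point estimate.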
Parameters
effective_context_tokens 8000
cr_ct false
tool_aware_prompt false
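For anyone reproducing this run, the three parameters could be captured in a small typed config. The sketch below is illustrative only: the class name `RunParams` is hypothetical, and the field types (an integer and two booleans) are inferred from the values listed above rather than from any documented schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RunParams:
    """Hypothetical container for the parameters listed on this page."""

    effective_context_tokens: int = 8000  # value as recorded above
    cr_ct: bool = False                   # CR/CT disabled for the baseline
    tool_aware_prompt: bool = False       # tool-aware prompt disabled


baseline = RunParams()
print(asdict(baseline))
```

Freezing the dataclass keeps the baseline settings from being changed by accident when later experiments are derived from them.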
Hypothesis

Establish the pre-CR/CT baseline resolve rate across five diverse SWE-bench tasks.

Tags
Instances (0 reproductions)
No instances recorded.