Baseline — pre-scaffold, no context compression

baseline
0.14
1/5
Consensus Metrics
swebench_resolve_rate 0.2 (n=1, σ=0)
patches_generated 1 (n=1, σ=0)
tasks_attempted 5 (n=1, σ=0)
avg_time_seconds 206 (n=1, σ=0)
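
The `(n=1, σ=0)` annotations are what a summary over a single reproduction would produce. A minimal sketch of such an aggregation follows. It assumes the tracker reports a mean with a population standard deviation, which the page does not confirm, and `summarize` is a hypothetical helper name.

```python
from statistics import fmean, pstdev

def summarize(values):
    """Summarize one metric across reproductions as mean, count, and spread.

    Assumption: sigma is a population standard deviation, which is defined
    (and equal to 0) when only a single reproduction exists.
    """
    return {"mean": fmean(values), "n": len(values), "sigma": pstdev(values)}

# One reproduction reported swebench_resolve_rate = 0.2.
print(summarize([0.2]))
```

With one reproduction the spread carries no information, so the baseline figure should be read as a single observation rather than a stable estimate.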
Parameters
effective_context_tokens 8000
cr_ct false
tool_aware_prompt false
memory_budget 600
Hypothesis

Establish a pre-optimization baseline resolve rate across diverse SWE-bench tasks, with no context compression active.

Subject
Model: qwen3.5-27b-q5_k_m
Dataset: swebench-verified
Instances (1 reproduction)
tack-scaffold-experiments claude-opus-4 none (CPU inference)

1/5 patched. Tasks: django-10554, django-13279, scikit-learn-13779, scikit-learn-25931, sympy-17630.

swebench_resolve_rate 0.2
patches_generated 1
tasks_attempted 5
avg_time_seconds 206
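
The instance figures are consistent with a resolve rate computed as patched tasks divided by attempted tasks. A short sketch of that arithmetic, using the task IDs listed for this instance, is shown below. `resolve_rate` is a hypothetical helper, and the page does not say which of the five tasks was patched, so only counts are used.

```python
def resolve_rate(resolved, attempted):
    """Return the fraction of attempted tasks that were resolved."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return resolved / attempted

# Task IDs as listed for this instance.
tasks = [
    "django-10554",
    "django-13279",
    "scikit-learn-13779",
    "scikit-learn-25931",
    "sympy-17630",
]

# 1 of the 5 attempted tasks was patched.
rate = resolve_rate(resolved=1, attempted=len(tasks))
print(f"{rate:.1f}")
```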