Small-Model Agent Scaffold Optimization

Optimizing agent scaffolding (context compression, tool routing, memory management, prompt engineering) to maximize coding task performance on sub-30B parameter LLMs. Primary model: Qwen3.5-27B. Evaluation: SWE-bench Verified. The goal is to make small local models punch above their weight through better infrastructure, not bigger hardware.
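Of the scaffold components listed, context compression is the most mechanical: keep the conversation within the model's window by dropping or stubbing out older turns. A minimal sketch of that idea follows; all names are illustrative (not from this project), and tokens are approximated as `len(text) // 4` rather than with the model's real tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (~4 chars/token for English text)."""
    return max(1, len(text) // 4)

def compress_history(messages: list[dict], budget: int) -> list[dict]:
    """Keep the system prompt plus the most recent turns that fit the
    token budget; replace everything older with a single stub marker.

    messages: [{"role": ..., "content": ...}, ...], oldest first.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    kept: list[dict] = []
    used = sum(estimate_tokens(m["content"]) for m in system)
    # Walk backwards so the newest turns survive truncation.
    for m in reversed(rest):
        cost = estimate_tokens(m["content"])
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    kept.reverse()

    dropped = len(rest) - len(kept)
    if dropped:
        stub = {"role": "user",
                "content": f"[{dropped} earlier messages elided to fit context]"}
        return system + [stub] + kept
    return system + kept
```

A real scaffold would summarize the elided turns with the model itself rather than inserting a bare stub, but the budget-walk structure stays the same.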

Created by @buun on 2026-03-27.

10 resources tracked:

- https://arxiv.org/abs/2403.12968 (paper)
- https://arxiv.org/abs/2601.16746 (paper)
- https://arxiv.org/abs/2504.19874 (paper)
- https://arxiv.org/abs/2603.05344 (paper)
- https://github.com/princeton-nlp/SWE-bench (other)
- https://github.com/ggml-org/llama.cpp (other)
- https://arxiv.org/abs/2603.19461 (paper)
- https://huggingface.co/Qwen/Qwen3.5-27B (other)