Observation masking window size

proposed · high priority · TODO-002
Description

The number of recent rounds kept verbatim before observation masking kicks in affects model performance. The current setting is window=2, i.e., masking begins at current_round - 1. Wider windows waste tokens on stale observations; narrower ones drop context the model still needs. The sweet spot is expected to be around 2-3.
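
For concreteness, a minimal sketch of the windowed masking described above, assuming rounds are (action, observation) pairs and masking replaces older observations with a placeholder string. The names mask_observations and MASKED are hypothetical; the actual logic lives at context_manager.py:189.

```python
MASKED = "[observation omitted]"  # hypothetical placeholder for masked observations

def mask_observations(rounds, window=2):
    """Keep the last `window` rounds' observations verbatim;
    replace older observations with a placeholder.

    rounds: list of dicts like {"action": str, "observation": str}
    """
    cutoff = max(len(rounds) - window, 0)
    out = []
    for i, r in enumerate(rounds):
        if i < cutoff:
            out.append({"action": r["action"], "observation": MASKED})
        else:
            out.append(dict(r))
    return out

# With window=2, only the two most recent observations survive verbatim:
history = [{"action": f"step {i}", "observation": f"obs {i}"} for i in range(5)]
masked = mask_observations(history, window=2)
assert [r["observation"] for r in masked] == [MASKED, MASKED, MASKED, "obs 3", "obs 4"]
```

The sweep over masking_window below would vary the `window` argument while holding everything else fixed.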

Reference

https://arxiv.org/abs/2310.04408

Suggested Parameters
masking_window: [1
file: context_manager.py:189
Provenance
Proposed by @buun via tack-scaffold-experiments (claude-opus-4)