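Running a BERT-style token classifier (XLM-RoBERTa) as a pre-compression step before LLM calls cuts the token count by 2-5x with minimal quality loss on tool results. The classifier labels each token as keep or drop, and only the kept tokens are sent to the LLM.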
https://arxiv.org/abs/2403.12968
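
A minimal sketch of the selection step, assuming the classifier yields one keep-probability per token and that the highest-scoring tokens are kept in their original order until the target ratio is met. `compress_tokens`, `compress_text`, and `toy_keep_probs` are hypothetical names; `toy_keep_probs` is a stopword heuristic standing in for the real XLM-RoBERTa classifier, and whitespace splitting stands in for its subword tokenizer.

```python
import math
from typing import Callable, List, Sequence


def compress_tokens(
    tokens: Sequence[str],
    keep_probs: Sequence[float],
    target_ratio: float,
) -> List[str]:
    """Keep the highest-scoring tokens, preserving their original order.

    target_ratio is the compression factor: 2.0 keeps about half the tokens.
    """
    if len(tokens) != len(keep_probs):
        raise ValueError("tokens and keep_probs must have the same length")
    if target_ratio < 1.0:
        raise ValueError("target_ratio must be >= 1.0")
    if not tokens:
        return []
    n_keep = max(1, math.ceil(len(tokens) / target_ratio))
    # Rank positions by keep-probability; ties go to the earlier token.
    ranked = sorted(range(len(tokens)), key=lambda i: (-keep_probs[i], i))
    # Restore document order so the compressed text stays readable.
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept]


def toy_keep_probs(tokens: Sequence[str]) -> List[float]:
    """Hypothetical stand-in for the classifier: low score for filler words."""
    filler = {"the", "a", "an", "of", "is", "was", "to", "and", "in", "with"}
    return [0.1 if t.lower() in filler else 0.9 for t in tokens]


def compress_text(
    text: str,
    scorer: Callable[[Sequence[str]], List[float]] = toy_keep_probs,
    target_ratio: float = 2.0,
) -> str:
    """Compress a tool result before it is placed in the LLM prompt."""
    tokens = text.split()  # assumption: whitespace tokens, not subwords
    return " ".join(compress_tokens(tokens, scorer(tokens), target_ratio))


if __name__ == "__main__":
    print(compress_text("the status of the job is FAILED with exit code 137"))
    # prints: status job FAILED exit code 137
```

In a real setup the `scorer` argument would wrap the fine-tuned token classifier, and the rest of the pipeline stays the same.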