turbo3 quality generalizes to dense architectures on Apple Silicon
arXiv:2504.19874
The dense model shows a smaller perplexity (PPL) delta vs q8_0 than the MoE model. This is consistent with denser attention patterns being less sensitive to quantization noise.
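
For reference, a minimal sketch of how a PPL delta against a q8_0 baseline might be computed and compared across two models. The helper name `ppl_delta` and all numeric inputs are hypothetical placeholders chosen for illustration; they are not measurements from this work.

```python
def ppl_delta(ppl_quant: float, ppl_ref: float) -> tuple[float, float]:
    """Return (absolute delta, relative delta in percent) vs. a reference PPL."""
    if ppl_ref <= 0:
        raise ValueError("reference perplexity must be positive")
    absolute = ppl_quant - ppl_ref
    return absolute, 100.0 * absolute / ppl_ref


# Placeholder values only (NOT measured results): each pair is
# (PPL with the quantization under test, PPL with the q8_0 baseline).
dense = ppl_delta(ppl_quant=10.5, ppl_ref=10.0)
moe = ppl_delta(ppl_quant=9.0, ppl_ref=8.0)

# A smaller relative delta means the model lost less quality vs. q8_0.
dense_is_less_sensitive = dense[1] < moe[1]
```

Comparing the relative delta rather than the absolute one keeps the comparison meaningful when the two models have different baseline perplexities.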