Reducing constant memory addresses from 8 to 4 via magnitude-only LUT with XOR sign recovery improves decode on pre-M5 Apple Silicon
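Best of the 14 approaches tested. Halves the number of constant memory addresses (4 instead of 8) by storing only magnitudes in the LUT and recovering the sign with an XOR. Sweet spot on Apple8, where 4 divergent constant reads beat any arithmetic-only decode.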
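
As a rough illustration of the idea, here is a minimal CPU-side sketch in Python. It is not the actual kernel. It assumes a 3-bit code whose codebook is symmetric around zero, with bit 2 acting as the sign and bits 0-1 selecting one of 4 magnitudes. The table values, the bit layout, and the names MAG_LUT, FULL_LUT, decode_full, and decode_mag_xor are all hypothetical and chosen only for the example.

```python
import struct

# Hypothetical magnitudes (assumption: the real codebook values are not given
# in this note). All are exactly representable as float32.
MAG_LUT = (0.25, 0.75, 1.5, 3.0)

# Baseline: full 8-entry table, i.e. 8 distinct constant addresses.
FULL_LUT = MAG_LUT + tuple(-m for m in MAG_LUT)


def decode_full(code: int) -> float:
    """Baseline decode: one lookup into the 8-entry table."""
    return FULL_LUT[code & 7]


def decode_mag_xor(code: int) -> float:
    """Magnitude-only decode: 4-entry lookup, sign restored by XOR."""
    mag = MAG_LUT[code & 3]                       # only 4 distinct addresses
    bits = struct.unpack("<I", struct.pack("<f", mag))[0]
    bits ^= (code & 4) << 29                      # move code bit 2 to float32 sign bit 31
    return struct.unpack("<f", struct.pack("<I", bits))[0]


if __name__ == "__main__":
    for code in range(8):
        print(code, decode_full(code), decode_mag_xor(code))
```

Under these assumptions both functions return the same value for every code in 0..7, so the 4-entry table plus one XOR is a drop-in replacement for the 8-entry table. In a GPU kernel the pack/unpack round trip would be a plain bit reinterpretation of the float.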