L1 instruction cache set conflicts, associativity, and code alignment in Go

Summary

A regression in code I didn't touch

A deep dive into L1 instruction cache set conflicts, associativity, and code alignment in Go.

May 19, 2026

In my previous post I briefly touched on how merely shifting code by a couple of bytes can significantly affect hot path performance.

CPUs are weird. They don't just take instructions and run them in order. There are caches, branch predictors, prefetchers, yada yada, and all of it is sensitive to where exactly your code sits in memory. The same hot loop at one address can be a few percent slower at another, just because it crossed some invisible boundary somewhere. Every cache in and around the CPU is a potential source of unexpected performance regressions (or gains) caused by code alignment changes.

The hero of this post is the L1 icache - the fastest CPU cache, which stores CPU instructions. On my machine (Intel i5-12500) it's 32KB, 8-way set associative: 64 sets × 8 ways × 64-byte cachelines. Those numbers matter for the story.

In this post I want to tell you an anecdote about a case where I spent a couple of hours investigating why a change in one piece of code caused a performance regression in a completely unrelated part of the codebase. The root cause was, surprisingly, L1i conflict misses from limited cache associativity.

The Phantom Regression

I was working on improving the compression speed of quality level 2 in go-brrr, my Go port of Brotli.

    go version go1.26.2-X:nodwarf5 linux/amd64

    goos: linux
    goarch: amd64
    pkg: github.com/molecule-man/go-brrr
    cpu: 12th Gen Intel(R) Core(TM) i5-12500
                     │ /tmp/before.txt │           /tmp/after.txt            │
                     │       B/s       │      B/s       vs base              │
    830kb.so.css        297.8Mi ± 0%     304.0Mi ± 0%   +2.08% (p=0.000 n=21)
    005kb.webp.js       126.8Mi ± 1%     122.7Mi ± 0%   -3.24% (p=0.000 n=21)
    011kb.quer.json     348.5Mi ± 0%     344.8Mi ± 0%   -1.08% (p=0.000 n=21)

Compression speed on large files improved (expected).
However, performance on small files regressed by 3% - completely unexpected as my change touched hash2.go ...
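
To make the cache geometry quoted above concrete (64 sets × 8 ways × 64-byte cachelines), here is a minimal sketch of how an address could map to an L1i set. It assumes the common scheme where the set is picked by the address bits just above the line offset; the `setIndex` helper and the base address are hypothetical and are not taken from go-brrr or from the post.

```go
package main

import "fmt"

// Cache geometry as stated in the post for the i5-12500 L1i.
const (
	lineSize = 64 // bytes per cacheline
	numSets  = 64 // number of sets
	numWays  = 8  // associativity: lines that fit in one set
)

// setIndex returns the set an address maps to, assuming the set is
// selected by the address bits directly above the 6-bit line offset.
// This is an illustrative assumption, not a statement about the
// exact hardware behaviour.
func setIndex(addr uintptr) uintptr {
	return (addr / lineSize) % numSets
}

func main() {
	// Addresses that differ by lineSize*numSets bytes share a set.
	const stride = lineSize * numSets

	base := uintptr(0x401000) // hypothetical code address
	fmt.Println("total cache bytes:", lineSize*numSets*numWays)
	fmt.Println("same-set stride:  ", stride)

	// numWays+1 lines at this stride compete for only numWays slots,
	// so at least one of them has to be evicted.
	for i := uintptr(0); i <= numWays; i++ {
		addr := base + i*stride
		fmt.Printf("addr %#x -> set %d\n", addr, setIndex(addr))
	}
}
```

Under these assumptions, any two pieces of hot code whose addresses differ by a multiple of 4096 bytes land in the same set, and a ninth such line cannot coexist with the other eight. That is the kind of conflict miss the post refers to: the cache can have plenty of free space overall while one set is oversubscribed.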

First seen: 2026-05-21 15:01

Last seen: 2026-05-21 21:07