The Cost of Go's Interfaces and How to Fix It
Why dynamic dispatch is necessary and how to reduce its cost with devirtualization and PGO.
We'll start with the fundamentals: what are dynamic dispatch and devirtualization in Go, and how do they relate to interfaces?
In Go, when a function accepts a parameter of an interface type and calls a method on that parameter, Go first needs to figure out at run time which concrete method to execute. The function doesn't know the concrete type, which is the whole point of interfaces in Go. This is referred to as dynamic dispatch.
Let's have a look at a small example:
This piece of code has a main function that creates an instance of ConcreteType, which implements TestInterface by defining the method Something(), which itself is a no-op. It passes the instance to the function AcceptsInterface, which has the parameter i of type TestInterface. AcceptsInterface calls Something() 1 million times on the passed parameter i. Because AcceptsInterface doesn't know the concrete type, it has to figure out which concrete implementation of Something() to call for each of those 1 million calls.
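The listing itself is not reproduced here, so the following is a minimal sketch reconstructed from that description. The type and function names come from the text; the pointer receiver, the `calls` constant, and the `//go:noinline` directive are assumptions made for illustration.

```go
package main

import "fmt"

// TestInterface is the interface named in the text.
type TestInterface interface {
	Something()
}

// ConcreteType implements TestInterface.
// Assumption: a pointer receiver; the text does not say which kind is used.
type ConcreteType struct{}

// Something is intentionally a no-op.
func (c *ConcreteType) Something() {}

// calls is the number of method calls per invocation (1 million in the text).
const calls = 1_000_000

// AcceptsInterface only sees the interface type, so i.Something() is an
// indirect call resolved at run time.
//
// Assumption: go:noinline keeps the compiler from inlining this function
// into main, where the concrete type would be statically visible.
//
//go:noinline
func AcceptsInterface(i TestInterface) {
	for n := 0; n < calls; n++ {
		i.Something()
	}
}

func main() {
	AcceptsInterface(&ConcreteType{})
	fmt.Println("done")
}
```

Without the directive, the compiler may inline AcceptsInterface into main and resolve the call statically, which would hide the effect this post is about.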
What's the impact of dynamic dispatch? Let's benchmark it!
Here's a simple benchmark.
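The benchmark listing is also not reproduced here. A sketch of what it could look like follows; the benchmark name and the use of `b.ReportAllocs()` are assumptions. In a real project this function would live in a `_test.go` file and be run with `go test -bench`; the `main` function below uses `testing.Benchmark` only so the sketch runs as a single file.

```go
package main

import (
	"fmt"
	"testing"
)

type TestInterface interface {
	Something()
}

type ConcreteType struct{}

func (c *ConcreteType) Something() {}

//go:noinline
func AcceptsInterface(i TestInterface) {
	for n := 0; n < 1_000_000; n++ {
		i.Something()
	}
}

// BenchmarkAcceptsInterface (hypothetical name) has the signature that
// `go test -bench` expects.
func BenchmarkAcceptsInterface(b *testing.B) {
	c := &ConcreteType{}
	b.ReportAllocs() // include B/op and allocs/op in the result
	for n := 0; n < b.N; n++ {
		AcceptsInterface(c)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of `go test`.
	res := testing.Benchmark(BenchmarkAcceptsInterface)
	fmt.Println(res.String(), res.MemString())
}
```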
Let's run it.
Ok, interesting result, the first run shows some 13 B/op? Let's have a look at the memory profile.
Alright, looks like it was just the go test framework and the profiling itself. Nothing else allocates, so our code doesn't do any heap allocations. Phew!
Now the most dramatic way to demonstrate the cost of dynamic dispatch is to bypass it: type-assert the parameter to the concrete type, so that the method is no longer called through the interface.
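As a sketch of that change, and assuming the assertion sits at the call site (the original listing is not shown), AcceptsInterface could look like this. `OtherType` and the `panics` helper are hypothetical additions that show what happens when the assertion fails.

```go
package main

import "fmt"

type TestInterface interface {
	Something()
}

type ConcreteType struct{}

func (c *ConcreteType) Something() {}

// OtherType is a hypothetical second implementation, added only to show
// the failure mode of the assertion.
type OtherType struct{}

func (OtherType) Something() {}

//go:noinline
func AcceptsInterface(i TestInterface) {
	for n := 0; n < 1_000_000; n++ {
		// The single-value assertion yields a *ConcreteType, so the call
		// is direct. It panics if i holds any other type.
		i.(*ConcreteType).Something()
	}
}

// panics reports whether f panicked (hypothetical helper).
func panics(f func()) (panicked bool) {
	defer func() { panicked = recover() != nil }()
	f()
	return false
}

func main() {
	AcceptsInterface(&ConcreteType{})
	fmt.Println("ConcreteType: ok")
	fmt.Println("OtherType panics:", panics(func() { AcceptsInterface(OtherType{}) }))
}
```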
And rerun the benchmark.
And compare.
Wow! A ~66% improvement. Of course this is a synthetic example to demonstrate the cost of dynamic dispatch, but I think we've shown there is overhead.
When the compiler applies this optimization by itself, then that's called devirtualization.
Note: Don't do this at home. A single-value type assertion panics if it does not succeed. If you do need this, only ever use a type switch with a fallback.
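A minimal sketch of that safer variant, under the same assumptions as before. `CountingType` is a hypothetical second implementation, added only to show that the fallback path still works for types other than the expected one.

```go
package main

import "fmt"

type TestInterface interface {
	Something()
}

type ConcreteType struct{}

func (c *ConcreteType) Something() {}

// CountingType is a hypothetical implementation that records its calls.
type CountingType struct{ n int }

func (c *CountingType) Something() { c.n++ }

const calls = 1_000_000

//go:noinline
func AcceptsInterface(i TestInterface) {
	for n := 0; n < calls; n++ {
		switch c := i.(type) {
		case *ConcreteType:
			c.Something() // direct call on the known concrete type
		default:
			i.Something() // fallback: regular dynamic dispatch
		}
	}
}

func main() {
	AcceptsInterface(&ConcreteType{}) // takes the direct path

	c := &CountingType{}
	AcceptsInterface(c) // takes the fallback path, no panic
	fmt.Println("fallback calls:", c.n)
}
```

The fast path costs one type check per call, and any other implementation keeps working through the default case.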
Enter Profile-Guided Optimizations (PGO)
Now, wouldn't it be nice if we didn't have to apply this optimization ourselves? It turns out that with Go 1.21, profile-guided optimization (PGO) is now generally available. PGO can be summarized as using profiling data to let the compiler perform optimizations that it could not otherwise know to be possible or worthwhile, but that the profiling data shows are possible and will pay off.
Let's give it a spin. All we need to do is either have a CPU profile that's called default.pgo, or pass a file via the -pgo flag. We'll undo the type-assertion and use the profiling data we took from our previous run.
And compare to the initial run.
Wow, nice! We didn't have to modify our code and still got a 50% improvement! Why "only" 50%? Unlike the type assertion, which would have panicked if the concrete type were not the one we assert to, the devirtualization optimizer of course ensures that our code still functions correctly when it is handed a different concrete type.
The way this works is that the Go compiler knows that the concrete implementation is one that's being called in practice, thanks to the provided profiling data, and therefore inserts the type switch to devirtualize automatically.
What's next?
We've learned that dynamic dispatch can have a significant cost in Go, but remember: never optimize prematurely without measuring that the optimization is worth it. With PGO we can automate it and don't have to think about it or search for the cases where it's worth it. PGO is still very new in the Go compiler toolchain, and while it's already impressive, it's still evolving quickly. I was happy to see that while I was writing this blog post, a new optimization was implemented that combines function inlining with devirtualization.
Lastly, there has always been a bit of a UX issue with PGO, and that is: How can you get representative profiling data from production? The answer is: Use a continuous profiler! And as it so happens, the Parca open-source project and Polar Signals Cloud are currently the only documented solutions that support producing profiling data suitable for Go's PGO.
You can start a free 14-day trial today and try it for yourself with our zero-instrumentation eBPF-based profiler. Deployment only takes seconds!