Correlating Tracing with Profiling using eBPF

How we connected the worlds of tracing and profiling using Parca Agent, the eBPF-based profiler.

March 5, 2024

At Polar Signals, we're constantly on the lookout for ways to enhance the developer experience and provide tools that not only simplify but also amplify your ability to understand and optimize your applications. Today, we're thrilled to announce a feature that has long been requested, and that many thought impossible with eBPF-based profiling: Parca Agent can now extract the current trace ID when collecting stacks.

This new feature, introduced in Parca Agent v0.30.0, enables developers to correlate traces with profiles seamlessly. Previous approaches only allowed viewing per-service profiling data; with Parca Agent's system-wide profiling approach, this enables an entirely new use case: viewing all CPU time used by a single request across every service it touches!

Viewing all CPU profiling data across services by querying data by the trace ID.

If you don't care about how it works, jump straight to the Getting Started section to learn how to use it today in your Go programs!

Why Trace ID Matters

In the realm of observability, tracing and profiling are two sides of the same coin, offering insights into application performance and behavior. Tracing provides a detailed view of request paths through services, helping identify latency and bottlenecks. Profiling, on the other hand, offers a granular look at resource usage (CPU, memory, etc.) across an application's components.

Until now, correlating specific traces with their corresponding profiles involved a manual and often cumbersome process. With the introduction of Trace ID support in Parca Agent, we bridge this gap, allowing for automatic correlation. This means developers can now jump directly from a trace to the related profile, gaining insights into what the application was doing at the exact moment captured by the trace.

How It Works

When Parca Agent collects profiling data, it now also captures the Trace ID associated with the observed stacktrace. The trace ID can then be searched by Parca (or Polar Signals Cloud) queries like any other label (using the trace_id label name).
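
For example, once the data is in Parca, you can filter any profile type by that label in the query bar. The selector below is only illustrative (the exact profile-type name depends on your setup) and uses a made-up trace ID:

parca_agent_cpu:samples:count:cpu:nanoseconds:delta{trace_id="4bf92f3577b34da6a3ce929d0e0e4736"}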

Today we're introducing support for Go, and we plan to add support for additional languages soon. In Go, this works using goroutine labels. As the name implies, goroutine labels are labels set on a goroutine. The Go runtime keeps a pointer to the currently running goroutine in thread-local storage. Because the runtime manages this, we know the memory layout needed to find the goroutine currently running on a thread, and therefore how to find the labels that have been set on it.
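
To make this concrete, here is a small illustration of goroutine labels from the Go side (the trace ID is made up): pprof.Do sets labels on the current goroutine for the duration of the callback, child goroutines inherit them, and pprof.Label reads one back.

package main

import (
	"context"
	"fmt"
	"runtime/pprof"
)

func main() {
	ctx := context.Background()
	// A made-up trace ID, purely for illustration.
	pprof.Do(ctx, pprof.Labels("otel.traceid", "4bf92f3577b34da6a3ce929d0e0e4736"), func(ctx context.Context) {
		// Read the label back from the context; Do also applied it to the current goroutine.
		v, _ := pprof.Label(ctx, "otel.traceid")
		fmt.Println("label on this goroutine:", v)
	})
}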

For simplicity, this walkthrough only discusses the x86-64 architecture, but equivalent mechanisms exist on other architectures.

If you want to check out the non-shortened version of the code, see here.

First, we use the fsbase register to figure out where the thread-local storage address space starts.

tls_base = BPF_CORE_READ(task, thread.fsbase);

Then we add the offset of the runtime.tlsg symbol as described in the ELF binary of the target program. This is done by finding the ELF program header of type PT_TLS and aligning its size. In the future, we will extract this automatically from the target binary, but for now, we've empirically determined that the offset is always 0xfffffffffffffff8 (-8 as a signed 64-bit integer) for Go binaries.
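
As a rough sketch of how that automatic extraction could look (this is not the Parca Agent implementation), the standard library's debug/elf package can locate the PT_TLS segment. On x86-64 the executable's TLS block sits directly below fsbase, and runtime.tlsg is the only (8-byte) TLS variable in a Go binary, so negating the aligned segment size yields the offset:

package main

import (
	"debug/elf"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := elf.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type != elf.PT_TLS {
			continue
		}
		// Round the TLS segment size up to its alignment. The block sits
		// directly below fsbase, so the symbol's offset is the negated size
		// (assuming runtime.tlsg is at offset 0 within the block).
		size := p.Memsz
		if p.Align > 0 {
			size = (size + p.Align - 1) &^ (p.Align - 1)
		}
		fmt.Printf("TLS offset from fsbase: %#x\n", uint64(-int64(size)))
		return
	}
	log.Fatal("no PT_TLS segment found")
}

For a typical Go binary this prints 0xfffffffffffffff8, matching the empirically determined value above.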

fsbase + 0xfffffffffffffff8 leads us to the pointer of the g (aka the goroutine).

bpf_probe_read_user(&g_addr, sizeof(void *), (void*)(tls_base+0xfffffffffffffff8));

From there we add 48 bytes, as that's the offset of the address that points to m (the operating system thread).

bpf_probe_read_user(&m_ptr_addr, sizeof(void *), (void*)(g_addr+48));

From the m we add 192 bytes, as that's the offset of curg (aka the current user-g assigned to m). This chain is necessary because only when g == curg are we running user code rather than on the system stack (e.g. the scheduler), which won't have goroutine labels anyway.

bpf_probe_read_user(&curg_ptr_addr, sizeof(void *), (void*)(m_ptr_addr+192));

Now we've found the goroutine that is currently executing user code! From there we read the pointer at an offset of 344 bytes, which points to the goroutine's labels, a regular Go map, and read the otel.traceid label from it.

bpf_probe_read_user(&labels_map_ptr_ptr, sizeof(void *), (void*)(curg_ptr_addr+344));

For brevity, I'll omit the details of reading a Go map; it's a minimal C re-implementation of the Go map lookup. Check out the full code to understand how it works.

Note: We extracted each of these offsets from DWARF debug info using the llvm-dwarfdump tool.
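
If you prefer to script the lookup rather than read llvm-dwarfdump output by hand, here is a rough sketch (again, not the Parca Agent code) that uses the standard library's debug/dwarf package to find a member offset, for example curg inside runtime.m, assuming the target binary still contains DWARF debug info:

package main

import (
	"debug/dwarf"
	"debug/elf"
	"fmt"
	"log"
	"os"
)

func main() {
	f, err := elf.Open(os.Args[1])
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	d, err := f.DWARF()
	if err != nil {
		log.Fatal(err)
	}

	r := d.Reader()
	for {
		e, err := r.Next()
		if err != nil {
			log.Fatal(err)
		}
		if e == nil {
			break
		}
		if e.Tag != dwarf.TagStructType {
			continue
		}
		if name, _ := e.Val(dwarf.AttrName).(string); name != "runtime.m" {
			continue
		}
		// The struct's members are its child entries; walk them until the
		// null entry (Tag 0) that terminates the children.
		for {
			m, err := r.Next()
			if err != nil || m == nil || m.Tag == 0 {
				break
			}
			if m.Tag != dwarf.TagMember {
				continue
			}
			if mname, _ := m.Val(dwarf.AttrName).(string); mname == "curg" {
				off, _ := m.Val(dwarf.AttrDataMemberLoc).(int64)
				fmt.Printf("runtime.m.curg offset: %d bytes\n", off)
				return
			}
		}
	}
	log.Fatal("member not found")
}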

Getting Started

For the eBPF-based Parca Agent to be able to read the trace ID from the target program, the target program must set goroutine labels, meaning we need a tiny amount of instrumentation. The good news is that this can be achieved with a few lines of code. We've prepared the `otel-profiling-go` module to help you get started even faster, but if you don't want to use yet-another-library, no worries, everything you need is available in the Go standard library! All you need to do is wrap the code you want the trace ID to be attached to within a pprof.Do wrapper, which sets the otel.traceid goroutine label.

import (
	"context"
	"runtime/pprof"

	"go.opentelemetry.io/otel/trace"
)

func myFunction(ctx context.Context) {
	traceID := trace.SpanContextFromContext(ctx).TraceID().String()
	pprof.Do(ctx, pprof.Labels("otel.traceid", traceID), func(ctx context.Context) {
		// Any code that is profiled within this wrapper will have its trace ID attached!
	})
}

Whether you use the library we put together or not, you want to set the goroutine labels as closely as possible to the first distributed tracing span within your process.
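
For an HTTP server, that usually means labeling right where the server span enters your process. Below is a sketch of such a middleware; the otelhttp wrapper and handler names are illustrative, and any OpenTelemetry instrumentation that puts the span into the request context will work:

package main

import (
	"context"
	"log"
	"net/http"
	"runtime/pprof"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
	"go.opentelemetry.io/otel/trace"
)

// withTraceIDLabel sets the otel.traceid goroutine label for the rest of the
// request, so CPU samples taken while handling it carry the trace ID.
func withTraceIDLabel(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		traceID := trace.SpanContextFromContext(r.Context()).TraceID().String()
		pprof.Do(r.Context(), pprof.Labels("otel.traceid", traceID), func(ctx context.Context) {
			next.ServeHTTP(w, r.WithContext(ctx))
		})
	})
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	// otelhttp creates the server span before our middleware runs, so the
	// span context is already on the request context by the time we label.
	handler := otelhttp.NewHandler(withTraceIDLabel(mux), "server")
	log.Fatal(http.ListenAndServe(":8080", handler))
}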

What's next?

For a start, this supports Go 1.22, but we will be adding support for more versions soon. We're also working on extending language support further, so vote for which language you would like to see next!

Acknowledgments

We thank the opentelemetry-go-instrumentation maintainers for already figuring out how to read Go maps from C, in particular, Eden Federman for being a sounding board as I was thinking through how to solve this.

We also thank Michael Pratt of the Go core team for helping me understand the importance of `g` vs `curg`!

Lastly, thanks to Felix Geisendörfer for reporting and fixing a number of goroutine label issues in the Go runtime that made it possible for this work to be correct; those fixes shipped in Go 1.18.
