Offline Mode for the Parca Agent
Introduction
CPU profiling with Parca involves two main components: the Parca agent, which runs as a process called parca-agent on the host being profiled, and the Parca backend, which runs as a process called parca on any host, possibly not the same one. The agent continuously collects stack traces of the code scheduled on the CPU. It then periodically sends the collected stack traces to the backend, where they are stored in a database for future retrieval.
The design of Polar Signals Cloud is exactly analogous: it features a more performant and scalable backend, but it communicates with the Parca agent in the same way. Thus, in this post, "the backend" should be taken to mean either the Parca backend or the Polar Signals Cloud backend.
The Motivation for Offline Mode
Until now, the agent has typically sent data to the backend over the network as it is collected, and a lost network connection usually means that data collected while the network was down is never reported. In a typical modern server workload, this is acceptable: a host losing network connectivity is a rare scenario that means the host is basically useless anyway.
But the world of computing is broader than just servers, and we on the Parca team would like our software to be useful in other kinds of deployments as well. In the modern world, many computerized devices are either never connected to the internet or only unreliably connected: this includes everything from smartphones to autonomous vehicles.
Thus, we decided to develop Offline Mode: a new feature for the Parca agent allowing it to save data locally and upload it for further processing later.
How It Works
Recording the Data
In traditional operation ("online" mode), the agent communicates with the backend via the following stateful protocol: first, it uploads a list of stack IDs (computed by hashing the stacks themselves) along with a count of how many times each stack ID occurred. The backend then responds with the list of IDs for which it needs the full stack trace, and finally, the agent sends the full traces for just those IDs. This allows the backend to cache stacks it has already seen, decreasing network traffic.
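To make the shape of that exchange concrete, here is a small Go sketch of the three steps; the types and values are purely illustrative assumptions and do not reflect Parca's actual gRPC API.

```go
// Illustrative sketch of the online protocol with made-up types and an
// in-memory "backend"; the real agent speaks Parca's gRPC protocol.
package main

import "fmt"

type StackID string // in reality, a hash of the stack trace

func main() {
	// Step 1: the agent reports each stack ID together with how many
	// times it was observed during the reporting interval.
	counts := map[StackID]uint64{"stack-a": 12, "stack-b": 3}

	// Step 2: the backend replies with the IDs it has not cached yet,
	// i.e. the ones it still needs the full trace for.
	known := map[StackID]bool{"stack-a": true}
	var missing []StackID
	for id := range counts {
		if !known[id] {
			missing = append(missing, id)
		}
	}

	// Step 3: the agent sends the full stack traces for just those IDs.
	fullStacks := map[StackID][]string{
		"stack-b": {"main", "handleRequest", "parseBody"},
	}
	for _, id := range missing {
		fmt.Printf("uploading full trace for %s: %v\n", id, fullStacks[id])
	}
}
```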
In offline mode, every five seconds, rather than sending anything to the backend, we write two records to a file (each prefixed with its size in bytes): first, the stack IDs and their counts; second, the full stacks for any IDs that have not yet been recorded in the same file. We then call fsync to ensure data persistence, and finally, update the count of batches in the header of the file.
This format is self-describing and resistant to crashes: since the batch count is not updated until after the batch is synced to disk, an attempt to read a partially written file will only see fully written batches (though it might miss an entire final batch if it was in the process of being written when the agent process terminated).
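As a rough illustration, here is what appending one batch could look like in Go; the header layout (a single uint32 batch count at offset 0) and the uint32 length prefixes are simplified assumptions, not the agent's exact on-disk format.

```go
// Sketch of appending one batch: two length-prefixed records, an fsync,
// and only then a header update. Layout details are assumed.
package offline

import (
	"encoding/binary"
	"io"
	"os"
)

func appendBatch(f *os.File, counts, newStacks []byte) error {
	// Write both records at the end of the file, each prefixed with its
	// size in bytes.
	if _, err := f.Seek(0, io.SeekEnd); err != nil {
		return err
	}
	for _, rec := range [][]byte{counts, newStacks} {
		var size [4]byte
		binary.LittleEndian.PutUint32(size[:], uint32(len(rec)))
		if _, err := f.Write(size[:]); err != nil {
			return err
		}
		if _, err := f.Write(rec); err != nil {
			return err
		}
	}
	// Make the batch durable before advertising it in the header.
	if err := f.Sync(); err != nil {
		return err
	}
	// Only now bump the batch count; a reader of a partially written file
	// therefore never sees a half-written batch.
	var hdr [4]byte
	if _, err := f.ReadAt(hdr[:], 0); err != nil {
		return err
	}
	binary.LittleEndian.PutUint32(hdr[:], binary.LittleEndian.Uint32(hdr[:])+1)
	_, err := f.WriteAt(hdr[:], 0)
	return err
}
```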
Every ten minutes, the storage file is rotated: it is compressed using ZSTD to reduce storage cost, and a new file is started. The files are saved with the scheme {timestamp}-{pid}.padata so that later they can be read in timestamp order.
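A rotation step might look roughly like the following Go sketch, which uses the klauspost zstd package; everything beyond the {timestamp}-{pid}.padata naming scheme (how the finished file is handed over, where compression happens) is an assumption made for illustration.

```go
// Sketch of rotating the storage file: compress the finished file with
// zstd into a new {timestamp}-{pid}.padata file. Details are assumed.
package offline

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"time"

	"github.com/klauspost/compress/zstd"
)

func rotate(storageDir string, finished *os.File) error {
	// Name the file so that a later reader can sort by timestamp.
	name := fmt.Sprintf("%d-%d.padata", time.Now().Unix(), os.Getpid())
	out, err := os.Create(filepath.Join(storageDir, name))
	if err != nil {
		return err
	}
	defer out.Close()

	enc, err := zstd.NewWriter(out) // ZSTD keeps the storage cost down
	if err != nil {
		return err
	}
	if _, err := finished.Seek(0, io.SeekStart); err != nil {
		enc.Close()
		return err
	}
	if _, err := io.Copy(enc, finished); err != nil {
		enc.Close()
		return err
	}
	if err := enc.Close(); err != nil { // flush the final zstd frame
		return err
	}
	return finished.Close()
}
```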
Uploading the Data
Later, the data can be uploaded whenever, and from wherever, the user chooses. The upload does not have to happen on the same machine where the data was recorded, as long as the uploading machine has access to the storage directory where the files were written.
The uploader reads files from the storage directory in the order they were written (sorting by the timestamp in the filename). It uploads samples to the backend using the same protocol the agent uses during normal operation, using the full stacks (the second record in each batch) to answer the backend's requests for traces it hasn't seen. After each file is successfully uploaded, the uploader removes it from the storage directory, so it can pick up where it left off if it's interrupted.
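The core of such an upload loop could look something like this Go sketch, where uploadFile is a hypothetical stand-in for replaying one file against the backend using the protocol described above.

```go
// Sketch of the upload loop: process files in timestamp order and delete
// each one only after a successful upload, so interrupted runs can resume.
package offline

import (
	"os"
	"path/filepath"
	"sort"
)

func uploadAll(storageDir string, uploadFile func(path string) error) error {
	paths, err := filepath.Glob(filepath.Join(storageDir, "*.padata"))
	if err != nil {
		return err
	}
	// With the {timestamp}-{pid}.padata scheme, sorting the names puts
	// the files in roughly chronological order; a real implementation
	// would parse the timestamp out of the filename.
	sort.Strings(paths)

	for _, p := range paths {
		if err := uploadFile(p); err != nil {
			return err // leave the file in place and retry later
		}
		if err := os.Remove(p); err != nil {
			return err
		}
	}
	return nil
}
```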
Try It Out
If you want to profile an x86-64 or aarch64 Linux installation that has reliable access to storage but not to the network, the Parca agent's offline mode might be just what you're looking for. To try it out, run parca-agent with --offline-mode-storage-path=/path/to/storage to begin collecting profiling data locally. The agent will create .padata files in the specified directory, rotating and compressing them every 10 minutes by default.
When you're ready to upload the collected data, run parca-agent with both --offline-mode-storage-path=/path/to/storage and --offline-mode-upload, along with your usual backend configuration flags (like --remote-store-address). The uploader will process all files in timestamp order and remove them after successful upload. This doesn't have to be done on the same machine as collection: nothing stops you from copying the /path/to/storage directory to anywhere that is capable of maintaining a network connection and running parca-agent.
We hope this is useful. Happy profiling!