Set up Scraping for Sending to Polar Signals Cloud
While the primary way to write data to Polar Signals Cloud is the agent, you may want to send additional data that the agent cannot obtain; this page describes how to scrape such data from pprof-compatible HTTP endpoints.
Languages
Here is a table of Languages/runtimes and libraries that can be used to gather additional types of profiling data.
Language/runtime | Heap | Allocations | Blocking | Mutex Contention | Extra |
---|---|---|---|---|---|
Go | Yes | Yes | Yes | Yes | Goroutine, fgprof |
Python | Yes | No | No | No | |
NodeJS | Yes | No | No | No | |
Rust (requires using jemalloc as the allocator) | Yes | No | No | No | |
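For Go, for example, all of the endpoints in the table above are exposed by importing the standard net/http/pprof package. A minimal sketch (the listen address is only an example):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
	"runtime"
)

func main() {
	// Blocking and mutex contention profiles only contain data if their
	// sampling rates are enabled explicitly (a rate of 1 samples every
	// event; tune this for production workloads).
	runtime.SetBlockProfileRate(1)
	runtime.SetMutexProfileFraction(1)

	// Expose the pprof endpoints so the scraper can collect heap, allocation,
	// blocking, mutex, goroutine, and CPU profiles from this process.
	log.Fatal(http.ListenAndServe(":6060", nil))
}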
Setup
The manifests below deploy the Polar Signals scraper, which discovers containers in a Kubernetes cluster and starts scraping them.
The configuration is identical to the Prometheus scrape config; Kubernetes is only used as an example here.
kubectl apply -f polarsignals-scraper.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: polarsignals
labels:
pod-security.kubernetes.io/audit: privileged
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/warn: privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: polarsignals-scraper
subjects:
- kind: ServiceAccount
name: polarsignals-scraper
namespace: polarsignals
---
kind: ConfigMap
metadata:
name: polarsignals-scraper-config
namespace: polarsignals
apiVersion: v1
data:
polarsignals-scraper.yaml: |-
object_storage:
bucket:
type: "FILESYSTEM"
config:
directory: "./data"
# MODIFY THIS TO YOUR NEEDS
scrape_configs:
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_pod_label_app_kubernetes_io_(.+)
replacement: "app_kubernetes_io_$1"
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
action: replace
target_label: container
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
namespace: polarsignals
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
template:
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
app.kubernetes.io/version: v0.20.0
spec:
securityContext:
fsGroup: 65534
runAsUser: 65534
containers:
- args:
- /parca
- --config-path=/var/polarsignals-scraper/polarsignals-scraper.yaml
- --store-address=grpc.polarsignals.com:443
- --bearer-token-file=/var/polarsignals-secret/token
- --mode=scraper-only
image: ghcr.io/parca-dev/parca:v0.20.0
livenessProbe:
exec:
command:
- /grpc_health_probe
- -v
- -addr=:7070
initialDelaySeconds: 5
name: polarsignals-scraper
ports:
- containerPort: 7070
name: http
readinessProbe:
exec:
command:
- /grpc_health_probe
- -v
- -addr=:7070
initialDelaySeconds: 10
terminationMessagePolicy: FallbackToLogsOnError
resources: {}
volumeMounts:
- mountPath: /tmp
name: tmp
- mountPath: /var/polarsignals-scraper
name: polarsignals-scraper-config
- mountPath: /var/polarsignals-secret
name: token
nodeSelector:
kubernetes.io/os: linux
serviceAccountName: polarsignals-scraper
tolerations:
- effect: NoSchedule
operator: Exists
- effect: NoExecute
operator: Exists
volumes:
- emptyDir: {}
name: tmp
- configMap:
name: polarsignals-scraper-config
name: polarsignals-scraper-config
- secret:
secretName: polarsignals-agent
name: token
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
namespace: polarsignals
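The Deployment above reads its bearer token from the file /var/polarsignals-secret/token, which is mounted from a Secret named polarsignals-agent in the polarsignals namespace. Assuming you have already generated a token for your project in Polar Signals Cloud, one way to create that Secret is (the token value is a placeholder):

kubectl create secret generic polarsignals-agent --namespace polarsignals --from-literal=token=<YOUR_TOKEN>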
Configuration
The configuration is identical to the Prometheus scrape config. Anything described in the Prometheus documentation can be used to configure the Polar Signals Scraper too.
A common customization is to not scrape every container in a Kubernetes cluster, but only containers with a specific name. That can be done by adding the following snippet to the relabel_configs list:
- source_labels: [__meta_kubernetes_pod_container_name]
action: keep
regex: my-container
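Similarly, you could restrict scraping to pods in a particular namespace; an illustrative snippet using standard Prometheus relabeling (the namespace name is a placeholder):

- source_labels: [__meta_kubernetes_namespace]
  action: keep
  regex: my-namespace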
profiling_config
Overview
The profiling_config block is used to control how profiling data is scraped from applications using pprof-compatible HTTP endpoints. While the configuration and defaults originate from Go, the system is language-agnostic as long as the application exposes profiling data on the standard HTTP paths. Supported languages include:
- Go (native net/http/pprof)
- Python, Rust, and Node.js (custom exporters that expose data on the same paths)
This configurability allows Polar Signals customers to:
- Limit profiling overhead
- Collect only the most relevant data
- Adapt profiling behavior per environment or language
Basic Structure
- job_name: your-job-name
profiling_config:
pprof_config:
# configuration options here
Default Behavior
By default, the Polar Signals Parca scraper is configured to collect all standard pprof profiles that are typically exposed by Go applications. These include:
- Memory profiling via /debug/pprof/allocs
- Block profiling via /debug/pprof/block
- Goroutine profiling via /debug/pprof/goroutine
- Mutex contention profiling via /debug/pprof/mutex
- CPU profiling via /debug/pprof/profile
Each of these profiles is enabled by default and uses the corresponding endpoint path. This setup mirrors Go’s standard profiler behavior but works with any language that mimics these endpoints. The CPU profiler is configured to scrape delta values, which is suitable for periodic scraping.
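Spelled out explicitly, these defaults correspond roughly to the following profiling_config (a sketch only; see the Parca repository for the authoritative default values):

profiling_config:
  pprof_config:
    memory:
      enabled: true
      path: /debug/pprof/allocs
    block:
      enabled: true
      path: /debug/pprof/block
    goroutine:
      enabled: true
      path: /debug/pprof/goroutine
    mutex:
      enabled: true
      path: /debug/pprof/mutex
    process_cpu:
      enabled: true
      delta: true # CPU profiles are scraped as deltas, as described above
      path: /debug/pprof/profile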
You can customize or disable any of these default profiles using the profiling_config block.
Supported pprof_config Options
- memory
Controls how memory profiling data is collected from /debug/pprof/heap or /debug/pprof/allocs. Example:
memory:
keep_sample_type:
- type: inuse_space
unit: bytes
Sample Types:
- inuse_space (bytes): Memory currently in use
- inuse_objects: Number of live objects
- alloc_space: All bytes ever allocated
- alloc_objects: Total number of allocated objects
- process_cpu
Controls scraping from /debug/pprof/profile. Example:
process_cpu:
enabled: false
Note: Disable this if you're already using the eBPF-based agent from Polar Signals for CPU profiling.
- Other Profiles
Enable or disable additional profiles:
mutex:
enabled: true
block:
enabled: false
goroutine:
enabled: true
These map to /debug/pprof/mutex, /debug/pprof/block, and /debug/pprof/goroutine respectively.
- Custom Profiles (e.g., fgprof)
You can also scrape custom profiles that are not part of the standard pprof set, such as fgprof, a sampling Go profiler that captures both On-CPU and Off-CPU (wall-clock) time.
Example:
pprof_config:
fgprof:
enabled: true
path: /debug/fgprof
This allows you to include completely different profiling data types by configuring them under a custom key.
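On the application side (for Go), the fgprof endpoint is typically exposed with the github.com/felixge/fgprof package; a minimal sketch, assuming the route matches the path configured above:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // standard pprof endpoints under /debug/pprof/

	"github.com/felixge/fgprof"
)

func main() {
	// Serve fgprof profiles on the same path the scraper is configured to use.
	http.Handle("/debug/fgprof", fgprof.Handler())
	log.Fatal(http.ListenAndServe(":6060", nil))
}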
Best Practices:
- Collect only what’s needed: Reduces overhead and storage.
- Prefer in-use memory (inuse_space) over historical allocations (alloc_space).
- Avoid scraping CPU profiles if you are already using the eBPF-based agent.
- Split jobs to customize scrape intervals or profiles per target (see the sketch below).
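For instance, splitting targets across jobs might look like this (an illustrative sketch; job names and intervals are placeholders):

scrape_configs:
  - job_name: 'frequent-memory-profiles'
    scrape_interval: 45s
    profiling_config:
      pprof_config:
        process_cpu:
          enabled: false
  - job_name: 'occasional-contention-profiles'
    scrape_interval: 5m
    profiling_config:
      pprof_config:
        mutex:
          enabled: true
        block:
          enabled: true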
Example Minimal Config
- job_name: memory-scraper
profiling_config:
pprof_config:
process_cpu:
enabled: false
memory:
keep_sample_type:
- type: inuse_space
unit: bytes
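If you modify the scrape configuration in the polarsignals-scraper-config ConfigMap, re-apply the manifest and restart the scraper Deployment so the new file is picked up:

kubectl apply -f polarsignals-scraper.yaml
kubectl -n polarsignals rollout restart deployment polarsignals-scraper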
Summary
- profiling_config is flexible and works with any language that serves compatible endpoints.
- Defaults come from Go, but they can be adapted freely.
- Review the Parca repo for up-to-date defaults and implementation details.
Need Help?
Share your configuration with the Polar Signals team and we’ll help you optimize for observability, performance, and scale. Feel free to join our Discord server and ask!