Set up Scraping for Sending to Polar Signals Cloud
While the primary way to write data to Polar Signals Cloud is the agent, you may want to send additional data that the agent cannot obtain; this page describes how to scrape such data from pprof-compatible HTTP endpoints.
Languages
Here is a table of Languages/runtimes and libraries that can be used to gather additional types of profiling data.
Language/runtime | Heap | Allocations | Blocking | Mutex Contention | Extra |
---|---|---|---|---|---|
Go | Yes | Yes | Yes | Yes | Goroutine, fgprof |
Python | Yes | No | No | No | |
NodeJS | Yes | No | No | No | |
Rust (requires using jemalloc as the allocator) | Yes | No | No | No | |
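For Go, for example, all of the endpoints in the table above are exposed by importing the standard net/http/pprof package. A minimal sketch (the listen address is only an example):

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers on the default mux
	"runtime"
)

func main() {
	// Blocking and mutex contention profiles only contain data if their
	// sampling rates are enabled explicitly (a rate of 1 samples every
	// event; tune this for production workloads).
	runtime.SetBlockProfileRate(1)
	runtime.SetMutexProfileFraction(1)

	// Expose the pprof endpoints so the scraper can collect heap, allocation,
	// blocking, mutex, goroutine, and CPU profiles from this process.
	log.Fatal(http.ListenAndServe(":6060", nil))
}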
Setup
The manifests below deploy the Polar Signals scraper, which discovers containers in a Kubernetes cluster and starts scraping them.
The configuration is identical to the Prometheus scrape config; Kubernetes is only used as an example here.
kubectl apply -f polarsignals-scraper.yaml
---
apiVersion: v1
kind: Namespace
metadata:
name: polarsignals
labels:
pod-security.kubernetes.io/audit: privileged
pod-security.kubernetes.io/enforce: privileged
pod-security.kubernetes.io/warn: privileged
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
rules:
- apiGroups:
- ""
resources:
- pods
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: polarsignals-scraper
subjects:
- kind: ServiceAccount
name: polarsignals-scraper
namespace: polarsignals
---
kind: ConfigMap
metadata:
name: polarsignals-scraper-config
namespace: polarsignals
apiVersion: v1
data:
polarsignals-scraper.yaml: |-
object_storage:
bucket:
type: "FILESYSTEM"
config:
directory: "./data"
# MODIFY THIS TO YOUR NEEDS
scrape_configs:
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
relabel_configs:
- action: labelmap
regex: __meta_kubernetes_pod_label_app_kubernetes_io_(.+)
replacement: "app_kubernetes_io_$1"
- source_labels: [__meta_kubernetes_namespace]
action: replace
target_label: namespace
- source_labels: [__meta_kubernetes_pod_name]
action: replace
target_label: pod
- source_labels: [__meta_kubernetes_pod_container_name]
action: replace
target_label: container
---
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
namespace: polarsignals
spec:
replicas: 1
selector:
matchLabels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
template:
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
app.kubernetes.io/version: v0.20.0
spec:
securityContext:
fsGroup: 65534
runAsUser: 65534
containers:
- args:
- /parca
- --config-path=/var/polarsignals-scraper/polarsignals-scraper.yaml
- --store-address=grpc.polarsignals.com:443
- --bearer-token-file=/var/polarsignals-secret/token
- --mode=scraper-only
image: ghcr.io/parca-dev/parca:v0.20.0
livenessProbe:
exec:
command:
- /grpc_health_probe
- -v
- -addr=:7070
initialDelaySeconds: 5
name: polarsignals-scraper
ports:
- containerPort: 7070
name: http
readinessProbe:
exec:
command:
- /grpc_health_probe
- -v
- -addr=:7070
initialDelaySeconds: 10
terminationMessagePolicy: FallbackToLogsOnError
resources: {}
volumeMounts:
- mountPath: /tmp
name: tmp
- mountPath: /var/polarsignals-scraper
name: polarsignals-scraper-config
- mountPath: /var/polarsignals-secret
name: token
nodeSelector:
kubernetes.io/os: linux
serviceAccountName: polarsignals-scraper
tolerations:
- effect: NoSchedule
operator: Exists
- effect: NoExecute
operator: Exists
volumes:
- emptyDir: {}
name: tmp
- configMap:
name: polarsignals-scraper-config
name: polarsignals-scraper-config
- secret:
secretName: polarsignals-agent
name: token
---
apiVersion: v1
kind: ServiceAccount
metadata:
labels:
app.kubernetes.io/component: profile-scraper
app.kubernetes.io/instance: polarsignals-scraper
app.kubernetes.io/name: polarsignals-scraper
name: polarsignals-scraper
namespace: polarsignals
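The Deployment above reads its bearer token from the file /var/polarsignals-secret/token, which is mounted from a Secret named polarsignals-agent in the polarsignals namespace. Assuming you have already generated a token for your project in Polar Signals Cloud, one way to create that Secret is (the token value is a placeholder):

kubectl create secret generic polarsignals-agent --namespace polarsignals --from-literal=token=<YOUR_TOKEN>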
Configuration
The configuration is identical to the Prometheus scrape config. Anything described in the Prometheus documentation can be used to configure the Polar Signals Scraper too.
A common customization is to not scrape every container in a Kubernetes cluster, but only containers with a specific name. That can be done by adding the following snippet to the relabel_configs list:
- source_labels: [__meta_kubernetes_pod_container_name]
action: keep
regex: my-container
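Similarly, you could restrict scraping to pods in a particular namespace; an illustrative snippet using standard Prometheus relabeling (the namespace name is a placeholder):

- source_labels: [__meta_kubernetes_namespace]
  action: keep
  regex: my-namespace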
profiling_config
Overview
The profiling_config block is used to control how profiling data is scraped from applications using pprof-compatible HTTP endpoints. While the configuration and defaults originate from Go, the system is language-agnostic as long as the application exposes profiling data on the standard HTTP paths. Supported languages include:
- Go (native net/http/pprof)
- Python, Rust, and Node.js (custom exporters that expose data on the same paths)
This configurability allows Polar Signals customers to:
- Limit profiling overhead
- Collect only the most relevant data
- Adapt profiling behavior per environment or language
Basic Structure
- job_name: your-job-name
profiling_config:
pprof_config:
# configuration options here
Default Behavior
By default, the Polar Signals Parca scraper is configured to collect all standard pprof profiles that are typically exposed by Go applications. These include:
- Memory profiling via /debug/pprof/allocs
- Block profiling via /debug/pprof/block
- Goroutine profiling via /debug/pprof/goroutine
- Mutex contention profiling via /debug/pprof/mutex
- CPU profiling via /debug/pprof/profile
Each of these profiles is enabled by default and uses the corresponding endpoint path. This setup mirrors Go’s standard profiler behavior but works with any language that mimics these endpoints. The CPU profiler is configured to scrape delta values, which is suitable for periodic scraping.
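Spelled out explicitly, these defaults correspond roughly to the following profiling_config (a sketch only; see the Parca repository for the authoritative default values):

profiling_config:
  pprof_config:
    memory:
      enabled: true
      path: /debug/pprof/allocs
    block:
      enabled: true
      path: /debug/pprof/block
    goroutine:
      enabled: true
      path: /debug/pprof/goroutine
    mutex:
      enabled: true
      path: /debug/pprof/mutex
    process_cpu:
      enabled: true
      delta: true # CPU profiles are scraped as deltas, as described above
      path: /debug/pprof/profile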
You can customize or disable any of these default profiles using the profiling_config block.
Supported pprof_config Options
- memory
Controls how memory profiling data is collected from /debug/pprof/heap or /debug/pprof/allocs. Example:
memory:
keep_sample_type:
- type: inuse_space
unit: bytes
Sample Types:
- inuse_space (bytes): Memory currently in use
- inuse_objects: Number of live objects
- alloc_space: All bytes ever allocated
- alloc_objects: Total number of allocated objects
- process_cpu
Controls scraping from /debug/pprof/profile. Example:
process_cpu:
enabled: false
Note: Disable this if you're already using the eBPF-based agent from Polar Signals for CPU profiling.
- Other Profiles
Enable or disable additional profiles:
mutex:
enabled: true
block:
enabled: false
goroutine:
enabled: true
These map to /debug/pprof/mutex, /debug/pprof/block, and /debug/pprof/goroutine respectively.
- Custom Profiles (e.g., fgprof)
You can also scrape custom profiles that are not part of the standard pprof set, such as fgprof, a sampling Go profiler that captures both On-CPU and Off-CPU (wall-clock) time.
Example:
pprof_config:
fgprof:
enabled: true
path: /debug/fgprof
This allows you to include completely different profiling data types by configuring them under a custom key.
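On the application side (for Go), the fgprof endpoint is typically exposed with the github.com/felixge/fgprof package; a minimal sketch, assuming the route matches the path configured above:

package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // standard pprof endpoints under /debug/pprof/

	"github.com/felixge/fgprof"
)

func main() {
	// Serve fgprof profiles on the same path the scraper is configured to use.
	http.Handle("/debug/fgprof", fgprof.Handler())
	log.Fatal(http.ListenAndServe(":6060", nil))
}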
Best Practices:
- Collect only what’s needed: Reduces overhead and storage.
- Prefer in-use memory (inuse_space) over historical allocations (alloc_space).
- Avoid scraping CPU profiles if you are already using the eBPF-based agent.
- Split jobs to customize scrape intervals or profiles per target (see the sketch below).
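For instance, splitting targets across jobs might look like this (an illustrative sketch; job names and intervals are placeholders):

scrape_configs:
  - job_name: 'frequent-memory-profiles'
    scrape_interval: 45s
    profiling_config:
      pprof_config:
        process_cpu:
          enabled: false
  - job_name: 'occasional-contention-profiles'
    scrape_interval: 5m
    profiling_config:
      pprof_config:
        mutex:
          enabled: true
        block:
          enabled: true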
Example Minimal Config
- job_name: memory-scraper
profiling_config:
pprof_config:
process_cpu:
enabled: false
memory:
keep_sample_type:
- type: inuse_space
unit: bytes
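If you modify the scrape configuration in the polarsignals-scraper-config ConfigMap, re-apply the manifest and restart the scraper Deployment so the new file is picked up:

kubectl apply -f polarsignals-scraper.yaml
kubectl -n polarsignals rollout restart deployment polarsignals-scraper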
Summary
- profiling_config is flexible and works with any language that serves compatible endpoints.
- Defaults come from Go, but they can be adapted freely.
- Review the Parca repo for up-to-date defaults and implementation details.
Need Help?
Share your configuration with the Polar Signals team and we’ll help you optimize for observability, performance, and scale. Feel free to join our Discord server and ask!