Distributed tracing experiments

This document explains our current tracing capabilities and the experiments we are running for Atlas and for some customers.

Definitions

To get a good understanding of distributed tracing, reading the OpenTelemetry documentation is strongly recommended.

Grafana Tempo

As an experiment for customers and for ourselves, we have started allowing the use of Grafana Tempo on Vintage AWS management clusters.

Grafana Tempo is a trace backend that stores distributed traces and is meant to be queried through Grafana. It can be deployed on demand by customers and internal teams alike.

Accessing Tempo

Open Grafana on the installation you’re interested in and go to the Explore section. There, you can select the Tempo datasource and start browsing traces.

(Screenshot: example of Loki traces stored in Tempo, accessed through Grafana.)
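For reference, the Tempo datasource in Grafana is simply a provisioned datasource pointing at Tempo’s query path. The following is a minimal provisioning sketch; the `tempo-query-frontend` service name and port are assumptions based on tempo-distributed defaults, not necessarily what runs on our MCs, where the datasource is already provisioned:

```yaml
# Minimal Grafana datasource provisioning sketch for Tempo.
# The URL is an assumption (tempo-distributed query-frontend defaults).
apiVersion: 1
datasources:
  - name: Tempo
    type: tempo
    access: proxy
    url: http://tempo-query-frontend.tempo.svc:3100
```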

Inner workings

Our Tempo deployment does not currently use multi-tenancy, so traces are shared between all Grafana users.

Tempo is deployed in microservices mode and can ingest traces either through the Tempo gateway or the Tempo distributor. The main difference between the two is that the tempo-gateway only supports HTTP(S), whereas the distributor also supports gRPC.
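As a rough sketch, pointing an OTLP-exporting application at either ingestion path could look like the following. The gateway endpoint matches the one used for Prometheus below; the distributor service name and OTLP gRPC port 4317 are assumptions based on tempo-distributed defaults:

```yaml
# Sketch: OTLP exporter settings for an instrumented application.
env:
  - name: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
    value: "http/protobuf" # HTTP(S) path, via the tempo-gateway
  - name: OTEL_EXPORTER_OTLP_ENDPOINT
    value: "http://tempo-gateway.tempo.svc:80"
  # Alternatively, over gRPC via the distributor (service name and
  # port 4317 are assumptions based on tempo-distributed defaults):
  # - name: OTEL_EXPORTER_OTLP_TRACES_PROTOCOL
  #   value: "grpc"
  # - name: OTEL_EXPORTER_OTLP_ENDPOINT
  #   value: "http://tempo-distributor.tempo.svc:4317"
```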

For our initial experiment, we decided to work without a tracing agent such as the opentelemetry-collector to reduce our footprint and effort, but this might change in the future.

On-demand

Tempo can be enabled at the installation level, as in this example: https://github.com/giantswarm/config/blob/4e01cbc6de32420d111d78e0e278e72a31e5daba/installations/gaia/config.yaml.patch#L9
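For illustration only, such a patch boils down to flipping a flag in the installation’s config. The key below (`tracing.enabled`) is a hypothetical placeholder; the real one is in the file linked above:

```yaml
# Hypothetical placeholder key; see the linked config.yaml.patch for the real one.
tracing:
  enabled: true
```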

Ingesting traces

Loki

If tracing is enabled at the installation level, Loki running on the MC is configured to send its traces to Tempo.
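As a hedged sketch of what that wiring looks like: Loki’s tracing support is based on the Jaeger client, which is configured through environment variables on the Loki pods. The endpoint below assumes the Tempo distributor exposes the Jaeger Thrift-over-HTTP receiver on its default port 14268; the actual MC configuration may differ:

```yaml
# Sketch: Jaeger client env vars on the Loki pods (assumed values).
env:
  - name: JAEGER_ENDPOINT
    value: "http://tempo-distributor.tempo.svc:14268/api/traces"
  - name: JAEGER_SAMPLER_TYPE
    value: "const" # sample every trace
  - name: JAEGER_SAMPLER_PARAM
    value: "1"
```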

Prometheus

To enable tracing for Prometheus, set the following configuration on the Prometheus CR:

```yaml
spec:
  tracingConfig:
    clientType: http
    endpoint: tempo-gateway.tempo.svc:80 # Tempo gateway service on the MC
    insecure: false
    samplingFraction: "0.5" # sampling of 1/2, so we only send half of the traces
```

Autoinstrumentation with eBPF

Distributed tracing is not a feature that comes by default in applications: they need to be instrumented with OpenTelemetry SDKs to be able to export traces.

However, there are cases where it is not possible to get traces from applications (black-box or vendored applications, or languages without OpenTelemetry SDKs, such as R). This is a big blind spot in distributed tracing. The good news is that we can now rely on eBPF to get basic traces from such applications. After experimenting with a few tools like odigos, we tried out Grafana Beyla once it was released in version 1.0.0, and it looks to be the most mature solution: it does not depend on many components that can fail, and it actually exports traces from applications.

We ran it as an experiment for one customer. Should the need arise again, the following shows how we deployed it to trace an application written in R, alongside an opentelemetry-collector:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  labels:
    app: beyla
  name: beyla
---
apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app: beyla
  name: beyla-config
  namespace: beyla
data:
  # Beyla's own configuration file
  beyla-config.yml: |
    grafana:
      otlp:
        submit: ["traces"]
    otel_traces_export:
      sampler:
        name: parentbased_traceidratio
        arg: 1
    discovery:
      services:
        - namespace: citadel
          open_ports: 2000-15000
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: beyla
  namespace: beyla
  labels:
    app: beyla
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: beyla
  namespace: beyla
  labels:
    app: beyla
spec:
  selector:
    matchLabels:
      app: beyla
  template:
    metadata:
      labels:
        app: beyla
    spec:
      serviceAccountName: beyla
      tolerations:
        - key: risk-perf
          operator: Exists
        - key: importer
          operator: Exists
        - key: internal-raptor
          operator: Exists
      hostPID: true # Required to access the processes on the host
      nodeSelector:
        kubernetes.io/os: linux
      volumes:
        - name: beyla-config
          configMap:
            name: beyla-config
      containers:
        - name: autoinstrument
          image: grafana/beyla:1.0
          command: ["/beyla", "--config=/config/beyla-config.yml"]
          securityContext:
            runAsUser: 0
            privileged: true # Alternative to the capabilities.add SYS_ADMIN setting
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://my-opentelemetry-collector.default.svc:4317"
            - name: BEYLA_LOG_LEVEL
              value: DEBUG
          volumeMounts:
            - mountPath: /config
              name: beyla-config
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: beyla:privileged
subjects:
  - kind: ServiceAccount
    name: beyla
    namespace: beyla
roleRef:
  kind: ClusterRole
  name: privileged-psp-user
  apiGroup: rbac.authorization.k8s.io
```

Future

Ideas for the future:

  • Deploy Tempo by default on all MCs
  • Deploy an OpenTelemetry agent in all clusters, using either the opentelemetry-operator, the opentelemetry-collector, or grafana-agent