
Distributed Tracing

Follow a single request as it hops across services, queues, and databases. Distributed tracing gives you a unified timeline of every span so you can pinpoint latency, errors, and bottlenecks without grepping scattered logs.

Core concepts

Trace — the full end-to-end journey of a request across all services.
Span — a single unit of work with a start time, duration, and parent reference.
Context propagation — passing trace IDs across HTTP headers, message brokers, or gRPC metadata.
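
As a rough illustration of context propagation over HTTP, the sketch below writes and reads a W3C Trace Context traceparent header (version 00, a 32-hex-character trace ID, a 16-hex-character parent span ID, and a flags byte). The SpanContext class and the inject, extract, and child_of helpers are hypothetical names chosen for this example, not the API of any tracing SDK.

```python
import re
import secrets
from dataclasses import dataclass
from typing import Dict, Optional

# traceparent layout: version "00", trace-id, parent span-id, flags (all lowercase hex).
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


@dataclass(frozen=True)
class SpanContext:  # hypothetical helper type for this sketch
    trace_id: str   # 32 hex characters, shared by every span in the trace
    span_id: str    # 16 hex characters, unique per span
    sampled: bool = True


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Write the current span's context into outgoing request headers."""
    flags = "01" if ctx.sampled else "00"
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Read the caller's context from incoming headers, or None if absent or invalid."""
    match = TRACEPARENT_RE.match(headers.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero IDs are not valid
    return SpanContext(trace_id, span_id, sampled=bool(int(flags, 16) & 1))


def child_of(parent: SpanContext) -> SpanContext:
    """Start a child span: keep the trace ID, mint a new span ID."""
    return SpanContext(parent.trace_id, secrets.token_hex(8), parent.sampled)
```

A calling service would run inject before sending a request, and the receiving service would run extract followed by child_of, so that both spans share one trace ID.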

How it works

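The first service to receive a request generates a unique trace-id, unless the request already carries one from an upstream caller.
Each service creates a span with its own span-id and a reference to its parent span.
Spans are batched and exported to a collector or tracing backend, for example over OTLP or to Jaeger or Zipkin.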
The collector assembles the trace tree and surfaces flame graphs, waterfall views, and latency heatmaps.
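
The assembly step can be sketched in a few lines. This is a minimal illustration, assuming spans arrive as flat records in arbitrary order and that a root span has no parent; the Span fields and the assemble and waterfall functions are hypothetical names, not a real collector's API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:  # hypothetical record shape for this sketch
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: float
    duration_ms: float


def assemble(spans: List[Span]) -> Dict[Optional[str], List[Span]]:
    """Group spans by parent ID and order siblings by start time."""
    children: Dict[Optional[str], List[Span]] = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)
    for siblings in children.values():
        siblings.sort(key=lambda s: s.start_ms)
    return children


def waterfall(spans: List[Span]) -> List[str]:
    """Render the trace tree as indented lines, one per span."""
    children = assemble(spans)
    lines: List[str] = []

    def visit(span: Span, depth: int) -> None:
        indent = "  " * depth
        lines.append(f"{indent}{span.name} [{span.start_ms:.0f}ms +{span.duration_ms:.0f}ms]")
        for child in children.get(span.span_id, []):
            visit(child, depth + 1)

    for root in children.get(None, []):
        visit(root, 0)
    return lines


# Spans exported out of order by three services (illustrative data).
spans = [
    Span("t1", "c", "b", "SELECT orders", 12, 30),
    Span("t1", "a", None, "GET /checkout", 0, 80),
    Span("t1", "b", "a", "orders-service", 10, 50),
    Span("t1", "d", "a", "payments-service", 62, 15),
]
print("\n".join(waterfall(spans)))
```

Because the tree is rebuilt from parent references alone, services can export their spans independently and in any order.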

Key benefits

Latency attribution

See which service or database query consumed the most time in a request.
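
One way to attribute latency is to compute each span's self time: its duration minus the time covered by its direct children. The sketch below assumes spans carry start and end timestamps in milliseconds and merges overlapping child intervals so that parallel calls are not counted twice. The Span shape and the self_time_ms function are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:  # hypothetical record shape for this sketch
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: float
    end_ms: float


def self_time_ms(spans: List[Span]) -> Dict[str, float]:
    """Map span_id to the time spent in the span itself, excluding its children."""
    children: Dict[Optional[str], List[Span]] = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)

    result: Dict[str, float] = {}
    for span in spans:
        # Clip child intervals to the parent, then walk them in start order.
        intervals = sorted(
            (max(c.start_ms, span.start_ms), min(c.end_ms, span.end_ms))
            for c in children.get(span.span_id, [])
        )
        covered = 0.0
        cursor = span.start_ms
        for lo, hi in intervals:
            lo = max(lo, cursor)  # skip the part already counted
            if hi > lo:
                covered += hi - lo
                cursor = hi
        result[span.span_id] = (span.end_ms - span.start_ms) - covered
    return result


# Illustrative trace: two downstream calls overlap between 50 ms and 60 ms.
spans = [
    Span("a", None, "GET /checkout", 0, 100),
    Span("b", "a", "orders-service", 10, 60),
    Span("c", "b", "SELECT orders", 15, 55),
    Span("d", "a", "payments-service", 50, 80),
]
times = self_time_ms(spans)
slowest = max(times, key=times.get)  # "c", the database query
```

Ranking spans by self time rather than total duration keeps a parent from being blamed for time that was spent in its children.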

Error root cause

Trace errors back to the exact span where the failure originated.
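
A simple heuristic for finding where a failure originated, assuming errors propagate upward from child to parent, is to pick the error spans that have no failing child of their own. The Span shape, its error flag, and the originating_errors function below are hypothetical, and real traces may need richer rules (for example, handling retries or swallowed errors).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:  # hypothetical record shape for this sketch
    span_id: str
    parent_id: Optional[str]
    name: str
    error: bool = False


def originating_errors(spans: List[Span]) -> List[Span]:
    """Return error spans none of whose direct children also failed."""
    children: Dict[Optional[str], List[Span]] = defaultdict(list)
    for span in spans:
        children[span.parent_id].append(span)
    return [
        span
        for span in spans
        if span.error and not any(c.error for c in children.get(span.span_id, []))
    ]


# Illustrative trace: the failure starts in the database span and bubbles up.
spans = [
    Span("a", None, "GET /checkout", error=True),
    Span("b", "a", "orders-service", error=True),
    Span("c", "b", "SELECT orders", error=True),
    Span("d", "a", "payments-service"),
]
origins = [span.name for span in originating_errors(spans)]  # ["SELECT orders"]
```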

Dependency map

Auto-generated topology of every service-to-service call.
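
A dependency map can be derived from parent references alone. The sketch below assumes each span records the name of the service that emitted it, and counts an edge whenever a span's parent belongs to a different service. The Span shape and the service_edges function are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:  # hypothetical record shape for this sketch
    span_id: str
    parent_id: Optional[str]
    service: str


def service_edges(spans: List[Span]) -> Counter:
    """Count (caller, callee) pairs where a call crosses a service boundary."""
    by_id = {span.span_id: span for span in spans}
    edges: Counter = Counter()
    for span in spans:
        parent = by_id.get(span.parent_id)
        if parent is not None and parent.service != span.service:
            edges[(parent.service, span.service)] += 1
    return edges


# Illustrative spans: the gateway calls orders twice and payments once.
spans = [
    Span("a", None, "gateway"),
    Span("b", "a", "orders"),
    Span("c", "b", "orders"),      # internal work, not a cross-service call
    Span("d", "b", "postgres"),
    Span("e", "a", "orders"),
    Span("f", "a", "payments"),
]
edges = service_edges(spans)
```

Aggregating these counts across many traces yields the weighted graph that a topology view would draw.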

SLO monitoring

Alert on p95 latency thresholds derived from real trace data.
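
As a sketch of the p95 check, the snippet below computes a percentile over root-span durations with the nearest-rank method and compares it to a threshold. The percentile and slo_breached functions and the sample durations are illustrative assumptions; production systems often use histograms or streaming estimators instead of sorting raw values.

```python
from typing import List


def percentile(values: List[float], p: int) -> float:
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    if not values:
        raise ValueError("no durations to summarize")
    ordered = sorted(values)
    # ceil(p / 100 * n) in integer arithmetic, clamped to at least rank 1.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def slo_breached(durations_ms: List[float], threshold_ms: float, p: int = 95) -> bool:
    """True when the p-th percentile latency exceeds the threshold."""
    return percentile(durations_ms, p) > threshold_ms


# Illustrative root-span durations for 20 requests: 10 ms, 20 ms, ..., 200 ms.
durations = [float(ms) for ms in range(10, 210, 10)]
p95 = percentile(durations, 95)  # 190.0
```

With this sample, a 150 ms threshold would trigger an alert and a 200 ms threshold would not.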

Ready to instrument your stack? Check the getting started guide for OpenTelemetry SDK setup.