Distributed Tracing
Follow a single request as it hops across services, queues, and databases. Distributed tracing gives you a unified timeline of every span so you can pinpoint latency, errors, and bottlenecks without grepping scattered logs.
Core concepts
- Trace — the full end-to-end journey of a request across all services.
- Span — a single unit of work with a start time, duration, and parent reference.
- Context propagation — passing the trace ID and parent span ID between services via HTTP headers, message broker headers, or gRPC metadata.
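To make context propagation concrete, here is a minimal sketch that writes and reads a `traceparent` HTTP header in the W3C Trace Context layout (`version-traceid-parentid-flags`). The helper names `inject` and `extract` are illustrative, not part of any particular SDK, and the validation is simplified; in practice an OpenTelemetry propagator would normally handle this for you.

```python
import re
import secrets

# W3C Trace Context layout: 2-hex version, 32-hex trace ID,
# 16-hex parent span ID, 2-hex flags, joined by dashes.
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def new_trace_id() -> str:
    """Return a random 16-byte trace ID as 32 lowercase hex characters."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """Return a random 8-byte span ID as 16 lowercase hex characters."""
    return secrets.token_hex(8)


def inject(headers: dict, trace_id: str, span_id: str, sampled: bool = True) -> None:
    """Write the current trace context into outgoing request headers."""
    flags = "01" if sampled else "00"
    headers["traceparent"] = f"00-{trace_id}-{span_id}-{flags}"


def extract(headers: dict):
    """Read (trace_id, parent_span_id) from incoming headers, or None if absent or malformed."""
    match = _TRACEPARENT.fullmatch(headers.get("traceparent", ""))
    if match is None:
        return None
    _version, trace_id, parent_id, _flags = match.groups()
    # All-zero IDs are reserved as invalid.
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return trace_id, parent_id


# Caller side: attach the context to an outgoing request.
outgoing = {}
inject(outgoing, new_trace_id(), new_span_id())

# Callee side: continue the same trace, using the caller's span as parent.
context = extract(outgoing)
```

The callee keeps the extracted trace ID, generates a fresh span ID for its own work, and records the extracted span ID as its parent.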
How it works
- An incoming request generates a unique trace-id.
- Each service creates a span with its own span-id and a reference to its parent span.
- Spans are batched and exported to a collector or tracing backend (for example over OTLP, or to Jaeger or Zipkin).
- The collector assembles the trace tree and surfaces flame graphs, waterfall views, and latency heatmaps.
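The assembly step can be sketched in a few lines. The example below is a simplified model, not how any specific collector is implemented: the `Span` fields, the sample span names, and the `build_tree` and `render` helpers are all assumptions made for illustration. It takes spans that arrive out of order and rebuilds the tree from their parent references.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str
    span_id: str
    parent_id: Optional[str]  # None marks the root span
    name: str
    start_ms: float
    duration_ms: float


def build_tree(spans):
    """Return (root_span, children_by_parent_id) for the spans of one trace."""
    children = defaultdict(list)
    root = None
    for span in spans:
        if span.parent_id is None:
            root = span
        else:
            children[span.parent_id].append(span)
    # Order siblings by start time so the output reads like a waterfall.
    for siblings in children.values():
        siblings.sort(key=lambda s: s.start_ms)
    return root, children


def render(span, children, depth=0):
    """Return indented lines, one per span, depth-first from the given span."""
    lines = [f"{'  ' * depth}{span.name} ({span.duration_ms:.0f} ms)"]
    for child in children.get(span.span_id, []):
        lines.extend(render(child, children, depth + 1))
    return lines


# Hypothetical spans, deliberately out of order as they might arrive in batches.
spans = [
    Span("t1", "c", "a", "db.query", 15, 80),
    Span("t1", "a", None, "GET /checkout", 0, 120),
    Span("t1", "b", "a", "auth.verify", 2, 10),
]
root, children = build_tree(spans)
print("\n".join(render(root, children)))
```

A production collector also has to cope with late or missing spans and multiple traces interleaved in one batch, which this sketch leaves out.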
Key benefits
Latency attribution
Instantly see which service or database query consumed the most time.
Error root cause
Trace errors back to the exact span where the failure originated.
Dependency map
Auto-generated topology of every service-to-service call.
SLO monitoring
Alert on p95 latency thresholds derived from real trace data.
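As a rough illustration of the SLO check, the sketch below computes a p95 from root-span durations using the nearest-rank method and compares it against a threshold. The sample durations, the 300 ms threshold, and the function names are made up for the example; real backends often use histograms or other estimators instead of sorting raw samples.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def breaches_slo(durations_ms, threshold_ms, pct=95):
    """Return True when the chosen percentile exceeds the latency threshold."""
    return percentile(durations_ms, pct) > threshold_ms


# Hypothetical root-span durations in milliseconds for one time window.
durations = [120, 95, 110, 480, 105, 98, 102, 130, 115, 101,
             99, 108, 112, 97, 125, 118, 103, 109, 100, 900]

p95 = percentile(durations, 95)
if breaches_slo(durations, threshold_ms=300):
    print(f"p95 latency {p95} ms exceeds the 300 ms objective")
```

Most requests in this sample finish in about 100 ms, so an average would hide the two slow outliers that the p95 exposes.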
Ready to instrument your stack? Check the getting started guide for OpenTelemetry SDK setup.