Distributed Tracing

What it is

Distributed tracing follows a single request from the point where it enters the system to the point where the response is returned, across every service it touches. Each service records one or more "spans" with timing and metadata, and all spans that share a trace ID are grouped into a "trace" that shows the complete flow.

Concepts

Concept	Description	Example
Trace	The complete journey of a request	HTTP request from client to response
Span	An operation within the trace (start, end, metadata)	Database call, service invocation
Context propagation	Passing the trace ID and parent span ID between services	`traceparent` header (W3C Trace Context)
Sampling	Recording only a fraction of requests to reduce cost	Head-based (decided up front, e.g. 1%), tail-based (decided after the trace completes, e.g. errors only)
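
To make context propagation and head-based sampling from the table concrete, here is a minimal sketch using only the Python standard library. It builds and parses a `traceparent` header in the W3C Trace Context layout (`version-traceid-parentid-flags`). The function names (`head_sample`, `new_traceparent`, `parse_traceparent`, `child_traceparent`) are hypothetical, only version `00` is accepted, and the sampling rule (comparing the low 64 bits of the trace ID against the rate) is one possible scheme assumed for illustration, not something the format prescribes.

```python
import re
import secrets
from typing import Optional, Tuple

# Layout assumed here: version "00", 32 hex chars of trace ID,
# 16 hex chars of parent span ID, 2 hex chars of flags.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")
SAMPLED_FLAG = 0x01  # lowest bit of the flags byte marks the trace as sampled


def head_sample(trace_id: str, sample_rate: float) -> bool:
    """Hypothetical head-based rule: derive the decision from the trace ID,
    so every service that sees the same ID would reach the same answer."""
    low_64_bits = int(trace_id[16:], 16)
    return low_64_bits < sample_rate * 2**64


def new_traceparent(sample_rate: float) -> str:
    """Start a new trace at the entry point and decide sampling once."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    flags = SAMPLED_FLAG if head_sample(trace_id, sample_rate) else 0
    return f"00-{trace_id}-{span_id}-{flags:02x}"


def parse_traceparent(header: str) -> Optional[Tuple[str, str, bool]]:
    """Return (trace_id, parent_span_id, sampled), or None if invalid."""
    match = TRACEPARENT_RE.match(header.strip().lower())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are treated as invalid.
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return trace_id, parent_id, bool(int(flags, 16) & SAMPLED_FLAG)


def child_traceparent(incoming: str) -> Optional[str]:
    """Header for an outgoing call: same trace ID and sampling decision,
    but a fresh span ID identifying the calling service's span."""
    parsed = parse_traceparent(incoming)
    if parsed is None:
        return None
    trace_id, _parent_id, sampled = parsed
    flags = SAMPLED_FLAG if sampled else 0
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags:02x}"
```

In this sketch the entry point would call `new_traceparent(0.01)` to sample roughly 1% of requests, and each downstream service would call `child_traceparent` on the header it received before making its own outgoing calls, so the trace ID and the sampling decision stay the same for the whole request.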

Flow

Client → API Gateway (span 1)
          → Auth Service (span 2)
          → Product Service (span 3)
             → Database (span 4)
          → Response
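
The flow above can be modelled as a list of spans linked by parent IDs. The following sketch rebuilds the tree from those links and computes each span's self time (its duration minus the time spent in its direct children). The `Span` class, the helper names, and all timings are invented for illustration, and the self-time calculation assumes that sibling spans do not overlap.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]  # None marks the root span of the trace
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def render(spans: List[Span]) -> List[str]:
    """Return the trace as indented lines, children under their parent."""
    children: Dict[Optional[str], List[Span]] = {}
    for span in spans:
        children.setdefault(span.parent_id, []).append(span)

    lines: List[str] = []

    def walk(parent_id: Optional[str], depth: int) -> None:
        for span in sorted(children.get(parent_id, []), key=lambda s: s.start_ms):
            lines.append(f"{'  ' * depth}{span.name} ({span.duration_ms} ms)")
            walk(span.span_id, depth + 1)

    walk(None, 0)
    return lines


def self_times(spans: List[Span]) -> Dict[str, int]:
    """Duration of each span minus the total duration of its direct children.
    Assumes sibling spans run one after another (no overlap)."""
    child_total: Dict[str, int] = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = child_total.get(span.parent_id, 0) + span.duration_ms
    return {s.name: s.duration_ms - child_total.get(s.span_id, 0) for s in spans}


# Illustrative timings only; they are not measurements.
TRACE = [
    Span("1", None, "API Gateway", 0, 120),
    Span("2", "1", "Auth Service", 5, 25),
    Span("3", "1", "Product Service", 30, 115),
    Span("4", "3", "Database", 40, 110),
]

if __name__ == "__main__":
    print("\n".join(render(TRACE)))
    times = self_times(TRACE)
    print("Most self time:", max(times, key=times.get))
```

With these made-up numbers, the gateway span lasts 120 ms but only 15 ms of that is its own work; the database span accounts for 70 ms, which is the kind of attribution a trace viewer makes visible.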

Tools

Tool	Type
Jaeger	Open-source (CNCF)
Grafana Tempo	Open-source, Grafana integrated
AWS X-Ray	Managed AWS
Datadog APM	SaaS
OpenTelemetry	Instrumentation standard

Why it matters

In distributed systems, a single request traverses multiple services. Without distributed tracing, diagnosing latency or errors means correlating logs and metrics from each service by hand. Traces link the work done in each service to the request that caused it and show which span accounts for the lost time.

References

OpenTelemetry Tracing — Official documentation.
Jaeger — CNCF, 2024. Open source distributed tracing system.
Zipkin — OpenZipkin, 2024. Distributed tracing system.
