Jonatan Mata (jonmatum.com)
© 2026 Jonatan Mata. All rights reserved. v2.1.1

Distributed Tracing

Observability technique tracking requests across multiple services in distributed systems, enabling bottleneck identification and failure diagnosis.

seed #tracing #distributed #opentelemetry #jaeger #spans #observability

What it is

Distributed tracing follows a request from its origin to its final response through every service it touches. Each service records one or more "spans" carrying timing and metadata, and all spans that share a trace ID are grouped into a "trace" that shows the complete flow.
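
The span and trace relationship can be sketched as a small data model. This is an illustrative sketch only: the field names (trace_id, span_id, parent_id) and the sample values are assumptions made for this example, not the schema of any particular tracing backend.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    trace_id: str               # shared by every span of the same request
    span_id: str                # unique per operation
    parent_id: Optional[str]    # None marks the root span
    name: str
    start_ms: float
    end_ms: float
    attributes: dict = field(default_factory=dict)

    @property
    def duration_ms(self) -> float:
        return self.end_ms - self.start_ms


def group_into_traces(spans):
    """Group spans by trace_id; each trace is sorted by start time."""
    traces = defaultdict(list)
    for span in spans:
        traces[span.trace_id].append(span)
    return {tid: sorted(group, key=lambda s: s.start_ms)
            for tid, group in traces.items()}


# Hypothetical spans as they might arrive out of order from different services.
spans = [
    Span("t1", "b", "a", "auth-service: verify", 5, 20),
    Span("t1", "a", None, "api-gateway: GET /products", 0, 120),
    Span("t2", "x", None, "api-gateway: GET /health", 3, 4),
]
traces = group_into_traces(spans)
for trace_id, group in traces.items():
    print(trace_id, [(s.name, s.duration_ms) for s in group])
```

Grouping by trace ID is what lets a backend reassemble one request from spans reported independently by each service.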

Concepts

Concept | Description | Example
Trace | The complete journey of a request | HTTP request from client to response
Span | An operation within the trace (start, end, metadata) | Database call, service invocation
Context propagation | Passing trace ID between services | traceparent header (W3C Trace Context)
Sampling | Not tracing 100% of requests to reduce cost | Head-based (1%), tail-based (errors only)
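
Context propagation and sampling can be illustrated with the W3C traceparent header, which has the form version-traceid-parentid-flags (for example 00-<32 hex>-<16 hex>-<2 hex>), where the lowest flag bit marks the trace as sampled. The sketch below uses only the standard library. The function names are hypothetical, and the ID-based sampling rule is one simple deterministic scheme chosen for illustration, not the algorithm of any specific SDK.

```python
import re
import secrets

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Return (trace_id, parent_span_id, sampled), or None if malformed."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None  # all-zero IDs are invalid
    return trace_id, parent_id, bool(int(flags, 16) & 0x01)


def head_sample(trace_id, ratio):
    """Head-based decision derived from the trace ID, so any service
    evaluating it for the same trace reaches the same answer."""
    return int(trace_id[-16:], 16) < ratio * 2**64


def start_trace(ratio=0.01):
    """Header for a brand-new trace, sampled at roughly `ratio`."""
    trace_id = secrets.token_hex(16)
    flags = "01" if head_sample(trace_id, ratio) else "00"
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(incoming):
    """Header a service would send downstream: same trace ID and
    sampling flag, fresh span ID for the current operation."""
    parsed = parse_traceparent(incoming)
    if parsed is None:
        return start_trace()
    trace_id, _parent, sampled = parsed
    return f"00-{trace_id}-{secrets.token_hex(8)}-{'01' if sampled else '00'}"


def tail_keep(trace_spans):
    """Tail-based decision: taken after the trace completes,
    here keeping only traces that contain an error."""
    return any(span.get("error") for span in trace_spans)


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(incoming))
print(child_traceparent(incoming))
```

The head-based decision is made once at the entry point and travels in the flags, while the tail-based decision needs the finished trace, which is why it can select on outcomes such as errors.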

Flow

Client → API Gateway (span 1)
          → Auth Service (span 2)
          → Product Service (span 3)
             → Database (span 4)
          → Response
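
The flow above can be simulated in a single process to show how nesting yields parent and child spans. This is a toy sketch, not a real tracer: the span helper and the sleep durations are invented for illustration, and in an actual distributed system the parent link would cross process boundaries through a propagated header rather than an in-memory stack.

```python
import time
from contextlib import contextmanager

finished = []   # completed spans, in the order they end
_stack = []     # currently open spans; the top is the parent of the next span


@contextmanager
def span(name):
    record = {
        "name": name,
        "parent": _stack[-1]["name"] if _stack else None,
        "depth": len(_stack),
        "start": time.perf_counter(),
    }
    _stack.append(record)
    try:
        yield record
    finally:
        record["end"] = time.perf_counter()
        _stack.pop()
        finished.append(record)


def handle_request():
    with span("API Gateway"):
        with span("Auth Service"):
            time.sleep(0.001)          # stand-in for token verification
        with span("Product Service"):
            with span("Database"):
                time.sleep(0.002)      # stand-in for a query


handle_request()
# Print the spans as an indented tree in start order.
for s in sorted(finished, key=lambda s: s["start"]):
    elapsed_ms = (s["end"] - s["start"]) * 1000
    print(f"{'  ' * s['depth']}{s['name']}: {elapsed_ms:.1f} ms")
```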

Tools

Tool | Type
Jaeger | Open-source (CNCF)
Grafana Tempo | Open-source, Grafana integrated
AWS X-Ray | Managed (AWS)
Datadog APM | SaaS
OpenTelemetry | Instrumentation standard

Why it matters

In distributed systems, a request traverses multiple services. Without distributed tracing, diagnosing latency or errors is like finding a needle in a haystack. Traces connect the dots between services and reveal where time is being lost.
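
One way to see where time is being lost is to compute each span's self time: its duration minus the time spent in its direct children. The sketch below assumes children run sequentially (overlapping children would be double-counted), and the span IDs and millisecond timings are hypothetical values for a gateway, auth, product, and database flow.

```python
def self_times(spans):
    """Map span id -> time spent in the span itself, excluding direct children."""
    result = {s["id"]: s["end"] - s["start"] for s in spans}
    for s in spans:
        if s["parent"] is not None:
            result[s["parent"]] -= s["end"] - s["start"]
    return result


# Hypothetical timings in milliseconds.
trace = [
    {"id": "gateway", "parent": None, "start": 0, "end": 100},
    {"id": "auth", "parent": "gateway", "start": 5, "end": 20},
    {"id": "product", "parent": "gateway", "start": 25, "end": 95},
    {"id": "database", "parent": "product", "start": 30, "end": 90},
]
times = self_times(trace)
bottleneck = max(times, key=times.get)
print(times)       # {'gateway': 15, 'auth': 15, 'product': 10, 'database': 60}
print(bottleneck)  # database
```

The gateway span is the longest at 100 ms, yet most of that time belongs to the database call nested two levels down, which is the kind of attribution that total durations alone do not show.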

References

  • OpenTelemetry Tracing — Official documentation.
  • Jaeger — CNCF, 2024. Open source distributed tracing system.
  • Zipkin — OpenZipkin, 2024. Distributed tracing system.

Related content

  • Observability

    Ability to understand a system's internal state from its external outputs: logs, metrics, and traces, enabling problem diagnosis without direct system access.

  • Microservices

    Architectural style structuring an application as a collection of small, independent, deployable services, each with its own business logic and data.

  • AI Observability

    Practices and tools for monitoring, tracing, and debugging AI systems in production, covering token metrics, latency, response quality, costs, and hallucination detection.

  • Saga Pattern

    Pattern for managing distributed transactions in microservices through a sequence of local transactions with compensating actions to handle failures.

  • Logging Strategies

    Practices for implementing effective logging in distributed systems: structured logging, levels, correlation, and centralized aggregation.
