Observability

What it is

Observability is the ability to understand what is happening inside a system from the data it produces. Unlike monitoring, which checks known conditions, observability lets you investigate problems you did not anticipate.

The three pillars

Logs

Textual event records:

Structured logging (for example, JSON) so logs can be searched and filtered by field
Levels: DEBUG, INFO, WARN, ERROR
Correlation with trace IDs, so a log line can be tied to the request that produced it
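
The three points above can be sketched with Python's standard logging module. This is a minimal illustration, not a recommended production setup: JsonFormatter, build_logger, the "checkout" logger name, and the trace_id field name are all assumptions made for the example, and the trace ID is a random stand-in for one that a tracer would normally supply.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (hypothetical helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # Python names the WARN level "WARNING".
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # trace_id arrives through the `extra` argument; None if absent.
            "trace_id": getattr(record, "trace_id", None),
        }
        return json.dumps(payload)


def build_logger(name: str = "checkout") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger()
    trace_id = uuid.uuid4().hex  # stand-in for an ID issued by a tracer
    log.info("order received", extra={"trace_id": trace_id})
    log.warning("payment retry", extra={"trace_id": trace_id})
```

Because every line carries the same trace_id field, a log backend can return all events for one request with a single field query instead of a free-text search.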

Metrics

Numerical measurements aggregated over time:

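Counters: values that only increase (for example, total requests served)
Gauges: values that go up and down (for example, memory in use)
Histograms: distribution of observed values across buckets (for example, request latency)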
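
To make the difference between the three metric types concrete, here is a small in-memory sketch. The Counter, Gauge, and Histogram classes are hypothetical teaching aids written for this example, not the API of Prometheus or any other metrics library, and the bucket bounds are arbitrary.

```python
import bisect
from dataclasses import dataclass, field
from typing import List


@dataclass
class Counter:
    """Monotonic value: it can only increase."""
    value: float = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError("counters cannot decrease")
        self.value += amount


@dataclass
class Gauge:
    """Point-in-time value that can move in either direction."""
    value: float = 0.0

    def set(self, value: float) -> None:
        self.value = value

    def inc(self, amount: float = 1.0) -> None:
        self.value += amount

    def dec(self, amount: float = 1.0) -> None:
        self.value -= amount


@dataclass
class Histogram:
    """Counts observations per bucket; `bounds` are sorted upper limits."""
    bounds: List[float]
    counts: List[int] = field(default_factory=list)
    total: float = 0.0

    def __post_init__(self) -> None:
        # One bucket per bound, plus a final overflow bucket.
        self.counts = [0] * (len(self.bounds) + 1)

    def observe(self, value: float) -> None:
        # bisect_left returns the first bound >= value, so a value equal
        # to a bound is counted in that bound's bucket.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.total += value


if __name__ == "__main__":
    latency = Histogram(bounds=[0.1, 0.5, 1.0])
    for seconds in (0.05, 0.5, 0.7, 2.0):
        latency.observe(seconds)
    print(latency.counts)
```

The histogram keeps only per-bucket counts and a running sum, not the individual observations, which is why metrics stay cheap to store even at high request volumes.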

Traces

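Records of a request's path through distributed services:

Span: a single unit of work, with a start time and a duration
Trace: the set of related spans that belong to one request
Context propagation: passing the trace ID (and the parent span ID) between services so their spans can be linked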
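
Context propagation can be illustrated with a short sketch. It uses a simplified form of the W3C Trace Context traceparent header (version, trace ID, span ID, flags); SpanContext, start_span, inject, and extract are hypothetical names chosen for this example, and a real tracer such as an OpenTelemetry SDK would handle these steps, plus validation and sampling, for you.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class SpanContext:
    trace_id: str                    # 32 hex chars, shared by the whole trace
    span_id: str                     # 16 hex chars, unique per span
    parent_id: Optional[str] = None  # span_id of the caller, if any


def start_span(parent: Optional[SpanContext] = None) -> SpanContext:
    """Start a root span, or a child span that inherits the parent's trace ID."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)
    parent_id = parent.span_id if parent else None
    return SpanContext(trace_id, secrets.token_hex(8), parent_id)


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Write the context into outgoing request headers (flags fixed to 01)."""
    headers["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-01"


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Read the caller's context from incoming headers; None if missing or malformed."""
    value = headers.get("traceparent")
    if value is None:
        return None
    parts = value.split("-")
    if len(parts) != 4:
        return None
    _version, trace_id, span_id, _flags = parts
    if len(trace_id) != 32 or len(span_id) != 16:
        return None
    return SpanContext(trace_id, span_id)


if __name__ == "__main__":
    # Service A: start a trace and call service B.
    root = start_span()
    outgoing: Dict[str, str] = {}
    inject(root, outgoing)

    # Service B: continue the same trace from the incoming headers.
    child = start_span(extract(outgoing))
    print(child.trace_id == root.trace_id, child.parent_id == root.span_id)
```

Both spans share one trace ID, and the child records the caller's span ID as its parent, which is what lets a tracing backend reassemble the spans into a single request tree.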

OpenTelemetry

OpenTelemetry is a CNCF project that standardizes the instrumentation of logs, metrics, and traces, with APIs and SDKs for most major languages.

Tools

Tool	Type
Grafana	Dashboards
Prometheus	Metrics
Jaeger/Tempo	Traces
Loki	Logs
Datadog	All-in-one
AWS CloudWatch	AWS native

Why it matters

Observability lets you understand a system's behavior in production without having to predict in advance which questions you will need to answer. When an unfamiliar failure appears, the logs, metrics, and traces already being collected can be queried in new ways, without shipping new instrumentation first.

References

OpenTelemetry — Observability standard.
Observability Engineering — Charity Majors et al.
OpenTelemetry Documentation — OpenTelemetry, 2024. Complete standard documentation.

