All guides

Logs · ~40 min

Build incident timelines from logs, metrics, and traces without making up the missing parts

Created: May 2, 2026 · Published: May 2, 2026

Learn how to reconstruct a real incident from Prometheus, Loki, and distributed traces without getting lost in noise, clock skew, or confident storytelling.

Linux, Docker
Intermediate

Metrics · ~40 min

Reduce Prometheus cardinality spikes without blinding your alerts

Created: April 30, 2026 · Published: April 30, 2026

A practical guide to spotting when Prometheus is swelling because of unstable labels, cutting cardinality in the right layer, and validating that your alerts still cover the real incident.

Linux, Docker
Advanced

Metrics · ~40 min

Understand Kubernetes memory metrics without firing false OOM alerts

Created: April 26, 2026 · Published: April 26, 2026

A practical guide to diagnosing Kubernetes container memory with Prometheus and Grafana without confusing usage, working set, RSS, or reclaimable page cache.

Linux, Docker
Intermediate

Metrics · ~35 min

Build useful golden signals for Kubernetes APIs without triggering Prometheus cardinality traps

Created: April 26, 2026 · Published: April 26, 2026

A practical guide to traffic, latency, errors, and saturation for Kubernetes APIs without filling Prometheus with useless series or breaking alerts.

Linux, Docker
Intermediate

Reliability · ~36 min

Design burn rate alerts that do not wake people up for sport

Created: April 22, 2026 · Published: April 22, 2026

Recent Prometheus, OpenTelemetry Collector, Loki, and Alloy releases all point to the same uncomfortable truth: alerts wired straight to raw metrics and fragile labels become noisy or broken very easily. This guide shows how to anchor burn-rate alerts on stable recording rules, validate them with promtool, and roll them out without turning every short spike into a fake emergency.

Linux, Docker
Intermediate

Metrics · ~38 min

Clean up kube-state-metrics noise so your dashboards mean something again

Created: April 20, 2026 · Published: April 20, 2026

kube-state-metrics is still valuable, but in 2026 it exposes more surface area, more stable metrics, and recent defaults such as EndpointSlices. If your dashboards filled up with irrelevant series, fragile joins, or duplicated states, this guide shows how to reduce noise at the source, fix your queries, and validate that the cleanup does not break alerts or troubleshooting.

Linux, Docker
Intermediate

Logs · ~42 min

Build incident timelines with logs, metrics, and traces without making up the story

Created: April 19, 2026 · Published: April 19, 2026

Recent ecosystem signals point in the same direction: more telemetry does not automatically produce better diagnosis. This guide turns that into a repeatable method for building trustworthy timelines with Prometheus, Loki, and OpenTelemetry, while avoiding common current traps such as misleading memory metrics, noisy labels, and traces that lack useful request context.

Linux, Docker
Intermediate

Metrics · ~45 min

Practical golden signals for APIs on Kubernetes without inflating the stack

Created: April 15, 2026 · Published: April 15, 2026

A technical, actionable guide to mapping golden signals to Prometheus metrics and PromQL, building Grafana panels, creating alerts, and following a reproducible troubleshooting flow for Kubernetes APIs without adding unnecessary agents.

Linux, Docker
Intermediate

Metrics · ~60 min

Reducing Prometheus cardinality spikes without breaking alerts

Created: April 11, 2026 · Published: April 11, 2026

A hands-on guide to detecting high-cardinality sources, applying safe relabeling and rollups, and confirming that critical alerts remain effective.

Docker, Linux
Advanced

Metrics · ~32 min

Metric downsampling with VictoriaMetrics in the free version

Created: April 10, 2026 · Published: April 10, 2026

VictoriaMetrics Enterprise provides native downsampling in cluster deployments. With the free version, you can approximate it using separate clusters, fan-out, and `-dedup.minScrapeInterval`.

Advanced

Metrics · ~16 min

Prometheus for system metrics (cross-platform)

Created: April 5, 2026 · Published: April 5, 2026

Deploy Prometheus and validate operational metrics with a reproducible workflow, independent of your operating system.

Docker
Beginner