Monitoring Docker

Concepts

A short orientation to how the Monitoring Docker solution actually works. Read this once and the rest of the docs will make sense without piecing things together from installation, configuration, and labels all at once.

Two pieces, one product

The product ships in two halves that share a version number:

  • Collectord — the agent. Runs as a single container on each Docker host, watching the Docker daemon for new containers and forwarding their logs and metrics to Splunk over HEC.
  • Monitoring Docker — the Splunk app. Dashboards, alerts, search macros, field extractions. Reads what Collectord forwarded; doesn’t talk to the Docker daemon directly.

When you upgrade, you update both to the same version. The release notes cover Splunk-app changes first, then Collectord updates.

What Collectord forwards

Collectord covers a broad range of data sources, configured via [input.*] sections. The categories you’ll see in Splunk:

  • Logs — container stdout / stderr, application logs from files inside the container or on a mounted volume, host logs (syslog, journald).
  • Metrics — host, container, and process stats from cgroups and /proc; network and socket-table metrics; mount and disk metrics; Prometheus metrics scraped from your containers; and Collectord’s own internal metrics.
  • Events — Docker daemon events.
  • Objects — container and image specs polled from the Docker API on a schedule (opt-in — see Object polling).

Each input has its own sourcetype (docker_logs, docker_stats, docker_proc_stats, docker_net_stats, docker_events, docker_objects, docker_prometheus, …) and a default index. The full list of inputs and types lives in the Configuration reference.

Inputs, outputs, and pipes

Internally, Collectord is a pipeline of three concepts:

  • An input discovers data — [input.files] walks /var/log, [input.docker_events] watches the daemon, [input.prometheus_auto] scrapes containers labeled with a metrics endpoint.
  • A pipe transforms events on the way through — replace, hash, extract, override, sample, throttle.
  • An output ships the result somewhere — [output.splunk] is the default; devnull exists for dropping data; you can define multiple Splunk outputs and route different inputs to each.
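The three concepts map onto `[section]`-style config. A hedged sketch of what a mounted override file might look like — the section names (`[input.files]`, `[output.splunk]`) come from this page, but every key=value shown is illustrative only; consult the Configuration reference for the real key names:

```shell
# Write a hypothetical override file. Section names are documented;
# the keys and values below are ILLUSTRATIVE ONLY, not real key names.
cat > 002-pipeline.conf <<'EOF'
[input.files]
# an input discovers data (illustrative key)
path = /var/log/myapp/*.log

# pipes (replace, hash, extract, override, sample, throttle) are
# configured between input and output; keys omitted here

[output.splunk]
# an output ships the result (illustrative key)
url = https://splunk.example.com:8088/services/collector
EOF
```

In a real deployment this file would be mounted into /config/ on the Collectord container, as described under "Where configuration comes from" below.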

Most container annotation boils down to "configure pipes and the destination output, scoped to one container."

Where configuration comes from

The same setting can be specified in several places. The general layering, highest priority first:

  1. Container labels — most specific. Both collectord.io/{label}={value} and io.collectord.{label}={value} forms are accepted.
  2. Environment variables on the Collectord container: --env COLLECTOR__X="<section>__<key>=<value>" overrides a single config value without rebuilding the image.
  3. Mounted config files — additional *.conf files mounted into /config/ on the Collectord container. Collectord reads everything matching /config/*.conf in alphabetical order, with later files overriding earlier ones.
  4. Default 001-general.conf — the configuration shipped inside the Collectord image at /config/001-general.conf. The lowest-priority layer.
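Layers 2 and 3 in miniature. The file names and the alphabetical-override behavior follow the list above; the section, key, and environment-variable names here are placeholders, not real Collectord keys:

```shell
# Two mounted config files: /config/*.conf is read in alphabetical
# order, so the later file (050-site.conf) overrides 001-general.conf.
# Section and key names are illustrative.
mkdir -p config
printf '[output.splunk]\nurl = https://a.example.com\n' > config/001-general.conf
printf '[output.splunk]\nurl = https://b.example.com\n' > config/050-site.conf
ls config/*.conf | sort    # 001-general.conf first, 050-site.conf wins

# The same single value overridden one layer higher, via an environment
# variable on the Collectord container (pattern from the docs; the
# variable suffix X is arbitrary):
echo '--env COLLECTOR__X="output.splunk__url=https://c.example.com"'
```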

When a label isn’t behaving the way you expect, run collectord describe from inside the Collectord container — see Troubleshooting → Describe.

Auto-discovery vs explicit opt-in

Some things Collectord finds on its own; others require you to point at them:

Picked up automatically:

  • Container stdout / stderr
  • Host, container, and process metrics
  • Host logs (syslog, journald)
  • Docker daemon events

Requires a label or config:

  • Application log files inside containers
  • Files on a host mount or shared volume
  • Container and image specs polled from the API (Object polling)
  • Prometheus endpoints exposed by your containers
  • Custom field extractions on container logs

If something you expect isn’t in Splunk, the first question is usually: was this auto-discovered, or do I need to point Collectord at it?
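Opting something in usually means one label on the workload container. A sketch as a Compose fragment — the collectord.io/{label} prefix is from this page, but the label name used here ("logs-index") is an assumption for illustration; check the annotations reference for real label names:

```shell
# Write a compose fragment carrying a Collectord annotation.
# The "collectord.io/" prefix is documented; the label NAME below
# is a guess used only for illustration.
cat > docker-compose.yml <<'EOF'
services:
  myapp:
    image: myapp:latest
    labels:
      collectord.io/logs-index: "myapp_logs"
EOF
```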

How data flows

Your container ─┐
                ├─► [input] ──► [pipe] ──► [output] ──► Splunk HEC ──► Splunk indexer
Your label    ──┘                                                          │

Splunk role  ◄── [Monitoring Docker app: dashboards, alerts, macros] ◄─────┘

Collectord runs on each Docker host, reads from local sources (Docker socket, filesystem, cgroups, /proc), and pushes events to your Splunk HTTP Event Collector. The Splunk app on the search head reads them back through macros scoped to the right indexes.
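In miniature, the output stage wraps each event in Splunk's standard HEC JSON envelope. The event, sourcetype, and index fields are standard HEC fields; the values here are placeholders:

```shell
# A standard Splunk HEC event envelope (field names are HEC's; values
# are placeholders). Collectord POSTs batches of these to
# https://<splunk>:8088/services/collector with an
# "Authorization: Splunk <token>" header.
payload='{"event":"hello from a docker host","sourcetype":"docker_logs","index":"docker"}'
echo "$payload"
```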

Where to go next