Monitoring Docker

Container annotations

Container labels are how you tell Collectord to handle one container differently from the rest — change its target index, point at log files inside the container, mask sensitive values, extract fields, fix multi-line stack traces, or scrape Prometheus endpoints — without touching the global configuration. The full list of every label lives in the Container labels reference.

Collectord accepts labels in two equivalent forms: collectord.io/{annotation}={value} and io.collectord.{annotation}={value}.
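
For example, both forms below set the same logs index (the nginx image is just a stand-in):

bash
# the two label forms are equivalent:
docker run --rm --label 'collectord.io/logs-index=project1_logs' nginx
docker run --rm --label 'io.collectord.logs-index=project1_logs' nginx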

Starting from version 5.9 you can scope annotations to a specific Collectord instance with [general]/annotationsSubdomain — useful when you run more than one Collectord on the same host. Annotations under collectord.collectord.io/{annotation} apply to every Collectord instance regardless of subdomain.
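
As a sketch, assuming one instance is configured with [general]/annotationsSubdomain = prod and following the subdomain pattern above (the subdomain value here is illustrative):

bash
# picked up only by the Collectord instance whose annotationsSubdomain is "prod"
docker run --rm \
       --label 'prod.collectord.io/logs-index=project1_logs' \
       nginx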

Overriding indexes

Use these labels when you want a container’s data to land in a different Splunk index than the host default — for example, giving a project its own index for chargeback or access control. The catch-all is collectord.io/index. For finer control, target a specific datatype: collectord.io/logs-index for container logs, collectord.io/stats-index for container metrics, and collectord.io/procstats-index for process stats.

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-index=project1_logs' \
       --label 'collectord.io/stats-index=project1_stats' \
       --label 'collectord.io/procstats-index=project1_stats' \
       --label 'collectord.io/netstats-index=project1_stats' \
       --label 'collectord.io/nettable-index=project1_stats' \
       nginx

source, type, and host follow the same pattern — collectord.io/source, collectord.io/logs-source, and so on.
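
For example, following that pattern to override the source and type of a container's logs (the values here are illustrative):

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-source=/docker/project1/web' \
       --label 'collectord.io/logs-type=nginx:access' \
       nginx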

Overriding index, source and type for specific events

Available since Collectord version 5.2

When a single container produces multiple kinds of log lines — say, an nginx container writing both access logs and error logs to the same stream — you can split them at ingest time using override pipes. Each override pipe matches a regex and rewrites source, type, or index only for matching events.

For an nginx container writing:

text
172.17.0.1 - - [12/Oct/2018:22:38:05 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
2018/10/12 22:38:15 [error] 8#8: *2 open() "/usr/share/nginx/html/a.txt" failed (2: No such file or directory), client: 172.17.0.1, server: localhost, request: "GET /a.txt HTTP/1.1", host: "localhost:32768"
172.17.0.1 - - [12/Oct/2018:22:38:15 +0000] "GET /a.txt HTTP/1.1" 404 153 "-" "curl/7.54.0" "-"

To send only the access-log lines (those starting with an IPv4 address) to a custom source:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-override.1-match=^(\d{1,3}\.){3}\d{1,3}' \
       --label 'collectord.io/logs-override.1-source=/docker/nginx/web-log' \
       nginx

The error-log line keeps the default container-log source; everything matching the IP regex gets the new one:

text
source                 | event
------------------------------------------------------------------------------------------------------------------------
/docker/nginx/web-log  | 172.17.0.1 - - [12/Oct/2018:22:38:05 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
/docker/550...stderr   | 2018/10/12 22:38:15 [error] 8#8: *2 open() "/usr/share/nginx/html/a.txt" failed (2: No such file or directory), client: 172.17.0.1, server: localhost, request: "GET /a.txt HTTP/1.1", host: "localhost:32768"
/docker/nginx/web-log  | 172.17.0.1 - - [12/Oct/2018:22:38:15 +0000] "GET /a.txt HTTP/1.1" 404 153 "-" "curl/7.54.0" "-"

Replace patterns in events

Replace pipes let you rewrite parts of a log line before it reaches Splunk — useful for masking sensitive data (PII, tokens, IPs) or stripping noise. Each pipe is a pair of labels grouped by number: collectord.io/logs-replace.{N}-search is the regex, collectord.io/logs-replace.{N}-val is the replacement. Pipes apply in numeric order (replace.1 before replace.2), so you can chain them. Use $1 or ${name} in the replacement to reference capture groups.

Collectord uses Go’s regexp library — see Package regexp and re2 syntax. regex101.com is great for testing (set the Flavor to golang).

Throughout the examples below we use nginx access logs:

text
172.17.0.1 - - [31/Aug/2018:21:11:26 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
172.17.0.1 - - [31/Aug/2018:21:11:32 +0000] "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
172.17.0.1 - - [31/Aug/2018:21:11:35 +0000] "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

Example 1. Replacing IPv4 addresses with X.X.X.X

To fully mask client IPs:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-replace.1-search=(\d{1,3}\.){3}\d{1,3}' \
       --label 'collectord.io/logs-replace.1-val=X.X.X.X' \
       nginx

Splunk receives:

text
X.X.X.X - - [31/Aug/2018:21:11:26 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
X.X.X.X - - [31/Aug/2018:21:11:32 +0000] "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
X.X.X.X - - [31/Aug/2018:21:11:35 +0000] "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

If you need to preserve the first octet — common for partial geolocation while still anonymizing — capture it with a named group:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-replace.1-search=(?P<IPv4p1>\d{1,3})(\.\d{1,3}){3}' \
       --label 'collectord.io/logs-replace.1-val=${IPv4p1}.X.X.X' \
       nginx

Result:

text
172.X.X.X - - [31/Aug/2018:21:11:26 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
172.X.X.X - - [31/Aug/2018:21:11:32 +0000] "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
172.X.X.X - - [31/Aug/2018:21:11:35 +0000] "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

Example 2. Dropping messages

Replacing a whole line with the empty string drops the event entirely. Below, we drop noisy successful GET requests, and then mask IPs on whatever’s left:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-replace.1-search=^.+\"GET [^\s]+ HTTP/[^"]+" 200 .+$' \
       --label 'collectord.io/logs-replace.1-val=' \
       --label 'collectord.io/logs-replace.2-search=(\d{1,3}\.){3}\d{1,3}' \
       --label 'collectord.io/logs-replace.2-val=X.X.X.X' \
       nginx

Pipes apply in numeric order — replace.1 drops the success line first, then replace.2 masks IPs on the remaining errors:

text
X.X.X.X - - [31/Aug/2018:21:11:32 +0000] "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
X.X.X.X - - [31/Aug/2018:21:11:35 +0000] "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

Example 3. Whitelisting the messages

When the logs you care about are a small subset of total volume, it’s easier to whitelist than blacklist. With collectord.io/logs-whitelist, only lines matching the regex are forwarded — everything else is dropped:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-whitelist=((DELETE)|(POST))' \
       nginx

Hashing values in logs

Available since Collectord version 5.3

When you need to correlate events by a sensitive field but can’t store the raw value, hash it instead of replacing it. Hashed values are still consistent across events — searching for the hash of a known IP will find every line containing that IP, but the IP itself never reaches Splunk.

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-hashing.1-match=(\d{1,3}\.){3}\d{1,3}' \
       --label 'collectord.io/logs-hashing.1-function=fnv-1a-64' \
       nginx

A line that originally read:

text
172.17.0.1 - - [16/Nov/2018:11:17:17 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"

becomes, with fnv-1a-64:

text
gqsxydjtZL4 - - [16/Nov/2018:11:17:17 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"

Collectord supports both fast non-cryptographic hashes (FNV, CRC, Adler) and cryptographic ones (MD5, SHA family). Pick the cheapest one that meets your security requirements — non-cryptographic hashes are fine for correlation but should not be relied on for security. Benchmarks below are in nanoseconds per operation, hashing the two IP addresses in the string source: 127.0.0.1, destination: 10.10.1.99:

text
| Function          | ns / op |
-------------------------------
| adler-32          |    1713 |
| crc-32-ieee       |    1807 |
| crc-32-castagnoli |    1758 |
| crc-32-koopman    |    1753 |
| crc-64-iso        |    1739 |
| crc-64-ecma       |    1740 |
| fnv-1-64          |    1711 |
| fnv-1a-64         |    1711 |
| fnv-1-32          |    1744 |
| fnv-1a-32         |    1738 |
| fnv-1-128         |    1852 |
| fnv-1a-128        |    1836 |
| md5               |    2032 |
| sha1              |    2037 |
| sha256            |    2220 |
| sha384            |    2432 |
| sha512            |    2516 |

Escaping terminal sequences, including terminal colors

Containers attached to a TTY often emit ANSI color codes that look like garbage in Splunk:

bash
docker run -it ubuntu ls --color=auto /
bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
boot  etc  lib   media  opt  root  sbin  sys  usr

Without intervention, Splunk shows:

text
[01;34mboot[0m  [01;34metc[0m  [01;34mlib[0m   [01;34mmedia[0m  [01;34mopt[0m  [01;34mroot[0m  [01;34msbin[0m  [01;34msys[0m  [01;34musr[0m
[0m[01;34mbin[0m   [01;34mdev[0m  [01;34mhome[0m  [01;34mlib64[0m  [01;34mmnt[0m  [01;34mproc[0m  [01;34mrun[0m   [01;34msrv[0m  [30;42mtmp[0m  [01;34mvar[0m

Add collectord.io/logs-escapeterminalsequences=true and Collectord strips them before forwarding:

bash
docker run -it \
    --label 'collectord.io/logs-escapeterminalsequences=true' \
    ubuntu ls --color=auto /

Now Splunk shows clean output:

text
bin   dev  home  lib64  mnt  proc  run   srv  tmp  var
boot  etc  lib   media  opt  root  sbin  sys  usr

If most of your containers emit color codes, flip the global default — [input.files]/stripTerminalEscapeSequences controls whether Collectord strips them by default (defaults to false), and [input.files]/stripTerminalEscapeSequencesRegex controls which sequences match.
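
A minimal sketch of that global override in the Collectord configuration (the section and key names come from the paragraph above; the INI syntax follows the usual Collectord configuration format):

ini
[input.files]
# strip terminal escape sequences from all container logs by default
stripTerminalEscapeSequences = true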

Extracting fields from the container logs

Field extraction at ingest time pulls structured values out of unstructured log lines — the timestamp, an IP address, a request path — and indexes them so Splunk can search them as fields rather than scanning _raw. This makes searches dramatically faster on high-volume indexes.

We’ll keep using nginx access logs for the examples:

text
172.17.0.1 - - [31/Aug/2018:21:11:26 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
172.17.0.1 - - [31/Aug/2018:21:11:32 +0000] "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
172.17.0.1 - - [31/Aug/2018:21:11:35 +0000] "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

By default, the first unnamed capture group becomes the event message (_raw). Override that with collectord.io/logs-extractionMessageField (5.18+) to pick a different group as the message.
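
For example, on 5.18 or later you could name the message group explicitly (a sketch; the group name message is arbitrary):

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-extraction=^(?P<ip_address>[^\s]+) .* \[(?P<timestamp>[^\]]+)\] (?P<message>.+)$' \
       --label 'collectord.io/logs-extractionMessageField=message' \
       nginx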

Example 1. Extracting the timestamp

When the container’s own timestamp is more accurate than ingest time (clock skew, batched logs, replay), extract it and use it as _time. Specify the regex, the named group containing the timestamp, and the format.

Collectord uses Go’s time parser, which uses the reference date Mon Jan 2 15:04:05 MST 2006 to describe formats — see Go documentation.

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-extraction=^(.*\[(?P<timestamp>[^\]]+)\].+)$' \
       --label 'collectord.io/logs-timestampfield=timestamp' \
       --label 'collectord.io/logs-timestampformat=02/Jan/2006:15:04:05 -0700' \
       nginx

The event’s _time in Splunk now matches the timestamp inside the log line.

Available since Collectord version 5.24.440

For unix epoch timestamps, use the format @unixtimestamp.
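
A sketch for an application that prefixes every line with epoch seconds (the image name and extraction regex are hypothetical; only the @unixtimestamp format string comes from the documentation above):

bash
docker run --rm \
       --label 'collectord.io/logs-extraction=^(?P<timestamp>\d{10})(.+)$' \
       --label 'collectord.io/logs-timestampfield=timestamp' \
       --label 'collectord.io/logs-timestampformat=@unixtimestamp' \
       example/epoch-logger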

Example 2. Extracting the fields

Once you’ve moved the timestamp to _time, you usually don’t want it duplicated in _raw. Extract additional fields and let the rest fall into the message:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-extraction=^(?P<ip_address>[^\s]+) .* \[(?P<timestamp>[^\]]+)\] (.+)$' \
       --label 'collectord.io/logs-timestampfield=timestamp' \
       --label 'collectord.io/logs-timestampformat=02/Jan/2006:15:04:05 -0700' \
       nginx

Splunk now has ip_address as an indexed field, the parsed _time, and a tighter _raw:

text
ip_address | _time               | _raw
-----------|---------------------|-------------------------------------------------
172.17.0.1 | 2018-08-31 21:11:26 | "GET / HTTP/1.1" 200 612 "-" "curl/7.54.0" "-"
172.17.0.1 | 2018-08-31 21:11:32 | "POST / HTTP/1.1" 405 173 "-" "curl/7.54.0" "-"
172.17.0.1 | 2018-08-31 21:11:35 | "GET /404 HTTP/1.1" 404 612 "-" "curl/7.54.0" "-"

Defining Event pattern

collectord.io/logs-eventpattern controls how Collectord decides where one log event ends and the next begins. The default in the Collectord configuration is ^[^\s] — any line that doesn’t start with whitespace begins a new event. That handles most stack traces (where continuation lines are indented), but breaks for log formats where continuation lines start in column 0.

A common case is Java/Elasticsearch errors where the call stack header doesn’t begin with whitespace. Below, we deliberately misconfigure Elasticsearch (s-node instead of single-node) to get a multi-line stack trace:

bash
docker run --env "discovery.type=s-node" docker.elastic.co/elasticsearch/elasticsearch:6.4.0

The output looks like:

text
[2018-08-31T22:44:56,433][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/92] [Main.cc@109] controller (64 bit): Version 6.4.0 (Build cf8246175efff5) Copyright (c) 2018 Elasticsearch BV
[2018-08-31T22:44:56,886][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalArgumentException: Unknown discovery type [s-node]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:140) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:127) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.4.0.jar:6.4.0]
	at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:86) ~[elasticsearch-6.4.0.jar:6.4.0]
Caused by: java.lang.IllegalArgumentException: Unknown discovery type [s-node]
	at org.elasticsearch.discovery.DiscoveryModule.<init>(DiscoveryModule.java:129) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.node.Node.<init>(Node.java:477) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.node.Node.<init>(Node.java:256) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:326) ~[elasticsearch-6.4.0.jar:6.4.0]
	at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136) ~[elasticsearch-6.4.0.jar:6.4.0]
	... 6 more
[2018-08-31T22:44:56,892][INFO ][o.e.x.m.j.p.NativeController] Native controller process has stopped - no new native processes can be started

With the default pattern, the warning line [2018-08-31T22:44:56,886][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main] and its entire stack trace get split into separate events.

Tell Collectord that every event in this container starts with [:

bash
docker run --env "discovery.type=s-node" \
    --label 'collectord.io/logs-eventpattern=^\[' \
    docker.elastic.co/elasticsearch/elasticsearch:6.4.0

By default Collectord joins multi-line entries written within 100ms, waits up to 1s for the next line, and caps a single combined event at 100Kb. If you see entries still being split, tune [pipe.join] in the Collectord configuration.

Application Logs

Some applications can’t redirect everything to stdout/stderr — they write to files inside the container. Audit logs, slow-query logs, GC logs, and anything that needs to survive a process restart typically end up on disk. Collectord can pick these up directly with no sidecar by mounting a volume and adding a label that names it.

The example below uses a postgres container that writes its detailed logs to /var/log/postgresql. We mount a local volume named psql_logs there, and the label collectord.io/volume.1-logs-name=psql_logs tells Collectord to scan that volume for log files. By default it picks up files matching the global glob *.log* (override per volume with collectord.io/volume.{N}-logs-glob).

For a container with multiple log directories, group settings by number — collectord.io/volume.1-logs-name, collectord.io/volume.2-logs-name, and so on.
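
A sketch with two log directories, overriding the glob on the second (the image and paths are hypothetical; the labels follow the numbering scheme above):

bash
docker run -d \
    --volume app_logs:/var/log/app/ \
    --volume audit_logs:/var/log/audit/ \
    --label 'collectord.io/volume.1-logs-name=app_logs' \
    --label 'collectord.io/volume.2-logs-name=audit_logs' \
    --label 'collectord.io/volume.2-logs-glob=*.json*' \
    example/app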

Example 1. Forwarding application logs

bash
docker run -d \
    --volume psql_data:/var/lib/postgresql/data \
    --volume psql_logs:/var/log/postgresql/ \
    --label 'collectord.io/volume.1-logs-name=psql_logs' \
    postgres:10.4 \
    docker-entrypoint.sh postgres -c logging_collector=on -c log_min_duration_statement=0 -c log_directory=/var/log/postgresql -c log_min_messages=INFO -c log_rotation_age=1d -c log_rotation_size=10MB

Each event’s source includes the volume name and file — for example, psql_logs:postgresql-2018-08-31_232946.log:

text
2018-08-31 23:31:02.034 UTC [133] LOG:  duration: 0.908 ms  statement: SELECT n.nspname as "Schema",
	  c.relname as "Name",
	  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'table' END as "Type",
	  pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
	FROM pg_catalog.pg_class c
	     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
	WHERE c.relkind IN ('r','p','')
	      AND n.nspname <> 'pg_catalog'
	      AND n.nspname <> 'information_schema'
	      AND n.nspname !~ '^pg_toast'
	  AND pg_catalog.pg_table_is_visible(c.oid)
	ORDER BY 1,2;
2018-08-31 23:30:53.490 UTC [124] FATAL:  role "postgresql" does not exist

Example 2. Forwarding application logs with fields extraction and time parsing

Every label that works for container logs has a volume.{N}- equivalent for application logs — field extraction, replace patterns, index/source/host overrides, sampling, throttling. Below we extract the postgres timestamp and remove it from _raw:

bash
docker run -d \
    --volume psql_data:/var/lib/postgresql/data \
    --volume psql_logs:/var/log/postgresql/ \
    --label 'collectord.io/volume.1-logs-name=psql_logs' \
    --label 'collectord.io/volume.1-logs-extraction=^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3} [^\s]+) (.+)$' \
    --label 'collectord.io/volume.1-logs-timestampfield=timestamp' \
    --label 'collectord.io/volume.1-logs-timestampformat=2006-01-02 15:04:05.000 MST' \
    postgres:10.4 \
    docker-entrypoint.sh postgres -c logging_collector=on -c log_min_duration_statement=0 -c log_directory=/var/log/postgresql -c log_min_messages=INFO -c log_rotation_age=1d -c log_rotation_size=10MB

The timestamp moves to _time, and _raw no longer carries the redundant prefix:

text
_time               | _raw
2018-08-31 23:31:02 | [133] LOG:  duration: 0.908 ms  statement: SELECT n.nspname as "Schema",
                    | 	  c.relname as "Name",
                    | 	  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' WHEN 'p' THEN 'table' END as "Type",
                    | 	  pg_catalog.pg_get_userbyid(c.relowner) as "Owner"
                    | 	FROM pg_catalog.pg_class c
                    | 	     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
                    | 	WHERE c.relkind IN ('r','p','')
                    | 	      AND n.nspname <> 'pg_catalog'
                    | 	      AND n.nspname <> 'information_schema'
                    | 	      AND n.nspname !~ '^pg_toast'
                    | 	  AND pg_catalog.pg_table_is_visible(c.oid)
                    | 	ORDER BY 1,2;
2018-08-31 23:30:53 | [124] FATAL:  role "postgresql" does not exist

Volume types

Collectord auto-discovers application logs across two volume types: local and host mount. The Collectord configuration has [general.docker]/dockerRootFolder for finding local volumes, and [input.app_logs]/root for host mounts that may be exposed under a different path inside the Collectord container.
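
A minimal sketch of those two settings (the section and key names come from the paragraph above; the paths assume the host filesystem is mounted read-only into the Collectord container under /rootfs, which is an assumption about your setup):

ini
[general.docker]
# Docker root on the host, as seen from inside the Collectord container
dockerRootFolder = /rootfs/var/lib/docker/

[input.app_logs]
# prefix for host mounts exposed under a different path inside the container
root = /rootfs/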

Forwarding Prometheus metrics

Available since Collectord version 26.04

Use this when you have applications that already expose Prometheus metrics — nginx, JVM exporters, custom apps with the prometheus client library — and you’d rather Collectord scrape them per-container than maintain a static scrape list. Collectord watches for new containers on the node and starts scraping any container that declares an endpoint via labels.

The minimum is a port. Below, we run sophos/nginx-prometheus-metrics — an nginx image that exposes its own metrics — and tell Collectord which port and path to scrape:

bash
docker run -d \
    --label 'collectord.io/prometheus.1-port=9527' \
    --label 'collectord.io/prometheus.1-path=/metrics' \
    sophos/nginx-prometheus-metrics

For details on how Collectord serializes Prometheus metrics, see Prometheus metrics.

The 1- prefix lets a single container expose multiple endpoints — use prometheus.1-*, prometheus.2-*, etc. The full set of options is in the Container labels reference: scrape interval, scheme (http/https) with TLS settings, basic auth (username/password), whitelist/blacklist regex filters, and output for routing to a specific HEC.
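
For example, scraping a second, hypothetical endpoint alongside the first (only the port and path labels are shown here; the image does not actually expose port 9528, it is illustrative):

bash
docker run -d \
    --label 'collectord.io/prometheus.1-port=9527' \
    --label 'collectord.io/prometheus.1-path=/metrics' \
    --label 'collectord.io/prometheus.2-port=9528' \
    --label 'collectord.io/prometheus.2-path=/admin/metrics' \
    sophos/nginx-prometheus-metrics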

Forwarding Prometheus metrics to a Splunk Metrics Index

Splunk’s metrics index is a much more efficient store for high-cardinality time series than the default events index. To send Prometheus metrics there, set indexType: metrics, the target index name, and route to an output configured for a HEC token that allows that index. Make sure the index exists and is of type metrics before you start forwarding.

bash
docker run -d \
    --label 'collectord.io/prometheus.1-port=9527' \
    --label 'collectord.io/prometheus.1-path=/metrics' \
    --label 'collectord.io/prometheus.1-index=os_metrics' \
    --label 'collectord.io/prometheus.1-output=splunk::metrics' \
    --label 'collectord.io/prometheus.1-indexType=metrics' \
    sophos/nginx-prometheus-metrics

When you target the metrics index, define a dedicated Splunk output whose HEC token has a metrics index as its default — the standard event-index token will reject metrics writes.

Change output destination

By default Collectord forwards everything to Splunk. Use collectord.io/output=devnull to drop a container’s data entirely — the data is still collected, it just isn’t sent anywhere. That covers spammy debug containers and short-lived utility containers you don’t care about. To drop only logs (keep metrics), use collectord.io/logs-output=devnull.
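
For example, to silence an nginx container entirely:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/output=devnull' \
       nginx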

You can also flip the default: start Collectord with --env "COLLECTOR__LOGS_OUTPUT=input.files__output=devnull" so logs are dropped by default, then opt in per-container with collectord.io/logs-output=splunk. This is the cleanest pattern for hosts running many noisy containers where only a few should reach Splunk.

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/logs-output=splunk' \
       nginx

When you have multiple Splunk outputs configured (see Support for multiple Splunk clusters) — for example, a prod cluster and a dev cluster — pick one with the output suffix:

bash
docker run --rm \
       --publish 80 \
       --label 'collectord.io/output=splunk::prod1' \
       nginx

Logs sampling

Available since Collectord version 5.6

Example 1. Random based sampling

When a container produces tens of thousands of lines per second and you only need to spot trends — error rates, latency distributions — full-volume forwarding is wasteful. collectord.io/logs-sampling-percent keeps a random percentage and drops the rest.

In the example below the application produces 300,000 lines; about 60,000 reach Splunk:

bash
docker run -d --rm \
    --label 'collectord.io/logs-sampling-percent=20' \
    docker.io/mffiedler/ocp-logtest:latest \
    python ocp_logtest.py --line-length=1024 --num-lines=300000 --rate 60000 --fixed-line

Example 2. Hash-based sampling

Random sampling breaks per-user investigation — you might keep half of a user’s events and lose the other half, making correlation impossible. Hash-based sampling fixes this: define a key (a named regex group, like a user ID or IP), and Collectord either keeps every event with that key or drops them all.

Below, we sample by client IP:

bash
docker run -d --rm \
    --label 'collectord.io/logs-sampling-percent=20' \
    --label 'collectord.io/logs-sampling-key=^(?P<key>(\d+\.){3}\d+)' \
    nginx

Thruput

Available since Collectord version 5.10.252

When one chatty container would otherwise overwhelm the HEC pipeline and starve every other container on the host, throttle it. collectord.io/logs-ThruputPerSecond caps log forwarding for that container — anything over the limit is dropped (not buffered).

bash
docker run -d --rm \
    --label 'collectord.io/logs-ThruputPerSecond=128Kb' \
    nginx

Time correction

Available since Collectord version 5.10.252

When you start Collectord on a host that already has a long history of logs on disk, you usually don’t want last week’s logs in Splunk — or you want to skip the future-dated noise from a misconfigured container. collectord.io/logs-TooOldEvents and collectord.io/logs-TooNewEvents define windows around “now” outside which events are ignored.

bash
docker run -d --rm \
    --label 'collectord.io/logs-TooOldEvents=168h' \
    --label 'collectord.io/logs-TooNewEvents=1h' \
    nginx

Troubleshooting

When a label isn’t doing what you expect, check the Collectord logs for parser warnings — typos in label names show up as:

text
WARN 2018/08/31 21:05:33.122978 core/input/annotations.go:76: invalid annotation ...

Pipes that operate on event data (field extraction, time parsing) report per-event errors in the collectord_errors field — search for collectord_errors=* in the affected index to find events that failed processing.

Reference

For the full list of every label grouped by datatype, see Container labels reference.