Monitoring Kubernetes

Splunk HTTP Event Collector

Configure HTTP Event Collector secure connection

Splunk ships with self-signed certificates by default, so most production deployments need to tell Collectord either how to trust those certs or how to use your own. The relevant knobs all live under [output.splunk] in the configuration.

Configure trusted SSL connection to the self-signed certificate

If you’re sticking with Splunk’s self-signed certificate, copy the server CA from $SPLUNK_HOME/etc/auth/cacert.pem and load it as a Kubernetes secret:

bash
kubectl --namespace collectorforkubernetes create secret generic splunk-cacert --from-file=./cacert.pem

Mount that secret into every collectorforkubernetes workload — both DaemonSets and the Deployment:

yaml
...
        volumeMounts:
        - name: splunk-cacert
          mountPath: "/splunk-cacert/"
          readOnly: true
        ...
      volumes:
      - name: splunk-cacert
        secret:
          secretName: splunk-cacert
      ...

Then point Collectord at the mounted certificate and tell it which name to verify on the server cert. For Splunk’s default self-signed cert, that name is SplunkServerDefaultCert:

ini
[output.splunk]

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath = /splunk-cacert/cacert.pem

# CA Name to verify
caName = SplunkServerDefaultCert

After this rolls out, Collectord talks to HEC over a fully verified TLS connection.

HTTP Event Collector incorrect index behavior

HEC rejects events whose target index isn’t on the token’s allow-list — and once you start overriding indexes via annotations, it’s easy to typo a name or forget to enable a new index on the token. Collectord lets you choose how to react with incorrectIndexBehavior:

  • RedirectToDefault — the default. Re-routes the rejected event to the token’s default index so nothing is lost.
  • Drop — drops the event outright. Use this when you’d rather see gaps in Splunk than have unrouted events polluting the default index.
  • Retry — keeps retrying. Useful only when you can fix the index on the Splunk side quickly — otherwise the affected pipeline (for example, process stats) will stall for the entire host.

Set it under [output.splunk]:

ini
[output.splunk]
incorrectIndexBehavior = Drop
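
The three behaviors can be pictured as a small decision function. The sketch below is illustrative only and is not Collectord's implementation: the function name `on_incorrect_index`, the action strings, and the event layout (a HEC-style dict with an optional `index` key) are all assumptions. Re-routing to the token's default index is modeled by removing the explicit `index` from the event.

```python
from typing import Optional, Tuple

# The values accepted by incorrectIndexBehavior, as listed above.
BEHAVIORS = ("RedirectToDefault", "Drop", "Retry")


def on_incorrect_index(
    event: dict, behavior: str = "RedirectToDefault"
) -> Tuple[str, Optional[dict]]:
    """Decide what to do with an event that HEC rejected for its index.

    Returns a hypothetical (action, event) pair; the action names are
    made up for this sketch.
    """
    if behavior not in BEHAVIORS:
        raise ValueError(f"unknown incorrectIndexBehavior: {behavior!r}")
    if behavior == "Drop":
        # The event is discarded, leaving a gap in Splunk.
        return ("drop", None)
    if behavior == "Retry":
        # The same event is kept and sent again unchanged.
        return ("retry", event)
    # RedirectToDefault: resend a copy without the explicit index so the
    # token's default index receives it. The input dict is not modified.
    redirected = {key: value for key, value in event.items() if key != "index"}
    return ("resend", redirected)
```

With `Retry`, the event comes back unchanged, which is why a pipeline can stall until the index is fixed on the Splunk side.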

Using proxy for HTTP Event Collector

If your network forces outbound traffic through a proxy, point Collectord at it with proxyUrl. When the proxy itself terminates TLS, include its CA the same way you would for Splunk:

ini
[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
proxyUrl = http://proxy.example:4321
caPath = /proxy-cert/proxy-ca.pem

Using multiple HTTP Event Collector endpoints for Load Balancing and Fail-over

When you have several HEC endpoints — typically a heavy forwarder pool or a dedicated indexer cluster — you can list them all and let Collectord spread the load and survive failures. A “failure” here means a connection error or any HTTP status >= 500.

You get three selection algorithms:

  • random: pick a random URL on first send and after every failure.
  • round-robin: start at the first URL and advance one position on every failure.
  • random-with-round-robin: pick a random URL on first send, then round-robin from there on every failure. This is the default.

ini
[output.splunk]
urls.0 = https://hec1.example.com:8088/services/collector/event/1.0
urls.1 = https://hec2.example.com:8088/services/collector/event/1.0
urls.2 = https://hec3.example.com:8088/services/collector/event/1.0

urlSelection = random-with-round-robin

token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
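
To make the difference between the three algorithms concrete, here is a minimal Python sketch of the selection rules as described above. It is not Collectord's code: the `UrlSelector` class, its `report` method, and the convention of passing `None` for a connection error are assumptions made for illustration.

```python
import random
from typing import List, Optional

MODES = ("random", "round-robin", "random-with-round-robin")


def is_failure(status: Optional[int]) -> bool:
    """A connection error (modeled here as None) or any HTTP status >= 500."""
    return status is None or status >= 500


class UrlSelector:
    """Tracks which endpoint to use and moves on only after a failure."""

    def __init__(
        self,
        urls: List[str],
        mode: str = "random-with-round-robin",
        rng: Optional[random.Random] = None,
    ) -> None:
        if not urls:
            raise ValueError("at least one URL is required")
        if mode not in MODES:
            raise ValueError(f"unknown urlSelection: {mode!r}")
        self.urls = list(urls)
        self.mode = mode
        self.rng = rng or random.Random()
        # round-robin starts at the first URL; the other two start at random.
        self.index = 0 if mode == "round-robin" else self.rng.randrange(len(self.urls))

    def current(self) -> str:
        return self.urls[self.index]

    def report(self, status: Optional[int]) -> str:
        """Record the outcome of a send and return the URL to use next."""
        if is_failure(status):
            if self.mode == "random":
                # A fresh random pick; it may land on the same URL again.
                self.index = self.rng.randrange(len(self.urls))
            else:
                # Both round-robin variants advance one position and wrap.
                self.index = (self.index + 1) % len(self.urls)
        return self.current()
```

In this model a 4xx response does not trigger a switch, because only connection errors and statuses of 500 or above count as failures. The random starting point is what spreads many Collectord instances across the pool instead of having them all begin at the first URL.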

Enable indexer acknowledgement

By default, HEC tells Collectord a payload was accepted as soon as it lands on the receiver, not when it is actually persisted to an indexer. If you need stronger delivery guarantees, turn on indexer acknowledgement on both the HEC token and Collectord. It does cost throughput, because every payload now waits for the indexer to confirm, so enable it only where the guarantee matters.

ini
[general]
acceptLicense = true

[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
ackUrl = https://hec.example.com:8088/services/collector/ack
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
ackEnabled = true
ackTimeout = 3m
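
For a sense of what the extra round trip involves, the sketch below models the polling side of HEC indexer acknowledgement in Python. With acknowledgement enabled, HEC returns an `ackId` for each accepted payload, and the client later asks the ack endpoint which IDs have been persisted. This is not Collectord's implementation: the function names are hypothetical, and the HTTP call (including the request channel that HEC requires) is left to a caller-supplied `poll` callable so the example stays self-contained. The 180-second default mirrors `ackTimeout = 3m`.

```python
import json
import time
from typing import Callable, Iterable, Set


def build_ack_query(ack_ids: Iterable[int]) -> bytes:
    """JSON body for a POST to the ack endpoint: {"acks": [id, ...]}."""
    return json.dumps({"acks": sorted(ack_ids)}).encode("utf-8")


def confirmed_ids(response_body: bytes) -> Set[int]:
    """IDs reported as persisted, from a body like {"acks": {"0": true}}."""
    acks = json.loads(response_body)["acks"]
    return {int(ack_id) for ack_id, done in acks.items() if done}


def wait_for_acks(
    poll: Callable[[bytes], bytes],
    ack_ids: Iterable[int],
    timeout_s: float = 180.0,
    interval_s: float = 1.0,
) -> Set[int]:
    """Poll until every ID is confirmed or the timeout passes.

    `poll` is a hypothetical callable that sends the query body to ackUrl
    and returns the raw response body. The return value is the set of IDs
    still unconfirmed; a sender would treat those payloads as undelivered.
    """
    pending = set(ack_ids)
    deadline = time.monotonic() + timeout_s
    while pending:
        pending -= confirmed_ids(poll(build_ack_query(pending)))
        if not pending or time.monotonic() >= deadline:
            break
        time.sleep(interval_s)
    return pending
```

The waiting loop is where the throughput cost comes from: a payload is not considered delivered until its ID comes back as confirmed or the timeout expires.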

Client certificates for collector

If your HEC endpoint requires mTLS, embed the client certificate and key in the image and point Collectord at them:

ini
[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
clientCertPath = /client-cert/client-cert.pem
clientKeyPath = /client-cert/client-cert.key

Support for multiple Splunk clusters

When the same Kubernetes cluster needs to forward to more than one Splunk cluster — say, a primary indexing tier and a separate tier for a security team — define a named output alongside the default:

ini
[output.splunk::prod1]
url = https://prod1.hec.example.com:8088/services/collector/event/1.0
token = AF420832-F61B-480F-86B3-CCB5D37F7D0D

Anything not specified on the named output falls back to the settings in [output.splunk].
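
The fallback rule can be illustrated with a few lines of Python that merge the two sections. This is a sketch of the semantics described above, not Collectord's own config loader: the helper name `effective_output` is hypothetical, and it assumes the configuration parses as plain INI.

```python
import configparser
from typing import Dict


def effective_output(config_text: str, name: str = "") -> Dict[str, str]:
    """Settings a named output would end up with after fallback.

    Starts from [output.splunk] and overlays [output.splunk::<name>].
    An empty name returns the default output unchanged.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, for example caPath
    parser.read_string(config_text)

    settings: Dict[str, str] = {}
    if parser.has_section("output.splunk"):
        settings.update(parser["output.splunk"])
    if name:
        # Keys set on the named output win over the defaults.
        settings.update(parser[f"output.splunk::{name}"])
    return settings
```

Under this model, a `prod1` output that sets only `url` and `token` would still use the `caPath`, `insecure`, and similar values defined under [output.splunk], so shared TLS settings need to be written once.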

You can then send specific pods or namespaces to the secondary cluster with an annotation like collectord.io/output=splunk::prod1.