Monitoring Kubernetes

Installation

This guide walks you through installing Monitoring Kubernetes end-to-end: configuring the Splunk app and HTTP Event Collector, preparing the cluster, and deploying Collectord to forward metadata-enriched container logs, host logs, and metrics. A typical install takes under 10 minutes. If you don’t have a license yet, you can request a 30-day evaluation.

Splunk configuration

Install the Monitoring Kubernetes application

Install the latest version of Monitoring Kubernetes from Splunkbase on your Search Heads only.

If you’re using a dedicated index that isn’t searchable by default, update the macro_kubernetes_base macro to include it:

text
macro_kubernetes_base = (index=kubernetes)

Enable HTTP Event Collector in Splunk

Collectord forwards data to Splunk over the HTTP Event Collector (HEC). If HEC isn’t enabled yet, follow Splunk’s guide to Configure the Splunk HTTP Event Collector for use with additional technologies.

Once HEC is enabled, you need two pieces of information for the rest of this guide: the HEC endpoint URL and an HEC token. You can verify both with curl:

bash
$ curl -k https://hec.example.com:8088/services/collector/event/1.0 \
       -H "Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0" \
       -d '{"event": "hello world"}'
{"text": "Success", "code": 0}

Don’t set a source type on the token — Collectord assigns its own source types per datatype, and the Splunk app expects them.

-k skips certificate validation; use it only for self-signed certificates.
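If you want to script this check, for example in a deployment pipeline, one option is to test the response body for "code": 0, as in the successful response shown above. A minimal sketch, assuming a hypothetical helper named hec_response_ok; the matching is a plain grep rather than a JSON parser, so treat it as a quick smoke test:

```shell
# Hypothetical helper: succeeds only if the HEC response body reports "code": 0.
hec_response_ok() {
  printf '%s' "$1" | grep -Eq '"code"[[:space:]]*:[[:space:]]*0[[:space:]]*[,}]'
}

# Example with the response from the curl command above.
if hec_response_ok '{"text": "Success", "code": 0}'; then
  echo "HEC accepted the test event"
fi
```

In practice, capture the output of the curl command in a variable and pass it to the function; any response without "code": 0 makes the function return a non-zero status.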

Splunk Cloud uses a different HEC URL than Splunk Web — see Send data to HTTP Event Collector on Splunk Cloud instances.

If your indexes aren’t searchable by default, see Splunk Indexes for how to configure them on both the Splunk side and inside Collectord.

Install Collectord for Kubernetes

For Docker UCP installation, see the blog post Monitoring Docker Universal Control Plane (UCP) with Splunk Enterprise and Splunk Cloud.

Prerequisites

Collectord works out of the box with CRI-O, Containerd, and Docker as runtime engines.

The most important thing to get right before you install is container log rotation. Some Kubernetes providers ship aggressive defaults — AKS, for example, rotates across just 5 files of 10 MiB. Those files are your safety buffer for any disruption between Collectord and Splunk HEC — a connectivity issue, an HEC outage, a misconfiguration. The math is straightforward: a container writing 10 MiB per hour gives you a 5-hour buffer before the oldest log is overwritten; at 10 MiB per minute, that buffer shrinks to 5 minutes.
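The arithmetic above is easy to repeat with your own numbers. A minimal sketch, assuming a hypothetical helper named log_buffer_minutes that takes the number of rotated files, the size of each file in MiB, and the container’s log rate in MiB per hour:

```shell
# Hypothetical helper (not part of Collectord or kubectl): estimates how many
# minutes of logs the rotated files hold before the oldest one is overwritten.
# Usage: log_buffer_minutes FILES SIZE_MIB RATE_MIB_PER_HOUR
log_buffer_minutes() {
  # Total capacity in MiB divided by the rate in MiB per minute.
  # Integer arithmetic, so the result is rounded down.
  echo $(( $1 * $2 * 60 / $3 ))
}

log_buffer_minutes 5 10 10     # 5 files of 10 MiB at 10 MiB per hour
log_buffer_minutes 5 10 600    # the same files at 10 MiB per minute
log_buffer_minutes 5 128 600   # 5 files of 128 MiB at 10 MiB per minute
```

The three calls print 300, 5, and 64 minutes, which matches the 5-hour and 5-minute figures above and shows how much headroom larger files buy at the higher rate.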

Check your provider’s defaults and raise them if they’re too tight. Rotation is normally controlled through the KubeletConfiguration (or kubelet command-line flags). At minimum, aim for 5 files of 128 MiB — tune from there based on how much log volume your pods produce:

yaml
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
...
containerLogMaxSize: 128Mi
containerLogMaxFiles: 5
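To see what a node is actually running with, one option is to read the kubelet’s live configuration through the API server’s node proxy. A sketch, assuming your account is allowed to access nodes/proxy, that your provider exposes the kubelet configz endpoint, and a hypothetical helper name kubelet_log_rotation; the grep is a rough filter, not a JSON parser:

```shell
# Hypothetical helper: prints the log rotation settings a node's kubelet reports.
# Usage: kubelet_log_rotation NODE_NAME
kubelet_log_rotation() {
  kubectl get --raw "/api/v1/nodes/$1/proxy/configz" |
    grep -oE '"containerLogMax(Size|Files)"[[:space:]]*:[[:space:]]*[^,}]*'
}
```

Call it with a node name taken from kubectl get nodes. If the call is rejected or returns nothing, fall back to your provider’s documentation for the defaults.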

Installation

If you prefer Helm, see the collectord-splunk-kubernetes Helm chart.

Download the latest collectorforkubernetes.yaml. The manifest creates the collectorforkubernetes namespace and deploys every workload it needs.

Open the file and edit it to:

  • Set the Splunk HEC URL, token, and (if needed) certificate options.
  • Review and accept the license agreement and paste in your license key.
  • Optionally, name the cluster. This is useful when you’re monitoring more than one and want to filter by cluster in the app.

001-general.conf ini
[general]

acceptLicense = false

license =

fields.kubernetes_cluster = -

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url =

# Splunk HTTP Event Collector Token
token =

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath =

# CA Name to verify
caName =

A filled-in example:

001-general.conf ini
[general]

acceptLicense = true

license = ...

fields.kubernetes_cluster = development

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url = https://hec.example.com:8088/services/collector/event/1.0

# Splunk HTTP Event Collector Token
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0

# Allow invalid SSL server certificate
insecure = true
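Before applying the manifest, a quick check can catch the most common omissions from the list above. A rough sketch, assuming a hypothetical helper named check_collectord_manifest and that the settings appear in the manifest as plain key = value lines like the ones shown; it only looks for at least one non-empty value per key, so it is not a full validation:

```shell
# Hypothetical pre-flight check (not shipped with Collectord).
# Usage: check_collectord_manifest ./collectorforkubernetes.yaml
check_collectord_manifest() {
  manifest=$1
  status=0
  # The license agreement must be explicitly accepted.
  if ! grep -Eq '^[[:space:]]*acceptLicense[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$manifest"; then
    echo "acceptLicense is not set to true"
    status=1
  fi
  # Each of these keys needs at least one non-empty value.
  for key in license url token; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$manifest"; then
      echo "${key} is empty or missing"
      status=1
    fi
  done
  return "$status"
}
```

The function prints one line per problem it finds and returns a non-zero status if there is any, so it can gate the apply step in a script.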

If you’re deploying onto a cluster that’s been running for a while and has a lot of historical logs on disk, Collectord will start by forwarding all of them — which can spike both your network and your Splunk indexing. Use the [general] settings thruputPerSecond to cap throughput and tooOldEvents to skip events older than a given age.

Apply the manifest:

bash
$ kubectl apply -f ./collectorforkubernetes.yaml

Check that the workloads came up:

bash
$ kubectl get all --namespace collectorforkubernetes
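If you would rather block until the pods are ready, for example in a script, kubectl wait can do that. A sketch, assuming a hypothetical wrapper name wait_for_collectord and a default timeout of 120 seconds chosen only for illustration:

```shell
# Hypothetical wrapper: waits until every pod in the namespace reports Ready.
# Usage: wait_for_collectord [TIMEOUT]   (TIMEOUT defaults to 120s)
wait_for_collectord() {
  kubectl wait --for=condition=Ready pods --all \
    --namespace collectorforkubernetes \
    --timeout="${1:-120s}"
}
```

A non-zero exit status means at least one pod did not become Ready within the timeout.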

If the pods aren’t running, see Troubleshooting.

Once the images are pulled and the pods are Running, open the Monitoring Kubernetes app in Splunk — dashboards should start populating within a minute or two.

By default, Collectord forwards container logs, host logs (including syslog), and metrics for hosts, pods, containers, and processes.

Next steps

  • Review the predefined alerts and enable the ones relevant to your environment.
  • If something looks off, work through the troubleshooting checks.
  • Enable Audit Logs. Kubernetes doesn’t write API server audit logs out of the box; you’ll need to enable them on the API server before Collectord can forward them.
  • Verify Prometheus Metrics. The default configuration covers most clusters, but if Control Plane metrics are missing in the dashboards, double-check that the Prometheus endpoints are reachable from Collectord.
  • For better indexing performance, split logs and metrics into separate indexes. See Splunk Indexes for the recommended layout and how to wire it up on both sides.
  • For search-time field extraction on container logs, see Splunk fields extraction for container logs.
  • For per-pod control over log forwarding — application logs, custom log file paths, multi-line patterns, field extraction, index/source/sourcetype overrides, dropping noisy lines, and hashing sensitive values — see annotations.