Outcold Solutions - Monitoring Kubernetes, OpenShift and Docker in Splunk

Monitoring Kubernetes

Splunk HTTP Event Collector

Configure HTTP Event Collector secure connection

Splunk uses self-signed certificates by default. The Collectord provides several configuration options that control how it connects to the HTTP Event Collector.

Configure trusted SSL connection to the self-signed certificate

If you are using a Splunk self-signed certificate, you can copy the server CA certificate from $SPLUNK_HOME/etc/auth/cacert.pem and create a secret from it.
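
For example, assuming you have SSH access to the Splunk server and that $SPLUNK_HOME is /opt/splunk (the host name and path below are placeholders, adjust them for your environment), you can copy the certificate with scp.

# copy the Splunk CA certificate to the current directory (example host and path)
scp splunk.example.com:/opt/splunk/etc/auth/cacert.pem ./cacert.pem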

kubectl --namespace collectorforkubernetes create secret generic splunk-cacert --from-file=./cacert.pem

For every collectorforkubernetes workload (2 DaemonSets and 1 Deployment), you need to attach this secret as a volume.

...
        volumeMounts:
        - name: splunk-cacert
          mountPath: "/splunk-cacert/"
          readOnly: true
        ...
      volumes:
      - name: splunk-cacert
        secret:
          secretName: splunk-cacert
      ...

And update the ConfigMap under the [output.splunk] section.

[output.splunk]

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath = /splunk-cacert/cacert.pem

# CA Name to verify
caName = SplunkServerDefaultCert

In this configuration, we define the path to the CA certificate that the Collectord should trust and the server name specified in the certificate, which is SplunkServerDefaultCert for the default self-signed certificate.

After applying this update, the Collectord establishes a trusted SSL connection to the HTTP Event Collector.
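
If you are not sure which name to use for caName, you can check the certificate presented by the HTTP Event Collector endpoint with openssl (hec.example.com:8088 is a placeholder for your endpoint).

# print the subject of the certificate presented by the HEC endpoint
openssl s_client -connect hec.example.com:8088 </dev/null 2>/dev/null | openssl x509 -noout -subject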

HTTP Event Collector incorrect index behavior

HTTP Event Collector rejects payloads with indexes that the specified token is not allowed to write to. When you override indexes with annotations, it is very common to make a typo in the index name or to forget to allow the token to write to that index in Splunk.

The Collectord lets you control how these errors are handled with the incorrectIndexBehavior setting.

  • RedirectToDefault - this is the default behavior, which forwards events with an incorrect index to the default index of the HTTP Event Collector.
  • Drop - this configuration drops events with an incorrect index.
  • Retry - this configuration keeps retrying. Some pipelines, like process stats, can be blocked for the entire host with this configuration.

You can specify the behavior in the configuration.

[output.splunk]
incorrectIndexBehavior = Drop
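
For reference, the index override mentioned above is done with annotations on Pods or Namespaces. A minimal sketch (the my-pod, my-namespace and my_index names are placeholders, and the exact annotation name may differ between Collectord versions):

# override the index for a Pod; the index must exist and the token must be allowed to write to it
kubectl annotate pod my-pod --namespace my-namespace collectord.io/index=my_index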

Using a proxy for HTTP Event Collector

If you need to use a proxy for the HTTP Event Collector, you can define it in the configuration. If you are using an SSL connection, you also need to include the CA certificate used by the proxy (similar to how we attach the certificate for Splunk).

[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
proxyUrl = http://proxy.example:4321
caPath = /proxy-cert/proxy-ca.pem
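
The proxy CA certificate can be attached the same way as the Splunk CA certificate above, for example (the proxy-cert secret name and proxy-ca.pem file name are placeholders; the secret has to be mounted under /proxy-cert/ in every workload):

# create a secret with the proxy CA certificate, then mount it under /proxy-cert/
kubectl --namespace collectorforkubernetes create secret generic proxy-cert --from-file=./proxy-ca.pem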

Using multiple HTTP Event Collector endpoints for Load Balancing and Fail-over

The Collectord can accept multiple HTTP Event Collector URLs for load balancing (in case you are using multiple hosts with the same configuration) and for fail-over.

The Collectord provides three different algorithms for URL selection:

  • random - choose a random URL on first selection and after each failure (connection or HTTP status code >= 500)
  • round-robin - choose URLs starting from the first one and advance on each failure (connection or HTTP status code >= 500)
  • random-with-round-robin - choose a random URL on first selection and after that use round-robin on each failure (connection or HTTP status code >= 500)

The default value is random-with-round-robin.

[output.splunk]
urls.0 = https://hec1.example.com:8088/services/collector/event/1.0
urls.1 = https://hec2.example.com:8088/services/collector/event/1.0
urls.2 = https://hec3.example.com:8088/services/collector/event/1.0

urlSelection = random-with-round-robin

token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0

Enable indexer acknowledgement

HTTP Event Collector provides indexer acknowledgement, which lets you know when a payload has not only been accepted by the HTTP Event Collector but also written to the indexer. Enabling this feature can significantly reduce the performance of clients, including the Collectord. But if you need guarantees for data delivery, you can enable it for the HTTP Event Collector token and in the Collectord configuration.

[general]
acceptLicense = true

[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
ackUrl = https://hec.example.com:8088/services/collector/ack
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
ackEnabled = true
ackTimeout = 3m
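
On the Splunk side, indexer acknowledgement is enabled per HTTP Event Collector token. A minimal sketch, assuming the token is managed through inputs.conf (the hec-collectord stanza name and file location are assumptions; the same option is available in Splunk Web when editing the token):

# $SPLUNK_HOME/etc/apps/splunk_httpinput/local/inputs.conf
[http://hec-collectord]
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
useACK = 1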

Client certificates for collector

If you secure your HTTP Event Collector endpoint by requiring client certificates, you can embed them in the image and provide the configuration to use them.

[output.splunk]
url = https://hec.example.com:8088/services/collector/event/1.0
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0
clientCertPath = /client-cert/client-cert.pem
clientKeyPath = /client-cert/client-cert.key
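
Instead of baking the certificates into the image, you can also mount them from a secret, similar to the CA certificate above (the client-cert secret name and file names are placeholders; the secret has to be mounted under /client-cert/ in every workload):

# create a secret with the client certificate and key, then mount it under /client-cert/
kubectl --namespace collectorforkubernetes create secret generic client-cert --from-file=./client-cert.pem --from-file=./client-cert.key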

Support for multiple Splunk clusters

If you need to forward logs from the same Kubernetes cluster to multiple Splunk clusters, you can configure additional Splunk outputs in the configuration.

[output.splunk::prod1]
url = https://prod1.hec.example.com:8088/services/collector/event/1.0
token = AF420832-F61B-480F-86B3-CCB5D37F7D0D

All other settings are taken from the default output [output.splunk].

You can then override the output for Pods or Namespaces with the annotation collectord.io/output=splunk::prod1.
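
For example, to send data from a whole namespace to the prod1 output (the team1 namespace is a placeholder):

# route all data from the team1 namespace to the splunk::prod1 output
kubectl annotate namespace team1 collectord.io/output=splunk::prod1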


About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use and easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution to help you keep all your metrics and logs in one place, allowing you to quickly address complex questions on container performance.