Outcold Solutions - Monitoring Kubernetes, OpenShift and Docker in Splunk

Monitoring Kubernetes

Upgrade instructions

The Splunk application is backward compatible with previous versions of Collectord. We always recommend upgrading the Splunk application first, and Collectord instances after.

Any minor upgrade from 5.x to 5.y only requires switching to the new version of the Collectord image. To use new features, however, you might also need to update the configuration. You can find what’s new on our blog, and the changes to our configuration files in the GitHub repository outcoldsolutions/collectord-configurations (use tags to compare versions).

Upgrade from version 5.22 to 5.23

Upgrade the application in Splunk Enterprise or Splunk Cloud, then upgrade the collectorforkubernetes image to version 5.23. In the ConfigMap, add the section [input.kubernetes_watch::nodes] so that Collectord monitors Node objects; the updated application uses this data to fill some of the Node Conditions tables. If you are running clusters with thousands of pods, add watchImplementation = 2 under [general.kubernetes] to improve performance.
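
As a minimal sketch, the performance setting described above would look like this in the ConfigMap. Only the key and value named in this section are shown. Other keys already present under [general.kubernetes] are assumed to stay as they are, and the contents of [input.kubernetes_watch::nodes] should be copied from the latest reference configuration rather than from this fragment.

```ini
[general.kubernetes]
# Improves performance on clusters with thousands of pods
watchImplementation = 2
```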

Upgrade from version 5.21 to 5.22

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforkubernetes image to version 5.22.

Upgrade from version 5.20 to 5.21

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforkubernetes image to version 5.21.

Upgrade from version 5.19 to 5.20

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforkubernetes image to version 5.20.

If you are running some containers scheduled with Docker and not within Kubernetes, you might want to keep the section [general.docker] and enable the flag enableOwnWatcher = true to make sure that Collectord is watching for Docker containers. If you are not using Docker, you can remove the section [general.docker] from the YAML configuration.
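
For clusters that also run containers scheduled directly with Docker, the setting above can be sketched as follows. Only the key named in this section is shown; any other keys you already have under [general.docker] are assumed to stay unchanged.

```ini
[general.docker]
# Keep Collectord watching for Docker containers scheduled outside of Kubernetes
enableOwnWatcher = true
```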

Upgrade from version 5.18 to 5.19

Upgrade the application. Please review the latest configuration.

Enable API Gate for Collectord

When Collectord traverses the ownership chain of Pods to collect metadata, it can encounter objects that the ClusterRole does not allow it to access. Version 5.19 adds a way for Collectord to skip objects it has no access to. In the ClusterRole collectorforkubernetes, add the resource clusterroles, and in the ConfigMap under [general.kubernetes] add clusterRole = collectorforkubernetes to tell Collectord which ClusterRole it uses.
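
The ConfigMap half of this change can be sketched as below. The fragment covers only the key named in this section; the matching change to the ClusterRole manifest (adding the clusterroles resource) is not shown here and should follow the latest reference configuration.

```ini
[general.kubernetes]
# Tells Collectord which ClusterRole it uses
clusterRole = collectorforkubernetes
```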

Enable monitoring for Node Reboot Required

A new diagnostics section, [diagnostics::node-reboot-required], has been added. Please copy it from the configuration and review how rootfs is now mounted: / is mounted to /rootfs/ as a whole, instead of the multiple subdirectories mounted in previous versions. If you aren’t planning to enable this diagnostic, you can keep the mounts from the previous versions.

Upgrade openAPIV3Schema for CustomResourceDefinition

If you are planning to use force on Cluster Level Configurations, make sure to update openAPIV3Schema on the CustomResourceDefinition configurations.collectord.io.

Upgrade from version 5.17 to 5.18

Upgrade the application in Splunk and the collectorforkubernetes image.

Upgrade from version 5.16 to 5.17

Upgrade the application in Splunk and the collectorforkubernetes image. To leverage the new Resource Quotas dashboard, add the forwarding of Resource Quota objects (section [input.kubernetes_watch::resourcequota]) to the ConfigMap by copying from the latest YAML configuration available on our website.

Upgrade from version 5.15 to 5.16

Upgrade the application in Splunk and the collectorforkubernetes image. To leverage the new Collectord metrics dashboard, enable the input.collectord_metrics input by copying from the latest YAML configuration available on our website.

Upgrade from version 5.14 to 5.15

Upgrade the application in Splunk and the collectorforkubernetes image. Update the YAML configuration: under the input.prometheus:: stanzas, add the whitelists suggested in the configurations on our website to reduce the number of metrics forwarded to Splunk.

Upgrade from version 5.12 to 5.14

Upgrade the application in Splunk and the collectorforkubernetes image. If planning to use Containerd as a runtime engine, update the YAML file to include the mounts for the Containerd Unix socket.

Upgrade from version 5.11 to 5.12

Upgrade the application in Splunk and the collectorforkubernetes image. Monitoring Kubernetes application version 5.12 is backward compatible with the previous version of collectorforkubernetes.

The input.system_stats input now has dedicated values for disabled, type, and output. For backward compatibility, Collectord still accepts the unified values from previous configurations. The application has two new macros, macro_kubernetes_stats_host and macro_kubernetes_stats_cgroup; for backward compatibility, they depend on the macro_kubernetes_stats macro. Several inputs have new types, including input.system_stats, input.proc_stats, and input.net_stats.

In the collectorforkubernetes.yaml, we have added a definition for the CustomResourceDefinition of configurations.collectord.io.

Collectord can now automatically watch for changes in namespaces and workloads. To enable this, update the configuration stanza:

[general.kubernetes]
watch.namespaces = v1/namespace
watch.deployments = apps/v1/deployment
watch.configurations = collectord.io/v1/configuration

Upgrade from version 5.10 to 5.11

Upgrade the application in Splunk and the collectorforkubernetes image. In the YAML configuration, we have added a request for persistentvolumeclaims in the ClusterRole. For several volumeMounts, we added a configuration mountPropagation: HostToContainer. Update your YAML configuration to be able to use PVC for application logs.

Upgrade from version 5.9 to 5.10

Upgrade the application in Splunk and the collectorforkubernetes image. The new Security/Objects(Pods) dashboard depends on streaming Pod objects from the API server. See the default configuration (section 004-addon.conf).

Upgrade from version 5.8 to 5.9

Upgrade the application in Splunk and the collectorforkubernetes image. See the release notes for new features (including capabilities to stream API Objects and support for multiple Splunk Clusters).

Upgrade from version 5.7 to 5.8

Upgrade the application in Splunk and the collectorforkubernetes image. The YAML configuration file now includes critical pod annotation for Kubernetes versions below 1.14, and PriorityClass for Kubernetes versions 1.14 and above. See configuration.

Upgrade from version 5.6 to 5.7

Upgrade the application in Splunk and the collectorforkubernetes image. A new input, input.journald, is implemented (see configuration). If you have journald enabled and also forward messages to the /var/log/messages or /var/log/syslog files, the same host logs can be forwarded twice. To avoid that, you can disable rsyslog (or any alternative) on the system and specify the timestamp from which Collectord should pick up journald logs:

[input.journald]
startFromRel=-1h

To disable the journald input:

[input.journald]
disabled=true

To disable forwarding from the /var/log/messages or /var/log/syslog files, use:

[input.files::syslog]
disabled = true

Upgrade from version 5.5 to 5.6

Upgrade the application in Splunk and the collectorforkubernetes image. There are a few new parts in the ConfigMap. You only need to add them to your configuration if you intend to use them.

  1. Under [general.kubernetes], the key includeAnnotations allows you to attach annotations, similar to labels, to the forwarded data. Unset by default.

  2. Under [input.files:*], two new keys, samplingPercent and samplingKey, enable sampling.

  3. The output [output.splunk] can now limit the number of events in a payload with the events key.
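
A rough sketch of how two of these optional keys might be set is shown below. The values 20 and 50 are illustrative assumptions, not recommended or default values, and the [input.files::syslog] stanza is used only as one example of a stanza matching [input.files:*]. The includeAnnotations and samplingKey keys are left out because this section does not describe the format of their values; see the reference configuration for those.

```ini
[input.files::syslog]
# Assumed example value: forward only a sample of the logs, expressed as a percentage
samplingPercent = 20

[output.splunk]
# Assumed example value: limit on the number of events in one payload
events = 50
```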

Upgrade from version 5.4 to 5.5

Upgrade the application in Splunk and the collectorforkubernetes image. No additional configurations have been added.

Upgrade from version 5.3 to 5.4

Upgrade the application in Splunk and the collectorforkubernetes image. No additional configurations have been added.

Upgrade from version 5.2 to 5.3

Version 5.3 is a minor upgrade. Simply upgrade the Splunk application and the image. In the configuration file, you can find one new key, group, under [input.net_socket_table] that can significantly reduce licensing costs for the network socket table data.

Upgrade from version 5.1 to 5.2

Version 5.2 is a minor upgrade that includes performance improvements, usability improvements, and the capability of forwarding Docker and Kubelet runtime storage metrics (one additional event per host every 30 seconds). For more details, please read Release History.

Mount metrics are defined under input.mount_stats. If you override indexes for various types of data, make sure to update these metrics as well.

Additionally, we introduced devnull output, which allows you to disable collection of logs or metrics for specific containers.

We moved prometheus_auto from the addon to the general configuration. This allows Collectord pods on the host network to collect metrics from pods running on the host network, and the addon to collect metrics from pods on the pod network.

With version 5.2, we predefined several alerts that can help you monitor the health of your clusters and the performance of your applications.

Upgrade from version 5.0 to 5.1

Version 5.1 is a minor upgrade that includes performance improvements, usability improvements, the capability of forwarding Network Metrics, and autodiscovery of Prometheus metrics from Pods. For more details, please read Release History.

We include two new types of metrics, defined under stanzas input.net_stats (network metrics) and input.net_socket_table (table of network connections). The addon includes an input.prometheus_auto stanza that defines auto-discovery for Prometheus metrics.

Upgrade from version 4 to 5

Upgrade Splunk application

Download version 5.0 from SplunkBase.

Upgrade collector

  1. We mount /var/lib/docker instead of /var/lib/docker/containers to be able to search for application logs.
  2. We added a new mount /var/lib/kubelet/ that allows autodiscovery of application logs in volumes created with emptyDir.
  3. We added imagePullPolicy: Always and changed the versioning scheme to {major}.{minor}, where {major} can have breaking changes, and {minor} can be used with small updates. Patches for the base images will be delivered with the same version.

Download the latest Configuration Reference and update your configuration with the changes from our configuration.

Update the deployed configuration:

kubectl apply -f collectorforkubernetes.yaml
