Outcold Solutions - Monitoring Kubernetes, OpenShift and Docker in Splunk

Monitoring OpenShift

Upgrade instructions

The Splunk application is backward compatible with previous versions of Collectord. We always recommend upgrading the Splunk application first, and Collectord instances after.

All minor upgrades from 5.x to 5.y can be done just by using a new version of the Collectord image. To use new features, however, you might need to upgrade the configuration as well. You can find what’s new on our blog, and the changes to our configuration files in the GitHub repository outcoldsolutions/collectord-configurations (use tags to compare versions).

Upgrade from version 5.22 to 5.23

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforopenshift image to version 5.23. In the ConfigMap, add the [input.kubernetes_watch::nodes] section to monitor nodes and populate some of the Node Conditions tables in the updated application. If you are running clusters with several thousand pods, add watchImplementation = 2 under [general.kubernetes] to improve performance.
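
As a minimal sketch, the performance setting mentioned above would sit in the ConfigMap roughly as follows. Only the stanza name and the key come from these instructions; the comments are ours, and the keys of the [input.kubernetes_watch::nodes] section are not reproduced here because they should be copied from the latest configuration.

```ini
# Sketch only: merge into the existing [general.kubernetes] stanza of your ConfigMap.
[general.kubernetes]
# Suggested for clusters with several thousand pods (see the step above).
watchImplementation = 2

# The [input.kubernetes_watch::nodes] section is intentionally not shown;
# copy it from the latest configuration for version 5.23.
```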

Upgrade from version 5.21 to 5.22

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforopenshift image to version 5.22.

Upgrade from version 5.20 to 5.21

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforopenshift image to version 5.21.

Upgrade from version 5.19 to 5.20

Upgrade the application in Splunk Enterprise or Splunk Cloud. Upgrade the collectorforopenshift image to version 5.20.

Review the latest configuration. If you are planning to monitor ClusterResourceQuota, add the input [input.kubernetes_watch::clusterresourcequota] to the ConfigMap, and add clusterresourcequotas to the list of resources for the ClusterRole collectorforopenshift.

Upgrade from version 5.18 to 5.19

Upgrade the application. Please review the latest configuration.

Enable API Gate for Collectord

When Collectord traverses the ownership of pods to collect metadata, it can reach objects that the ClusterRole does not allow it to read. Version 5.19 adds a way for Collectord to avoid requesting objects it does not have access to. In the ClusterRole collectorforopenshift, add the resource clusterroles, and in the ConfigMap under [general.kubernetes] add clusterRole = collectorforopenshift to tell Collectord which ClusterRole it uses.
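
The ConfigMap side of this change can be sketched as below. The stanza, key, and value are taken from the paragraph above; the comments are ours. The matching ClusterRole change (adding the clusterroles resource) is made in the YAML manifest and is not shown here.

```ini
# Sketch only: merge into the existing [general.kubernetes] stanza of your ConfigMap.
[general.kubernetes]
# Tells Collectord which ClusterRole it runs under, so it can skip
# objects that this role does not grant access to.
clusterRole = collectorforopenshift
```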

Enable monitoring for Node Reboot Required

A new diagnostic, [diagnostics::node-reboot-required], has been added. Copy it from the latest configuration, and note that the host root filesystem / is now mounted at /rootfs/ as a single mount, instead of the multiple subdirectory mounts used in previous versions. If you aren’t planning to enable this diagnostic, you can keep the mounts from the previous versions.

Upgrade openAPIV3Schema for CustomResourceDefinition

If you are planning to use force on Cluster Level Configurations, make sure to update openAPIV3Schema on the CustomResourceDefinition configurations.collectord.io.

Upgrade from version 5.17 to 5.18

Upgrade the application in Splunk and the collectorforopenshift image.

Upgrade from version 5.16 to 5.17

Upgrade the application in Splunk and the collectorforopenshift image. To use the new Resource Quotas dashboard, add forwarding of Resource Quota objects (section [input.kubernetes_watch::resourcequota]) to the ConfigMap by copying it from the latest YAML configuration available on our website.

Upgrade from version 5.15 to 5.16

Upgrade the application in Splunk and the collectorforopenshift image. To use the new Collectord metrics dashboard, enable the input.collectord_metrics input by copying it from the latest YAML configuration available on our website.
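
A rough sketch of what enabling this input might look like is shown below. It assumes that input.collectord_metrics is toggled with the same disabled key that other inputs in this document use (for example input.journald); that assumption is not confirmed here, and the remaining keys of the stanza should be copied from the latest YAML configuration rather than from this fragment.

```ini
# Sketch only; assumes this input honors the same `disabled` key as other inputs.
[input.collectord_metrics]
disabled = false
```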

Upgrade from version 5.14 to 5.15

Upgrade the application in Splunk and the collectorforopenshift image. Update your YAML configuration: under the input.prometheus:: stanzas, add the whitelists suggested in the configuration from our website to reduce the number of metrics forwarded to Splunk.

Upgrade from version 5.12 to 5.14

Upgrade the application in Splunk and the collectorforopenshift image.

Upgrade from version 5.11 to 5.12

Upgrade the application in Splunk and the collectorforopenshift image. Version 5.12 of the Monitoring OpenShift application is backward compatible with the previous version of collectorforopenshift. The stats input input.system_stats now has dedicated values for disabled, type, and output. For backward compatibility, Collectord still accepts the unified values from previous configurations. The application has two new macros, macro_openshift_stats_host and macro_openshift_stats_cgroup; for backward compatibility, they depend on the macro_openshift_stats macro. Several inputs have new types, including input.system_stats, input.proc_stats, and input.net_stats.

In the collectorforopenshift.yaml, we have added a definition for the CustomResourceDefinition of configurations.collectord.io.

Collectord can automatically watch for changes in namespaces and workloads. To enable this, update the configuration stanza:

[general.kubernetes]
watch.namespaces = v1/namespace
watch.deploymentconfigs = apps.openshift.io/v1/deploymentconfig
watch.configurations = collectord.io/v1/configuration

Upgrade from version 5.10 to 5.11

Upgrade the application in Splunk and the collectorforopenshift image. In the YAML configuration, we have added a request for persistentvolumeclaims to the ClusterRole. For several volumeMounts, we added the setting mountPropagation: HostToContainer. Update your YAML configuration to be able to use PVCs for application logs.

Upgrade from version 5.9 to 5.10

Upgrade the application in Splunk and the collectorforopenshift image. The new dashboard Security/Objects(Pods) depends on streaming Pod objects from the API server. See the default configuration (section 004-addon.conf).

Upgrade from version 5.8 to 5.9

Upgrade the application in Splunk and the collectorforopenshift image. See the release notes for the new features (including the ability to stream API objects and support for multiple Splunk clusters).

Upgrade from version 5.7 to 5.8

Upgrade the application in Splunk and the collectorforopenshift image. The YAML configuration file now includes the critical pod annotation for OpenShift versions below 3.11, and a PriorityClass for OpenShift version 3.11 and above. See the configuration.

Upgrade from version 5.6 to 5.7

Upgrade the application in Splunk and the collectorforopenshift image. A new input, input.journald, has been implemented; see the configuration. If you have journald enabled and are also forwarding messages to the /var/log/messages or /var/log/syslog files, the same host logs could be forwarded twice. To avoid that, you can disable rsyslog (or any alternative) on the system and specify the timestamp from which Collectord should start picking up journald logs:

[input.journald]
startFromRel = -1h

To disable the journald input:

[input.journald]
disabled = true

To disable forwarding from the /var/log/messages or /var/log/syslog files, use:

[input.files::syslog]
disabled = true

Upgrade from version 5.5 to 5.6

Upgrade the application in Splunk and the collectorforopenshift image. There are a few new parts in the ConfigMap. You only need to add them to your configuration if you intend to use them.

  1. Under [general.kubernetes], the key includeAnnotations allows you to attach annotations, similar to labels, to the forwarded data. Unset by default.

  2. Under [input.files:*], the two new keys samplingPercent and samplingKey enable sampling.

  3. The output [output.splunk] can now limit by the number of events in a payload with the events key.
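
As a rough illustration of items 2 and 3 above, the fragment below shows where the new keys would go. The stanza and key names come from this document; the numeric values are illustrative assumptions only, not recommended or default settings. The samplingKey and includeAnnotations keys are left out because their value formats are not described here. The [input.files::syslog] stanza is used as the example only because it appears elsewhere in these instructions.

```ini
# Sketch only; values below are illustrative assumptions.
[input.files::syslog]
# Hypothetical value: forward roughly this percentage of log lines.
samplingPercent = 20

[output.splunk]
# Hypothetical value: cap on the number of events in one payload.
events = 50
```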

Upgrade from version 5.4 to 5.5

Upgrade the application in Splunk and the collectorforopenshift image. No additional configurations have been added.

Upgrade from version 5.3 to 5.4

Upgrade the application in Splunk and the collectorforopenshift image. No additional configurations have been added.

Upgrade from version 5.2 to 5.3

Version 5.3 is a minor upgrade. Simply upgrade the Splunk application and the image. The configuration file has one new key, group, under [input.net_socket_table], which can significantly reduce licensing costs for the network socket table data.

Upgrade from version 5.1 to 5.2

Version 5.2 is a minor upgrade that includes performance improvements, usability improvements, and the capability of forwarding Docker and Kubelet runtime storage metrics (one additional event per host once every 30 seconds). For more details, please read Release History.

Mount metrics are defined under input.mount_stats. If you override indexes for various types of data, make sure to update these metrics as well.

Additionally, we introduced the devnull output, which allows you to disable collection of logs or metrics for specific containers.

We moved prometheus_auto from the addon to the general configuration. This allows the Collectord pods running on the host network to collect metrics from pods that run on the host network, while the addon collects metrics from pods on the pod network.

With version 5.2, we predefined several alerts that can help you monitor the health of your clusters and the performance of your applications.

Upgrade from version 5.0 to 5.1

Version 5.1 is a minor upgrade that includes performance improvements, usability improvements, the capability of forwarding network metrics, and autodiscovering Prometheus metrics from pods. For more details, please read Release History.

We include two new types of metrics, defined under stanzas input.net_stats (network metrics) and input.net_socket_table (table of network connections). The addon includes the input.prometheus_auto stanza that defines auto-discovery for Prometheus metrics.

Upgrade from version 4 to 5

Upgrade Splunk application

Download version 5.0 from SplunkBase.

Upgrade collector

  1. We mount /var/lib/docker instead of /var/lib/docker/containers to be able to search for application logs.
  2. We added a new mount, /var/lib/origin/openshift.local.volumes/, for non-master nodes; it allows autodiscovery of application logs in volumes created with emptyDir.
  3. We added a new mount, /var/lib/origin/, for master nodes; it allows autodiscovery of audit logs and of application logs in volumes created with emptyDir.
  4. We added imagePullPolicy: Always and changed the versioning scheme to {major}.{minor}, where a {major} release can contain breaking changes and a {minor} release contains small updates. Patches for the base images are delivered under the same version.

Download the latest Configuration Reference and update your configuration with the changes from our configuration.

Update the deployed configuration:

oc apply -f collectorforopenshift.yaml
