Outcold Solutions - Monitoring Kubernetes, OpenShift and Docker in Splunk

Monitoring OpenShift

OpenShift Objects

Starting with version 5.9, you can stream all the changes from the Kubernetes API server to Splunk. This is useful if you want to monitor all changes to workloads or ConfigMaps in Splunk, or if you want to recreate a Kubernetes Dashboard experience in Splunk. With the default configuration, we don't forward any objects from the API Server except events.

Configuration

In the ConfigMap for collectorforopenshift, you can add additional sections to the 004-addon.conf configuration. In the example below, we forward all objects every 10 minutes (the refresh interval) and stream all changes immediately for Pods, Deployments, and ConfigMaps.

[input.kubernetes_watch::pods]

# disable this input
disabled = false

# set how often the watch request should refresh the whole list of objects
refresh = 10m

apiVersion = v1
kind = Pod
namespace =

# override type
type = openshift_objects

# specify Splunk index
index =

# set output (splunk or devnull, default is [general]defaultOutput)
output =

# exclude managed fields from the metadata
excludeManagedFields = true


[input.kubernetes_watch::deployments]

# disable this input
disabled = false

# set how often the watch request should refresh the whole list of objects
refresh = 10m

apiVersion = apps/v1
kind = Deployment
namespace =

# override type
type = openshift_objects

# specify Splunk index
index =

# set output (splunk or devnull, default is [general]defaultOutput)
output =

# exclude managed fields from the metadata
excludeManagedFields = true


[input.kubernetes_watch::configmap]

# disable this input
disabled = false

# set how often the watch request should refresh the whole list of objects
refresh = 10m

apiVersion = v1
kind = ConfigMap
namespace =

# override type
type = openshift_objects

# specify Splunk index
index =

# set output (splunk or devnull, default is [general]defaultOutput)
output =

# exclude managed fields from the metadata
excludeManagedFields = true

If you need to stream other kinds of objects, you can find the list of available objects in the Kubernetes API reference; you need to find the correct apiVersion and kind. Object kinds in the core group have the apiVersion v1; other groups, like apps, use the group/version form, for example apps/v1. You can also specify a namespace if you want to forward objects only from a specific namespace.
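For example, a minimal input that follows the same pattern and forwards Services from a single namespace could look like the sketch below (the namespace value is a placeholder; adjust it for your cluster):

[input.kubernetes_watch::services]
disabled = false
refresh = 10m
apiVersion = v1
kind = Service
# forward Services only from this namespace (placeholder value)
namespace = namespace1
type = openshift_objects
index =
output =
excludeManagedFields = true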

Modifying objects

Available in collectorforopenshift:5.19+

If you want to hide sensitive information or remove some properties from the objects before streaming them to Splunk, you can use the modifyValues. prefix in the ConfigMap configuration.

As an example, if you give Collectord the ability to query Secrets from the Kubernetes API (you need to add secrets under resources in the collectorforopenshift ClusterRole, see below), you can define a new input:

[input.kubernetes_watch::secrets]
disabled = false
refresh = 10m
apiVersion = v1
kind = Secret
namespace =
type = openshift_objects
index =
output =
excludeManagedFields = true
# hash all fields before sending them to Splunk
modifyValues.object.data.* = hash:sha256
# remove annotations like last-applied-configuration not to expose values by accident
modifyValues.object.metadata.annotations.kubectl* = remove

In that case, all the values for keys under object.data will be hashed, and annotations that start with kubectl will be removed (this is a special case to remove the kubectl.kubernetes.io/last-applied-configuration annotation, which can expose those secret values).

The syntax of modifyValues. is simple: everything that comes after it is a path with a simple glob pattern, where * can appear at the beginning or at the end of a path property. The value is a function: remove or hash:{hash_function}. The list of hash functions is the same as the one that can be applied with annotations.
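As an illustration of where the glob can appear (the annotation name below is hypothetical):

# glob at the end of the property: hash every key under object.data
modifyValues.object.data.* = hash:sha256
# glob at the beginning of the property: remove every annotation that ends with "config"
modifyValues.object.metadata.annotations.*config = remove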

Filtering objects

Available in collectorforopenshift:5.21+

If you want to filter objects based on namespaces, you can configure a blacklist or a whitelist under a specific input as follows:

[input.kubernetes_watch::pods]
# You can exclude objects by namespace with a blacklist, or whitelist only the required namespaces
# blacklist.kubernetes_namespace = ^namespace0$
# whitelist.kubernetes_namespace = ^((namespace1)|(namespace2))$

For example, you can tell Collectord to stream all the pods except those from the namespace0 namespace:

[input.kubernetes_watch::pods]
blacklist.kubernetes_namespace = ^namespace0$

Or stream only the pods from the namespace1 and namespace2 namespaces:

[input.kubernetes_watch::pods]
whitelist.kubernetes_namespace = ^((namespace1)|(namespace2))$

ClusterRole rules

Please check the collectorforopenshift.yaml configuration for the list of apiGroups and resources that Collectord has access to. In the example above, we request streaming of ConfigMaps, which our default configuration does not provide access to. To be able to stream this type of object, we need to add an additional resource to the ClusterRole:

apiVersion: v1
kind: ClusterRole
metadata:
  labels:
    app: collectorforopenshift
  name: collectorforopenshift
rules:
- apiGroups:
  - ""
  - apps
  - batch
  - extensions
  - monitoring.coreos.com
  - apps.openshift.io
  - build.openshift.io
  resources:
  - alertmanagers
  - buildconfigs
  - builds
  - cronjobs
  - daemonsets
  - deploymentconfigs
  - deployments
  - endpoints
  - events
  - jobs
  - namespaces
  - nodes
  - nodes/metrics
  - nodes/proxy
  - pods
  - prometheuses
  - replicasets
  - replicationcontrollers
  - scheduledjobs
  - services
  - statefulsets
  - configmaps
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get

Applying the changes

After you make changes to the ConfigMap and ClusterRole, recreate or restart the addon pod (the pod with a name similar to collectorforopenshift-addon-XXX). You can simply delete this pod, and the Deployment will recreate it for you.
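For example, assuming the default collectorforopenshift namespace from collectorforopenshift.yaml, the steps could look like this (the pod name is a placeholder; copy the actual name from the output of oc get pods):

oc apply -f collectorforopenshift.yaml
oc get pods -n collectorforopenshift
oc delete pod collectorforopenshift-addon-XXX -n collectorforopenshift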

Searching the data

With the configuration in the example above, Collectord resends all the objects every 10 minutes and streams all changes immediately. If you plan to run the join command or populate lookups, make sure that your search time range covers more than the refresh interval; for example, use 12 minutes.
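As a sketch, the following search populates a lookup with the latest version of every pod object, using a 12-minute window to cover the 10-minute refresh interval (the lookup file name is an example):

sourcetype="openshift_objects" source="/openshift//v1/pod" earliest=-12m |
spath output=uid path=object.metadata.uid |
stats latest(_raw) as _raw, latest(_time) as _time by uid, openshift_namespace |
spath output=pod_name path=object.metadata.name |
table openshift_namespace, pod_name |
outputlookup openshift_pods.csv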

The source name

By default, the source is in the format /openshift/{namespace}/{apiVersion}/{kind}, where namespace is the namespace used to make the request (from the configuration, not the actual namespace of the object), and apiVersion and kind also come from the configuration used to make the API request to the API Server.

Attached fields

With each event, we also attach the openshift_namespace and openshift_node_labels fields, which help you find objects from the right namespace or the right cluster (if the node labels include a cluster label).
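For example, to narrow a search to a single namespace with the attached field (namespace1 is a placeholder):

sourcetype="openshift_objects" openshift_namespace="namespace1"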

Event format

We forward objects wrapped in the watch object, which means that every event has an object field and a type field (ADDED, MODIFIED, or DELETED).
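As a minimal sketch, the following search counts the watch types for pod objects:

sourcetype="openshift_objects" source="/openshift//v1/pod" |
spath output=watch_type path=type |
stats count by watch_type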

Example. Searching the data

Considering that within the same time frame you can have the same object more than once (for example, if the object has been modified several times within 10 minutes), you need to group the objects by their unique identifier.

sourcetype="openshift_objects" source="/openshift//v1/pod" |
spath output=uid path=object.metadata.uid |
stats latest(_raw) as _raw, latest(_time) as _time by uid, openshift_namespace |
spath output=name path=object.metadata.name |
spath output=creationTimestamp path=object.metadata.creationTimestamp |
table openshift_namespace, name, creationTimestamp

Example. Extracting limits

A more complicated example shows how to extract container limits and requests (CPU, memory, and GPU):

sourcetype="openshift_objects" source="/openshift//v1/pod" |
spath output=uid path=object.metadata.uid |
stats latest(_raw) as _raw, latest(_time) as _time by uid, openshift_namespace |
spath output=pod_name path=object.metadata.name |
spath output=containers path=object.spec.containers{} | 
mvexpand containers |
spath output=container_name path=name input=containers  | 
spath output=limits_cpu path=resources.limits.cpu input=containers |
spath output=requests_cpu path=resources.requests.cpu input=containers |
spath output=limits_memory path=resources.limits.memory input=containers |
spath output=requests_memory path=resources.requests.memory input=containers |
spath output=limits_gpu path=resources.limits.nvidia.com/gpu input=containers |
spath output=requests_gpu path=resources.requests.nvidia.com/gpu input=containers |
table openshift_namespace, pod_name, container_name, limits_cpu, requests_cpu, limits_memory, requests_memory, limits_gpu, requests_gpu

About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift, and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use and easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution to help you keep all the metrics and logs in one place, allowing you to quickly address complex questions on container performance.