Outcold Solutions LLC

Monitoring OpenShift - Version 5

Installation

With our solution for Monitoring OpenShift, you can start monitoring your clusters in under 10 minutes, including forwarding metadata-enriched container logs, host logs, and metrics. The container image includes an evaluation license valid for 30 days after the first start.

Splunk configuration

Install Monitoring OpenShift application

Install the latest version of the Monitoring OpenShift application from Splunkbase. You need to install it on Search Heads only.
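
If you prefer the command line over Splunk Web, you can install the downloaded package with the Splunk CLI. This is a minimal sketch; the package path below is hypothetical and depends on where you saved the download.

$ $SPLUNK_HOME/bin/splunk install app /tmp/monitoring-openshift.tgz -update 1
$ $SPLUNK_HOME/bin/splunk restart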

Enable HTTP Event Collector in Splunk

Outcold Solutions' Collector sends data to Splunk using the HTTP Event Collector. By default, Splunk does not enable the HTTP Event Collector. Please read the HTTP Event Collector walkthrough to learn more about it.

The minimum requirement is Splunk Enterprise or Splunk Cloud 6.5. If you are managing Splunk clusters with a version below 6.5, please read our FAQ on how to set up a Heavy Weight Forwarder in between.

After enabling the HTTP Event Collector, you need to find the correct URL for it and generate an HTTP Event Collector token. If your Splunk instance runs on hostname hec.example.com, listens on port 8088 with SSL, and the token is B5A79AAD-D822-46CC-80D1-819F80D7BFB0, you can test it with the curl command as in the example below.

$ curl -k https://hec.example.com:8088/services/collector/event/1.0 -H "Authorization: Splunk B5A79AAD-D822-46CC-80D1-819F80D7BFB0" -d '{"event": "hello world"}'
{"text": "Success", "code": 0}

The -k flag is necessary for self-signed certificates.

If you use an index that is not searchable by default, please read our documentation on how to configure indices in Splunk and inside the collector at Splunk Indexes.

OpenShift preparation

To use our solution and get all its benefits, you will need to perform a few preparation steps on every OpenShift node in your cluster.

Docker logging driver

When you set up your OpenShift cluster, verify that Docker uses the json-file logging driver.

RHEL configures Docker with the journald logging driver by default. Depending on your Linux distribution, you can find this configuration in various places. In the case of the latest RHEL Server 7.5, you can find it under /etc/sysconfig/docker. Replace --log-driver=journald with --log-driver=json-file --log-opt max-size=10M --log-opt max-file=3. It is important to limit the size and number of the log files; see Managing Container Logs for details.

$ sudo sed -i 's/--log-driver=journald/--log-driver=json-file --log-opt max-size=10M --log-opt max-file=3/' /etc/sysconfig/docker
$ sudo systemctl restart docker
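
To confirm the change took effect, ask Docker which logging driver it is using; after the restart you should see json-file.

$ docker info --format '{{.LoggingDriver}}'
json-file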

If you are using the Red Hat Container Development Kit, minishift comes pre-configured with the journald logging driver. You can change it when you start minishift for the first time with minishift start --docker-opt log-driver=json-file.

Node Labels

The configuration provides two DaemonSet workloads, one for Master nodes and one for the other nodes. This configuration expects to see a node-role.kubernetes.io/master: "true" label on Master nodes. If you have deployed your OpenShift cluster with the Ansible scripts, you most likely have the correct labels. In the case of minishift, you need to add this label yourself.

$ oc edit node localhost

And add the label:

   labels:
     beta.kubernetes.io/arch: amd64
     beta.kubernetes.io/os: linux
     kubernetes.io/hostname: localhost
     node-role.kubernetes.io/master: "true"
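
Alternatively, you can apply the same label with a single command and verify it afterwards:

$ oc label node localhost node-role.kubernetes.io/master=true
$ oc get nodes --show-labels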

Syslog and host logs

The RHEL Server distribution might not include rsyslog by default. That means all host logs (including logs from OpenShift components) are stored only in the journal and are available only through journalctl. By default, journald can forward logs to a local syslog server. In most cases, you just need to install rsyslog, and after that you will see host logs under /var/log/messages.

$ sudo yum install rsyslog
$ sudo systemctl enable rsyslog
$ sudo systemctl start rsyslog

Verify that you can see syslog messages under /var/log/messages.

$ tail /var/log/messages

Installation

Verify that you are in the context of a user who can perform admin operations (the cluster-admin role).

$ oc login -u system:admin

Use the latest OpenShift configuration file collectorforopenshift.yaml. This configuration deploys multiple workloads under the collectorforopenshift namespace.

Open it in your favorite editor, set the Splunk HTTP Event Collector URL and token, configure a certificate if required, and review and accept the license agreement.

[general]

acceptEULA = false

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url =

# Splunk HTTP Event Collector Token
token =

# Allow invalid SSL server certificate
insecure = false

# Path to CA certificate
caPath =

# CA Name to verify
caName =

Based on the example above, you will need to modify the lines as follows.

[general]

acceptEULA = true

...

# Splunk output
[output.splunk]

# Splunk HTTP Event Collector url
url = https://hec.example.com:8088/services/collector/event/1.0

# Splunk HTTP Event Collector Token
token = B5A79AAD-D822-46CC-80D1-819F80D7BFB0

# Allow invalid SSL server certificate
insecure = true

Apply this configuration to your OpenShift cluster with oc.

$ oc apply -f ./collectorforopenshift.yaml

After that, you need to add the privileged security context to the Service Account we use for the collector. Our workloads need access to host mounts and must run in privileged mode to make some of the syscalls required to collect metrics.

$ oc adm policy add-scc-to-user privileged system:serviceaccount:collectorforopenshift:collectorforopenshift

If you see the error message the server could not find the requested resource, it is possible that the version of your oc tool does not match the server version. You can accomplish the same by running oc edit securitycontextconstraints privileged and adding system:serviceaccount:collectorforopenshift:collectorforopenshift to the list of users.
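
After that edit, the users section of the privileged SecurityContextConstraints should look similar to the following sketch; the existing entries will vary between clusters.

users:
- system:admin
- system:serviceaccount:collectorforopenshift:collectorforopenshift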

If you are using Red Hat certified images from registry.connect.redhat.com, make sure to specify the secret for pulling the image. See instructions on the Configuration Reference page.

Verify the workloads.

$ oc get all --namespace collectorforopenshift

If collectorforopenshift Pods aren't deployed, follow the Troubleshooting steps.
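
A quick way to diagnose the deployment is to list the pods and inspect one of them; replace <pod-name> with a name from your cluster.

$ oc get pods --namespace collectorforopenshift
$ oc describe pod <pod-name> --namespace collectorforopenshift
$ oc logs <pod-name> --namespace collectorforopenshift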

Give it a few moments to download the image and start the containers. After all the pods are deployed, go to the Monitoring OpenShift application in Splunk, and you should see data on the dashboards.

By default, the collector forwards container logs, host logs (including syslog), and metrics for hosts, pods, containers, and processes.

Next steps

  • Integrate the Web Console with the Monitoring OpenShift application.
  • Review predefined alerts.
  • Verify configuration by using our troubleshooting instructions.
  • Enable Audit Logs. By default, OpenShift does not enable audit logs. If you want to be able to audit activities on the OpenShift API Server, you need to enable Audit Logs manually.
  • Verify Prometheus Metrics. Our configuration works out of the box in most cases. If you find that some data for the Control Plane is not available, verify that you get all the Prometheus metrics and that all our configurations work in your cluster.
  • To learn how to forward application logs, please read our documentation on annotations.
  • We send the data to the default HTTP Event Collector index. For better performance, we recommend at least splitting logs and metrics into separate indices. You can find how to configure indices in our guide Splunk Indices.
  • We provide a flexible schema that allows you to define search-time field extraction for logs in your containers. Follow the guide Splunk fields extraction for container logs to learn more.
  • With annotations for pods, you can define specific patterns for multi-line log events; override indices, sources, and source types for logs and metrics; extract fields; redirect some log lines to /dev/null; and hide sensitive information in logs.

About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift, and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use, easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution that keeps all your metrics and logs in one place, allowing you to quickly address complex questions on container performance.