Outcold Solutions LLC

Monitoring Kubernetes - Version 5

Troubleshooting

For OpenShift, use the oc tool instead of kubectl and the namespace collectorforopenshift-syslog.

Verify configuration

Get the list of the pods

$ kubectl get pods -n collectorforkubernetes-syslog
NAME                                                   READY     STATUS    RESTARTS   AGE
collectorforkubernetes-syslog-addon-857fccb8b9-t9qgq   1/1       Running   1          1h
collectorforkubernetes-syslog-master-bwmwr             1/1       Running   0          1h
collectorforkubernetes-syslog-xbnaa                    1/1       Running   0          1h

There are three workloads: a DaemonSet deployed on master nodes (collectorforkubernetes-syslog-master), a DaemonSet deployed on non-master nodes (collectorforkubernetes-syslog), and one addon Deployment (collectorforkubernetes-syslog-addon). Verify one pod from each workload (in the example below, change the pod names to the pods that are running on your cluster).

$ kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-addon-857fccb8b9-t9qgq -- /collectord verify
$ kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-master-bwmwr -- /collectord verify
$ kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-xbnaa -- /collectord verify
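
If copying pod names by hand is inconvenient, the same check can be scripted. The following is a minimal sketch, not part of the product: the function name verify_all_pods and the NAMESPACE variable are hypothetical, and it runs verify in every pod of the namespace, which is a superset of one pod per workload.

```shell
#!/bin/sh
# Sketch: run `/collectord verify` in every pod of the collectord namespace.
# NAMESPACE is a hypothetical variable; it defaults to the namespace used on this page.
NAMESPACE="${NAMESPACE:-collectorforkubernetes-syslog}"

verify_all_pods() {
  # `kubectl get pods -o name` prints one `pod/<name>` entry per line.
  for pod in $(kubectl get pods -n "$NAMESPACE" -o name); do
    echo "=== ${pod#pod/} ==="
    kubectl exec -n "$NAMESPACE" "${pod#pod/}" -- /collectord verify
  done
}

# Usage: verify_all_pods
```

On OpenShift, the same loop should work with oc in place of kubectl and NAMESPACE=collectorforopenshift-syslog.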


For each command you will see output similar to the following:

Version = 5.2.176
Build date = 181012
Environment = kubernetes


  General:
  + conf: OK
  + db: OK
  + db-meta: OK
  + instanceID: OK
    instanceID = 2LEKCFD4KT4MUBIAQSUG7GRSAG
  + license load: OK
    trial
  + license expiration: OK
    license expires 2018-11-12 15:51:18.200772266 -0500 EST
  + license connection: OK

  Kubernetes configuration:
  + api: OK
  + pod cgroup: OK
    pods = 18
  + container cgroup: OK
    containers = 39
  + volumes root: OK
  + runtime: OK
    docker

  Docker configuration:
  + connect: OK
    containers = 43
  + path: OK
  + cgroup: OK
    containers = 40
  + files: OK

  CRI-O configuration:
  - ignored: OK
    kubernetes uses other container runtime

  File Inputs:
  x input(syslog): FAILED
    no matches
  + input(logs): OK
    path /rootfs/var/log/

Errors: 1

The number of errors is reported at the end. This example shows output from minikube, which reports an invalid configuration:

  • input(syslog) - minikube does not persist syslog output to disk, so these logs will not be available in the application
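
When the verify output is long, it can help to pull out only the failed checks. The sketch below assumes the output format matches the sample above (failed checks prefixed with "x" and a trailing "Errors: N" line); the function names failed_checks and error_count are hypothetical helpers, not part of collectord.

```shell
#!/bin/sh
# Sketch: summarize saved `/collectord verify` output read from stdin.
# Assumes the format of the sample output on this page.

# Print the checks marked with "x", the marker the sample uses for FAILED.
failed_checks() {
  grep -E '^[[:space:]]*x .*FAILED' | sed -E 's/^[[:space:]]*x //'
}

# Print the number from the trailing "Errors: N" line.
error_count() {
  sed -n 's/^Errors: \([0-9][0-9]*\)$/\1/p'
}

# Usage (verify.log is a file holding saved verify output):
#   failed_checks < verify.log
#   error_count < verify.log
```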

If you find an error in the configuration, fix it and apply the change with kubectl apply -f ./collectorforkubernetes-syslog.yaml. You then need to recreate the pods. To do that, delete all pods in the collectord namespace with kubectl delete pods --all -n collectorforkubernetes-syslog. The workloads will recreate them.
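
The apply-and-recreate sequence can be wrapped in one function. This is a sketch under assumptions: the function name reapply_config and the NAMESPACE variable are hypothetical, and the namespace is assumed to be collectorforkubernetes-syslog, as in the other commands on this page.

```shell
#!/bin/sh
# Sketch: apply the edited configuration and recreate the collectord pods.
# NAMESPACE is a hypothetical variable; adjust it to your installation.
NAMESPACE="${NAMESPACE:-collectorforkubernetes-syslog}"

reapply_config() {
  # $1: path to the manifest, for example ./collectorforkubernetes-syslog.yaml
  kubectl apply -f "$1" || return 1
  # The DaemonSets and the Deployment recreate the deleted pods.
  kubectl delete pods --all -n "$NAMESPACE" || return 1
  # List the pods to check that they come back as Running.
  kubectl get pods -n "$NAMESPACE"
}

# Usage: reapply_config ./collectorforkubernetes-syslog.yaml
```

New pods may take a short while to reach Running, so the final listing may need to be repeated.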

Describe command

When you apply annotations at the namespace, workload, configuration, and pod levels, it can be hard to track which annotations are applied to a given Pod or Container. You can use the collectord describe command to see which annotations are used for a specific Pod. You can run this command from any collectord Pod on the cluster.

kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-master-4gjmc -- /collectord describe --namespace default --pod postgres-pod --container postgres
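
Because any collectord Pod can run the command, the pod lookup can be automated. The sketch below is an illustration only: describe_annotations and NAMESPACE are hypothetical names, and the function simply picks the first pod that kubectl lists in the collectord namespace.

```shell
#!/bin/sh
# Sketch: run `/collectord describe` from the first collectord pod found.
# NAMESPACE is a hypothetical variable; it defaults to the namespace used on this page.
NAMESPACE="${NAMESPACE:-collectorforkubernetes-syslog}"

describe_annotations() {
  # $1: namespace, $2: pod, $3: container of the workload to inspect
  collectord_pod=$(kubectl get pods -n "$NAMESPACE" -o name | head -n 1)
  if [ -z "$collectord_pod" ]; then
    echo "no collectord pods found in $NAMESPACE" >&2
    return 1
  fi
  kubectl exec -n "$NAMESPACE" "${collectord_pod#pod/}" -- \
    /collectord describe --namespace "$1" --pod "$2" --container "$3"
}

# Usage: describe_annotations default postgres-pod postgres
```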

Collect diagnostic information

If you need to open a support case, you can collect diagnostic information, including performance, metrics, and configuration (excluding the Splunk URL and token).

1. Collect diagnostic information

Choose the pod from which you want to collect diagnostic information.

The following command takes several minutes.

kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-master-bwmwr -- /collectord diag --stream 1>diag.tar.gz

You can extract the tar archive to verify the information that we collect. It includes information about performance, memory usage, basic telemetry metrics, a file with the host's Linux version, and basic information about the license.
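
To review the archive before sending it, it can be listed and extracted with standard tar options. This sketch assumes the diag output is a gzip-compressed tar archive, as the diag.tar.gz file name suggests; inspect_diag is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: list and extract a diag archive for review.
# Assumes a gzip-compressed tar archive (suggested by the .tar.gz name).

inspect_diag() {
  # $1: archive (for example diag.tar.gz), $2: directory to extract into
  tar -tzf "$1" || return 1   # list the archive contents
  mkdir -p "$2" || return 1
  tar -xzf "$1" -C "$2"       # extract the files for closer inspection
}

# Usage: inspect_diag diag.tar.gz ./diag
```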

2. Collect logs

kubectl logs -n collectorforkubernetes-syslog --timestamps collectorforkubernetes-syslog-master-bwmwr  1>collectorforkubernetes.log 2>&1

3. Run verify

kubectl exec -n collectorforkubernetes-syslog collectorforkubernetes-syslog-master-bwmwr -- /collectord verify > verify.log

4. Prepare tar archive

tar -czvf collectorforkubernetes-$(date +%s).tar.gz verify.log collectorforkubernetes.log diag.tar.gz
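
The four steps above can be combined into one function. This is a sketch rather than an official tool: collect_support_bundle and NAMESPACE are hypothetical names, and the commands and file names are taken from the steps above.

```shell
#!/bin/sh
# Sketch: run steps 1-4 and produce one support archive in the current directory.
# NAMESPACE is a hypothetical variable; it defaults to the namespace used on this page.
NAMESPACE="${NAMESPACE:-collectorforkubernetes-syslog}"

collect_support_bundle() {
  # $1: collectord pod to collect from
  # Step 1: diagnostics (this can take several minutes).
  kubectl exec -n "$NAMESPACE" "$1" -- /collectord diag --stream > diag.tar.gz || return 1
  # Step 2: pod logs with timestamps, stdout and stderr combined.
  kubectl logs -n "$NAMESPACE" --timestamps "$1" > collectorforkubernetes.log 2>&1
  # Step 3: verify output.
  kubectl exec -n "$NAMESPACE" "$1" -- /collectord verify > verify.log
  # Step 4: bundle everything; the archive name carries a Unix timestamp.
  bundle="collectorforkubernetes-$(date +%s).tar.gz"
  tar -czf "$bundle" verify.log collectorforkubernetes.log diag.tar.gz || return 1
  echo "$bundle"
}

# Usage: collect_support_bundle collectorforkubernetes-syslog-master-bwmwr
```

The function prints the name of the archive it created, so the result can be captured or attached to the support case directly.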

About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications, which give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing solutions for Linux and Windows containers that are easy to use and deploy. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution to help you keep all the metrics and logs in one place, allowing you to quickly address complex questions on container performance.