Monitoring Amazon EKS with Splunk Enterprise and Splunk Cloud
Congratulations to the AWS team for shipping such a great product. Based on data provided by the CNCF, more than half of all companies that run Kubernetes choose to do so on AWS. Managing the control plane is not the most straightforward task, and EKS does that for you. The only thing left to you is to bootstrap the worker nodes and run your applications.
Amazon Elastic Container Service for Kubernetes (Amazon EKS) is a managed service that makes it easy for you to run Kubernetes on AWS without needing to stand up or maintain your own Kubernetes control plane.
We are proud to announce that our solution for Monitoring Kubernetes works with Amazon EKS from day one.
To get started, follow the Installation instructions and use the appropriate configuration for your specific version of Kubernetes. At the moment, only Kubernetes version 1.10 can be deployed on EKS.
In our example, we deployed EKS and Splunk in the same Region and the same VPC, but there are no special requirements for your Splunk Enterprise deployment. You can also use Splunk Cloud with our solution. The only requirement is to give the EKS cluster access to the Splunk HTTP Event Collector endpoint, which usually listens on port 8088.
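Before deploying the collector, it can help to confirm that the HTTP Event Collector endpoint is reachable from your network. A minimal sketch in Python, assuming a hypothetical host name (`splunk.example.com`) and the default HEC port 8088:

```python
# Sketch: verify that the Splunk HTTP Event Collector (HEC) endpoint is
# reachable. The host name below is a placeholder; 8088 is the default HEC port.
import socket


def hec_reachable(host: str, port: int = 8088, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the HEC endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Replace with the host of your Splunk Enterprise or Splunk Cloud deployment.
    print(hec_reachable("splunk.example.com", 8088))
```

This only checks network reachability (security groups, VPC routing); a full check would also POST a test event with your HEC token.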

After performing all the steps from the Installation instructions, you will see that the DaemonSet for worker nodes schedules Pods with our collectord on every worker node, and one addon Pod is deployed for collecting Kubernetes events. Because you don’t have access to the master nodes, you can delete the DaemonSet for masters or safely ignore it.
With the default configuration, you will get metrics from the worker nodes: detailed metrics for the nodes, pods, containers, and processes. Container and host logs are automatically forwarded as well.

From the control plane, you will be able to see the Kubelet metrics in the application.

You will be able to review network metrics.

And monitor PVC and instance storage usage.

We have over 30 pre-built alerts that will highlight issues with your deployments and the workloads you are running.

All other cluster information will be unavailable because you don’t have access to the metrics of the Scheduler, etcd, and the Controller Manager. But you can still collect metrics from the API Server. By default, our configuration expects every collector on the master nodes to collect metrics from the Kubernetes API processes. Because in the case of EKS you don’t have access to the master nodes, you can schedule collection of the Kubernetes API metrics from the addon instead.
In our configuration file, find the ConfigMap section with the file definition for the addon, 004-addon.conf, and
add the [input.prometheus::kubernetes-api] section as in the example below.

004-addon.conf: |
  [general]

  ...

  [input.prometheus::kubernetes-api]

  # enable collection of prometheus kubernetes-api metrics
  disabled = false

  # override type
  type = prometheus

  # specify Splunk index
  index =

  # override host
  host = kubernetes-eks-api-server

  # override source
  source = kubernetes-api

  # how often to collect prometheus metrics
  interval = 60s

  # prometheus endpoint
  endpoint.kubeapi = https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}/metrics

  # token for "Authorization: Bearer $(cat tokenPath)"
  tokenPath = /var/run/secrets/kubernetes.io/serviceaccount/token

  # server certificate for certificate validation
  certPath = /var/run/secrets/kubernetes.io/serviceaccount/ca.crt

  # client certificate for authentication
  clientCertPath =

  # allow invalid SSL server certificate
  insecure = true

  # include metrics help with the events
  includeHelp = false
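For reference, the scrape that this section configures can be sketched in a few lines of Python. The token and CA paths are the standard in-cluster service-account paths used in the config; the helper names are ours:

```python
# Sketch of what the addon's kubernetes-api input does: request the API
# server's Prometheus /metrics endpoint using the pod's service-account token.
import os
import ssl
import urllib.request

# Standard in-cluster service-account paths (same as tokenPath/certPath above).
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def build_metrics_request(token_path: str = TOKEN_PATH) -> urllib.request.Request:
    """Build the authenticated request for the API server's /metrics endpoint."""
    # Same in-cluster variables the endpoint.kubeapi setting interpolates.
    host = os.environ["KUBERNETES_SERVICE_HOST"]
    port = os.environ["KUBERNETES_SERVICE_PORT"]
    with open(token_path) as f:
        token = f.read().strip()
    req = urllib.request.Request(f"https://{host}:{port}/metrics")
    # Matches the config: "Authorization: Bearer $(cat tokenPath)"
    req.add_header("Authorization", f"Bearer {token}")
    return req


def scrape_metrics(ca_path: str = CA_PATH) -> str:
    """Fetch the Prometheus metrics text, trusting the service-account CA.

    Note: the config sets insecure = true because the server certificate may
    not match the in-cluster address; this sketch validates against the CA.
    """
    ctx = ssl.create_default_context(cafile=ca_path)
    with urllib.request.urlopen(build_metrics_request(), context=ctx) as resp:
        return resp.read().decode("utf-8")
```

Running this inside a pod with a service account that can read /metrics should return the same Prometheus text the collector forwards to Splunk.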
After that, restart the addon pod. Find the pod id:

$ kubectl get pods --namespace collectorforkubernetes
NAME                                            READY     STATUS    RESTARTS   AGE
collectorforkubernetes-addon-546bd58878-4qk44   1/1       Running   0          48m
collectorforkubernetes-g2wbg                    1/1       Running   0          55m
collectorforkubernetes-gwdg5                    1/1       Running   0          55m
collectorforkubernetes-rsh44                    1/1       Running   0          55m
And delete the addon pod with:

$ kubectl delete pod collectorforkubernetes-addon-546bd58878-4qk44 --namespace collectorforkubernetes
pod "collectorforkubernetes-addon-546bd58878-4qk44" deleted
A new pod will be scheduled with the updated configuration. In a few minutes, you should be able to see Kubernetes API metrics in our application.

Links
If you are getting errors when trying to access the API from the CLI, like
error: the server doesn't have a resource type "cronjobs" or error: You must be logged in to the server (Unauthorized), check the article Common errors when setting up EKS for the first time. You need to be sure that you are creating the EKS cluster with the same IAM user or role that is going to access the API. In our case, we were using MFA for managing temporary sessions, which caused errors similar to those described above.