Outcold Solutions - Monitoring Kubernetes, OpenShift and Docker in Splunk

Monitoring Amazon Elastic Container Service (ECS) Clusters with Splunk Enterprise and Splunk Cloud

[UPDATE (2018-06-15)] Following the Amazon ECS Adds Daemon Scheduling announcement, we updated this blog post to show how you can schedule collectord on ECS using the new Daemon scheduling strategy.

[UPDATE (2018-10-15)] Updated for Monitoring Docker v5.2.

Amazon Elastic Container Service (ECS) is a highly scalable, high-performance container management service that supports Docker containers and allows you to easily run applications on a managed cluster of Amazon EC2 instances.

Because ECS uses Docker as its container engine, our Monitoring Docker solution works with it out of the box.

In our example, we used ECS and Splunk deployed in the same Region and the same VPC. But there are no special requirements for your Splunk Enterprise deployment. You can also use Splunk Cloud with our solution. The only requirement is to give the ECS cluster access to the Splunk HTTP Event Collector endpoint, which is usually deployed on port 8088.
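Before deploying anything, it is worth confirming that the cluster can actually reach the HTTP Event Collector endpoint. A minimal sketch, assuming a hypothetical host `hec.example.com` (substitute your own Splunk host):

```shell
# Hypothetical values - substitute your own Splunk host and HEC port.
SPLUNK_HOST="hec.example.com"
HEC_PORT="8088"

# This is the endpoint the collector will write to (see the task
# definition below).
HEC_URL="https://${SPLUNK_HOST}:${HEC_PORT}/services/collector/event/1.0"
echo "${HEC_URL}"

# From an instance inside the ECS cluster you can verify connectivity
# against HEC's health endpoint (-k skips certificate validation,
# matching the COLLECTOR__SPLUNK_INSECURE setting used below):
#   curl -k "https://${SPLUNK_HOST}:${HEC_PORT}/services/collector/health"
```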

ECS in AWS

We expect that you have already finished the Splunk configuration from our manual Monitoring Docker Installation, installed our application Monitoring Docker, and enabled HTTP Event Collector in your Splunk environment.

First, you need to create a new Task Definition.

ECS task definition

At the very bottom, you can find the Configure JSON button.

Configure JSON

Use the following template to create a Task Definition. At a minimum, you need to set the URL of your HTTP Event Collector, specify your HTTP Event Collector token, and include the license key (request an evaluation license key with this automated form). You can also revisit the memory and CPU limits for your load; you can adjust them later.

{
    "containerDefinitions": [
        {
            "name": "collectorfordocker",
            "image": "outcoldsolutions/collectorfordocker:{{ collectorfordocker_version }}",
            "memory": "256",
            "cpu": "256",
            "essential": true,
            "portMappings": [],
            "environment": [
                {
                    "name": "COLLECTOR__SPLUNK_URL",
                    "value": "output.splunk__url=https://hec.example.com:8088/services/collector/event/1.0"
                },
                {
                    "name": "COLLECTOR__SPLUNK_TOKEN",
                    "value": "output.splunk__token=B5A79AAD-D822-46CC-80D1-819F80D7BFB0"
                },
                {
                    "name": "COLLECTOR__SPLUNK_INSECURE",
                    "value": "output.splunk__insecure=true"
                },
                {
                    "name": "COLLECTOR__ACCEPTLICENSE",
                    "value": "general__acceptLicense=true"
                },
                {
                    "name": "COLLECTOR__LICENSE",
                    "value": "general__license=..."
                },
                {
                   "name": "COLLECTOR__EC2_INSTANCE_ID",
                   "value": "general__ec2Metadata.ec2_instance_id=/latest/meta-data/instance-id"
                },
                {
                   "name": "COLLECTOR__EC2_INSTANCE_TYPE",
                   "value": "general__ec2Metadata.ec2_instance_type=/latest/meta-data/instance-type"
                }
            ],
            "mountPoints": [
                {
                    "sourceVolume": "cgroup",
                    "containerPath": "/rootfs/sys/fs/cgroup",
                    "readOnly": true
                },
                {
                    "sourceVolume": "proc",
                    "containerPath": "/rootfs/proc",
                    "readOnly": true
                },
                {
                    "sourceVolume": "var_log",
                    "containerPath": "/rootfs/var/log",
                    "readOnly": true
                },
                {
                    "sourceVolume": "var_lib_docker_containers",
                    "containerPath": "/rootfs/var/lib/docker/",
                    "readOnly": true
                },
                {
                    "sourceVolume": "docker_socket",
                    "containerPath": "/rootfs/var/run/docker.sock",
                    "readOnly": true
                },
                {
                    "sourceVolume": "collector_data",
                    "containerPath": "/data",
                    "readOnly": false
                }
            ],
            "volumesFrom": null,
            "hostname": null,
            "user": null,
            "workingDirectory": null,
            "privileged": true,
            "readonlyRootFilesystem": true,
            "extraHosts": null,
            "logConfiguration": null,
            "ulimits": null,
            "dockerLabels": null,
            "logConfiguration": {
                "logDriver": "json-file",
                "options": {
                    "max-size": "1m",
                    "max-file": "3"
                }
            }
        }
    ],
    "volumes": [
        {
            "name": "cgroup",
            "host": {
                "sourcePath": "/cgroup"
            }
        },
        {
            "name": "proc",
            "host": {
                "sourcePath": "/proc"
            }
        },
        {
            "name": "var_log",
            "host": {
                "sourcePath": "/var/log"
            }
        },
        {
            "name": "var_lib_docker_containers",
            "host": {
                "sourcePath": "/var/lib/docker/"
            }
        },
        {
            "name": "docker_socket",
            "host": {
                "sourcePath": "/var/run/docker.sock"
            }
        },
        {
            "name": "collector_data",
            "host": {
                "sourcePath": "/var/lib/collectorfordocker/data/"
            }
        }
    ],
    "networkMode": null,
    "memory": "256",
    "cpu": "0.5 vcpu",
    "placementConstraints": [],
    "family": "collectorfordocker",
    "taskRoleArn": ""
}
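Note the pattern in the environment values above: each `COLLECTOR__*` value carries `section__key=value`, where the part before the double underscore names a configuration section (for example `output.splunk`) and the part after it is the key being set. A small shell sketch splitting one of the values from the template to show the parts:

```shell
# One of the values from the task definition above, split into its
# section, key, and value parts ("section__key=value" convention).
VALUE="output.splunk__url=https://hec.example.com:8088/services/collector/event/1.0"

SECTION="${VALUE%%__*}"        # part before the double underscore
KEY_AND_VALUE="${VALUE#*__}"   # part after it
KEY="${KEY_AND_VALUE%%=*}"     # key name before "="

echo "section=${SECTION} key=${KEY}"
```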

After saving it, you can use this Task Definition on your ECS clusters.
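If you prefer the AWS CLI to the console, the same JSON can be registered directly. A sketch, assuming the template above was saved locally as `collectorfordocker.json` (the file name is an assumption):

```shell
# Assumes the task definition JSON above was saved locally as
# collectorfordocker.json and that the AWS CLI is configured for your
# account and region.
TASK_DEF_FILE="collectorfordocker.json"

register_task_definition() {
    aws ecs register-task-definition \
        --cli-input-json "file://${TASK_DEF_FILE}"
}

# Call register_task_definition when you are ready; it is not invoked
# here, so this sketch has no side effects.
echo "task definition file: ${TASK_DEF_FILE}"
```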

The next step is to schedule this Task on every host in the cluster. Open your cluster and go to the Services tab.

Create a service

Create a new service with the following configuration:

  • Launch type - EC2
  • Task Definition - choose the just-created task definition collectorfordocker
  • Cluster - the name of your ECS cluster
  • Service name - collectorfordocker
  • [UPDATE (2018-06-15)] Service Type - DAEMON
  • Minimum healthy percent - keep defaults (we aren’t going to use them)
  • On the second step, choose: Load balancer type - None
  • On the third step: Service Auto Scaling - Do not adjust the service’s desired count

Create a service

After creating the service, give it a minute to download the image and set up our collectord. If everything works as expected, you should see data in the Monitoring Docker application.
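The console steps above map to a single AWS CLI call. A sketch, with a placeholder cluster name:

```shell
# Placeholder cluster name - replace with yours.
CLUSTER="my-ecs-cluster"

create_collector_service() {
    # DAEMON scheduling runs exactly one copy of the task on each
    # container instance in the cluster, so no desired count, load
    # balancer, or auto scaling needs to be configured.
    aws ecs create-service \
        --cluster "${CLUSTER}" \
        --service-name collectorfordocker \
        --task-definition collectorfordocker \
        --scheduling-strategy DAEMON
}

# The function is defined but not invoked here, so the sketch has no
# side effects.
echo "cluster: ${CLUSTER}"
```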


Known issues

  • By default, collectord picks up Docker daemon logs from /rootfs/var/log/docker. You need to update the macro macro_docker_host_logs_docker and change it to
(`macro_docker_host_logs` AND source="*var/log/docker*")

About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift, and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring with easy-to-use, easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer one solution that keeps all your metrics and logs in one place, allowing you to quickly address complex questions on container performance.