
Timestamps in container logs

It is essential to know when an application generated a specific log message. Timestamps let us order events, and they let us correlate events across systems.

That is why logging libraries write timestamps alongside log messages. When you send log messages to systems like journald or Splunk, every event gets a timestamp: these systems either parse it from the message or take it from the system clock. When you write logs to plain text files, you lose the information about when the application created each message. That is why you configure loggers to prefix every line with a timestamp.

Let’s look at an example: a tiny Python snippet that logs a warning when a response does not have a 200 status.

import logging

logging.basicConfig(format="%(asctime)s - %(levelname)s - %(message)s")


def handle_request(request):
    if request.statusCode != 200:
        logging.warning('not OK response')

    # handle request

If we handle a response without an OK status, we should see a message like the one below.

2018-08-09 21:53:42,792 - WARNING - not OK response

What can this log message tell us? That around 2018-08-09 21:53:42,792 we had one response that did not have a 200 status code. It is not the exact time of the response: the function may have handled the response a second after it actually arrived, or the network may have been so slow that the whole request took 10 seconds to complete.

Another important detail: when we write a log message, we do not inject the timestamp ourselves; the logger does it for us. Depending on how busy the system executing this code is, the timestamp can be a few nano- or microseconds off from the moment the message was created.
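
To make this concrete, here is a minimal sketch (the CaptureHandler helper and the variable names are ours, not part of the logging module) that compares the caller's clock reading with the timestamp the logger assigns to the record:

import logging
import time

captured = {}

class CaptureHandler(logging.Handler):
    # Hypothetical helper: remember the timestamp the logger assigned to the record.
    def emit(self, record):
        captured["logger_ts"] = record.created

logging.getLogger().addHandler(CaptureHandler())

caller_ts = time.time()             # the moment our code decides to log
logging.warning("not OK response")

# record.created is set inside LogRecord.__init__, a little later than caller_ts.
print("offset: %.1f microseconds" % ((captured["logger_ts"] - caller_ts) * 1e6))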

Logging libraries are very similar in how they work. Below we will look at how the Python logging module emits log messages.

If we look at the source code of the logging module, we can find where the timestamp is generated.

def __init__(self, name, level, pathname, lineno,
             msg, args, exc_info, func=None, sinfo=None, **kwargs):
    """
    Initialize a logging record with interesting information.
    """
    ct = time.time()
    self.name = name
    self.msg = msg

Based on the format string, the logger renders this timestamp in human-readable form.
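
A short sketch of that conversion (constructing a LogRecord by hand purely for illustration): record.created holds the raw time.time() value, and the Formatter turns it into the %(asctime)s field.

import logging

# Build a record by hand and format it the way a handler would.
record = logging.LogRecord("example", logging.WARNING, __file__, 0,
                           "not OK response", None, None)
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")

print(record.created)              # seconds since the epoch, e.g. 1533851622.792
print(formatter.format(record))    # e.g. 2018-08-09 21:53:42,792 - WARNING - not OK response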

Another important detail about logging libraries: they flush the stream after emitting each message. The Python logging module is no exception.

msg = self.format(record)
stream = self.stream
stream.write(msg)
stream.write(self.terminator)
self.flush()

The flush call is very important for logging libraries. By default, the standard output and error streams are buffered, and the buffering mode depends on what the stream is attached to: output to a terminal is typically line-buffered, while output to a pipe or a file is fully buffered. Without flushing, your messages could be stuck in the buffer until the buffer fills up or the stream is closed. I recommend looking at the stdio and fflush man pages to learn more.
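
Here is a small sketch to observe this behavior, assuming Python 3 on a Unix-like shell. Save it to a file and run it piped (for example, python3 example.py | cat), so stdout is attached to a pipe rather than a terminal:

import sys
import time

# With stdout attached to a pipe, this write sits in Python's output buffer;
# `cat` shows nothing during the sleep below.
sys.stdout.write("buffered message\n")
time.sleep(5)

# An explicit flush pushes the buffered data to the reader right away,
# which is exactly what logging handlers do after every record.
sys.stdout.flush()
time.sleep(5)

sys.stdout.write("final message\n")   # flushed automatically when the program exits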

Let’s summarize what we have learned so far:

  • Timestamps are essential in logs. They provide order and help correlate events.
  • Timestamps might not represent the exact moment when application code writes the log message.
  • Logging libraries flush after every record instead of keeping it in a buffer, so messages become available to readers almost instantly.

It is time to look at how Docker handles container logs. In the example below, we use Docker, but other container runtimes like CRI-O have similar implementations.

Let’s generate some Docker container logs by running a short-lived container with docker run.

$ docker run --name python_example python:3 python -c 'import logging; logging.basicConfig(format="%(asctime)s - %(levelname)s - %(message)s"); logging.warning("not OK response");'
2018-08-10 17:59:54,582 - WARNING - not OK response

It is a short-lived container, but we did not remove it after it exited, so we can still look at its logs.

$ docker logs python_example
2018-08-10 18:00:56,086 - WARNING - not OK response

We can also use the --timestamps option with the docker logs command.

$ docker logs --timestamps python_example
2018-08-10T18:00:56.086989400Z 2018-08-10 18:00:56,086 - WARNING - not OK response

As you can see, every message in Docker logs also has a timestamp. If we round the timestamp generated by Docker to milliseconds, it is about one millisecond later than the timestamp written by the logger (2018-08-10T18:00:56.086989400Z ≈ 2018-08-10T18:00:56.087).
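
We can compute that difference directly. A small sketch, truncating Docker's nanoseconds to microseconds because Python's strptime parses at most six fractional digits; keep in mind the logger's milliseconds are also truncated, so the real gap is at most this large:

from datetime import datetime

docker_ts = datetime.strptime("2018-08-10T18:00:56.086989", "%Y-%m-%dT%H:%M:%S.%f")
logger_ts = datetime.strptime("2018-08-10 18:00:56,086", "%Y-%m-%d %H:%M:%S,%f")

delta_ms = (docker_ts - logger_ts).total_seconds() * 1000
print("Docker stamped the line %.3f ms after the logger" % delta_ms)   # ~0.989 ms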

Docker generates this timestamp with the Copier.

if msg.PLogMetaData == nil {
    msg.Timestamp = time.Now().UTC()
} else {
    msg.Timestamp = partialTS
}

Docker attaches the Copier to the container’s standard output and error pipes.

copier := logger.NewCopier(map[string]io.Reader{"stdout": container.StdoutPipe(), "stderr": container.StderrPipe()}, l)
container.LogCopier = copier
copier.Run()
container.LogDriver = l

The Copier reads the streams and forwards messages to loggers.
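
Conceptually, a copier-like loop reads the container's output line by line and stamps each message with the time the reader saw it. This is a simplified Python sketch of the idea, not Docker's Go implementation:

import subprocess
import sys
from datetime import datetime, timezone

# Run a "container-like" child process and read its combined output line by line.
proc = subprocess.Popen(
    [sys.executable, "-c", "import logging; logging.warning('not OK response')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
)

for line in proc.stdout:
    # Stamp every line with the time the reader saw it,
    # not the time the application wrote it.
    received_at = datetime.now(timezone.utc)
    print(received_at.isoformat(), line.decode().rstrip())

proc.wait()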

Let’s summarize what we learned about the Docker logger system:

  • Docker generates another timestamp for every log message.
  • There could be a slight delay between when an application writes the message and the Docker timestamp. This delay depends on how busy your system is.
  • We have not talked about this, but it is essential to know: if the Copier and the logging driver cannot keep up with the volume of logs, they can block the container's standard output. This can degrade your application's performance: you will not see spikes in CPU usage, but you will see peaks in IO wait (see the sketch after this list).
  • None of the timestamps represent the actual moment when the event happened.
  • Logger drivers use the timestamp generated by Docker.
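
To illustrate that back-pressure effect in isolation, here is a minimal sketch using a plain OS pipe on a Unix-like system (os.fork), nothing Docker-specific. It assumes the usual Linux pipe buffer of around 64 KiB:

import os
import time

r, w = os.pipe()

pid = os.fork()
if pid == 0:                        # child: a deliberately slow "log reader"
    os.close(w)
    while os.read(r, 1024):
        time.sleep(0.1)
    os._exit(0)

os.close(r)
payload = b"x" * 4096
for i in range(32):
    start = time.time()
    os.write(w, payload)            # blocks once the pipe buffer is full
    waited = time.time() - start
    if waited > 0.05:
        print("write %d blocked for %.2f seconds" % (i, waited))

os.close(w)
os.waitpid(pid, 0)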

Our suggestion: avoid generating timestamps with logging libraries when you run in containers. Docker always attaches its own timestamp to every log line, so dropping the application-level timestamp reduces the disk space needed to store the logs. And if you use logging systems where you pay for the amount of ingested data (such as AWS CloudWatch or Splunk), it also reduces costs.
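
Applied to the Python example above, that means dropping %(asctime)s from the format string and letting the runtime do the stamping:

import logging

# No %(asctime)s: the container runtime already stamps every line.
logging.basicConfig(format="%(levelname)s - %(message)s")
logging.warning("not OK response")

# Container output:                 WARNING - not OK response
# docker logs --timestamps shows something like:
# 2018-08-10T18:00:56.086989400Z WARNING - not OK response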


About Outcold Solutions

Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer certified Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing easy-to-use and easy-to-deploy solutions for Linux and Windows containers. We deliver applications that help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer a single solution that keeps all your metrics and logs in one place, allowing you to quickly address complex questions about container performance.