Outcold Solutions LLC

Monitoring Docker - Release History

5.3 - 2018-11-19

Requires collectorfordocker version 5.3 or above

  • New: Alert that fires when Collectord reports errors in processing pipelines (for example, when it fails to extract fields).
  • New: Alert that fires when Collectord reports warnings.
  • New: Alert for lag in the indexing of the data.
  • New: Splunk Usage (License usage, number of events) report under Setup.
  • Fixed: Lookup used by alerts caused overly frequent replication activity on the search head cluster (SHC).

Collectord updates:

  • Fixed: High memory usage when gzip compression is enabled (reduced memory usage).
  • New: Annotations can disable pipe.join.
  • Fixed: Under high log volume (10,000 events per second), Collectord could read incomplete lines, which could break JSON logs.
  • Fixed: After Collectord logs a warning that it failed to post to Splunk, it now logs a success message once the retry succeeds.
  • New: Annotations can hash sensitive data.
  • Fixed: Group network socket tables to reduce the amount of forwarded data (about a fourfold reduction).
  • Fixed: Identify when glob and match patterns require recursive directory traversal.
  • New: Annotation that completely disables handling and forwarding of logs for a container.
  • Fixed: Collectord showed a few debug messages on startup.
  • Fixed: Performance improvements for log forwarding (up to 35% under high log volume).
  • Fixed: Reduced duplication of Kubernetes events forwarded to Splunk.
  • Fixed: Support ECS cgroup matching with the default configuration.
  • Fixed: Support Docker daemon logs forwarding with the default configuration.
  • Fixed: Failure to parse the process name from the stat file when it contains unpaired parentheses.
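
The last fix above concerns process names with unbalanced parentheses in the proc stat file. As an illustration only (this is a Python sketch of a common approach, not Collectord's actual code, and the function name parse_proc_stat is hypothetical), the name can be extracted by searching for the closing parenthesis from the right instead of pairing parentheses:

```python
def parse_proc_stat(line: str) -> tuple[int, str, str]:
    """Split the start of a /proc/<pid>/stat line into (pid, comm, state)."""
    # The process name (comm) is wrapped in parentheses, but it may itself
    # contain spaces and unpaired parentheses. The fields after it never
    # contain ")", so the LAST ")" in the line closes the name.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx].strip())
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


# A made-up sample line whose process name has an unpaired "(".
print(parse_proc_stat("4242 (my (proc) S 1 4242 4242 0 -1"))
```

Splitting the line on whitespace alone would break on such names, which is why the sketch anchors on the first "(" and the last ")".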

5.2 - 2018-10-15

Requires collectorfordocker version 5.2 or above

  • New: Review/Storage dashboard based on storage metrics.
  • New: predefined alerts to help you monitor the health of the clusters and performance of the applications.
  • Performance improvements

Collector updates:

  • New: runtime storage metrics (usage, available, inodes)
  • New: The image is built on top of the scratch image.
  • New: verify and diag commands for troubleshooting.
  • New: support /dev/null output for logs
  • New: override source/sourcetype and index based on a regexp pattern for container logs.
  • Fixed: do not send empty docker_labels
  • New: support docker JSON tags and labels
  • Fixed: applying a new license now unblocks a collector that was blocked by an expired license.
  • Fixed: Prometheus parser fails to parse metrics with labels that end with a comma.
  • Fixed: Performance improvements
  • New: Prometheus parser supports basic authentication
  • Fixed: Workaround for a bug in HTTP Event Collector, which can return an incorrect index of a failed event.
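
One of the fixes above concerns Prometheus metrics whose label list ends with a comma. The following Python sketch shows one way a parser can tolerate that; it is an illustration under simplifying assumptions (no timestamps, escape sequences in values are kept as-is), and the names parse_labels and parse_sample are hypothetical rather than part of the collector:

```python
import re

# One label pair followed by either a comma or the end of the label body.
_LABEL = re.compile(
    r'\s*([a-zA-Z_][a-zA-Z0-9_]*)\s*=\s*"((?:[^"\\]|\\.)*)"\s*(?:,|$)'
)


def parse_labels(body: str) -> dict[str, str]:
    """Parse the text between { and }, accepting a trailing comma."""
    body = body.strip()
    labels: dict[str, str] = {}
    pos = 0
    while pos < len(body):
        match = _LABEL.match(body, pos)
        if match is None:
            raise ValueError(f"cannot parse labels at offset {pos}: {body!r}")
        labels[match.group(1)] = match.group(2)  # value stays escaped
        pos = match.end()
    return labels


def parse_sample(line: str) -> tuple[str, dict[str, str], float]:
    """Parse 'name{labels} value' or 'name value' into its parts."""
    name, brace, rest = line.partition("{")
    if not brace:
        name, value = line.split()[:2]
        return name, {}, float(value)
    body, _, tail = rest.rpartition("}")
    return name.strip(), parse_labels(body), float(tail.split()[0])


print(parse_sample('http_requests_total{method="post",code="200",} 1027'))
```

Because each pair may be terminated by a comma or by the end of the body, a trailing comma leaves nothing further to parse and the loop ends cleanly.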

5.1 - 2018-09-17

Requires collectorfordocker version 5.1 or above

  • New: Network metrics (MB, Packets, Drops and Errors) for host and containers.
  • New: Network socket tables (list of ports that containers and hosts listen on, and connections to external resources).
  • New: Network review dashboard to see the list of connections to public services and within the private network.
  • Improvement: Replaced the Python-based lookup with a macro written with eval.
  • Improvement: Visual improvement for showing when the object was Last Seen (highlighting and showing minutes ago).
  • Improvement: Changed the source of proc stats to the proc root filesystem, to keep the list of unique sources to a minimum.
  • New: Support for multi-threaded Splunk outputs (for forwarding more than 3000 events per second).
  • Improvement: Performance improvements for Prometheus parsing.
  • Improvement: Calculate checksums for the first bytes of files, to better identify new files with a reused inode.
  • Improvement: Reduced the amount of metrics forwarded with proc_stats by excluding system threads.
  • Improvement: Configuration for gzip compression.
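
The checksum improvement above can be pictured with a short Python sketch. It is illustrative only: the 256-byte window, the SHA-256 hash and the function names are assumptions made for the example, not details taken from the collector.

```python
import hashlib
import os
import tempfile


def take_record(path: str, max_bytes: int = 256) -> tuple[int, int, str]:
    """Return (inode, bytes_hashed, digest) for the head of the file."""
    with open(path, "rb") as f:
        head = f.read(max_bytes)
    return os.stat(path).st_ino, len(head), hashlib.sha256(head).hexdigest()


def is_same_file(path: str, record: tuple[int, int, str]) -> bool:
    """True if path still has the inode and the same leading bytes."""
    inode, length, digest = record
    if os.stat(path).st_ino != inode:
        return False
    with open(path, "rb") as f:
        head = f.read(length)  # hash only as many bytes as were recorded
    return len(head) == length and hashlib.sha256(head).hexdigest() == digest


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "wb") as f:
            f.write(b"2018-09-17 first line\n")
        record = take_record(path)
        with open(path, "ab") as f:  # normal growth of the same log
            f.write(b"2018-09-17 second line\n")
        after_append = is_same_file(path, record)
        with open(path, "wb") as f:  # new content behind the same inode
            f.write(b"a different file now\n")
        after_rewrite = is_same_file(path, record)
    return after_append, after_rewrite


print(demo())
```

Recording how many bytes were hashed lets an appended file still match, while different content behind the same inode number is treated as a new file.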

5.0 - 2018-09-03

Requires collectorfordocker version 5.0 or above

  • Auto-discover and forward Application logs from host mounts or local volumes.
  • Annotations for containers to change per container configurations (index, source, join rules, replaces and more).
  • Escaping terminal sequences from container logs.
  • Redirecting logs to /dev/null for specific patterns.
  • Replace patterns in container and application logs (hiding sensitive or not important information).
  • Support for extracting fields from the container logs, including timestamps.
  • Support for forwarding Prometheus metrics.
  • Include Memory and CPU limits for container lists.
  • Visual updates for the panels, highlighting high CPU and Memory usage.
  • Filter cgroup stats, forward only container and host metrics.
  • Support for multiple Splunk HTTP Event Collector endpoints (support fail-over and load-balancing).
  • Handle HTTP Event Collector errors caused by an incorrect index, with options to redirect to the default index, drop the events or wait.
  • Add retry logic to the license client to reduce the number of false-positive warnings.
  • Add HTTP read timeouts (handle gateway timeouts, 504).
  • Performance optimizations.
  • Optimize payloads for higher write throughput.
  • Fixed: failure to parse the last line in a JSON log.
  • Better error handling for incorrect configurations.
  • Deprecating Join rules in favour of annotations.
  • Support for HTTP Event Collector client certificates.
  • Fixed: limit the depth of directory walkers (fixes issues when a directory has a mount to itself).
  • Fixed: add a limit on the maximum line size that the collector can read at once (defaults to 1 MB).
  • Fixed: the acknowledgement database now stores NodeID, DevID and a parent folder identifier. If a NodeID is reused right away, the file is identified as a new one when it is in a different location.
  • Change: docker_stream field has been renamed to stream for compatibility with other container runtimes.
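
The maximum line size limit mentioned above can be sketched in Python as follows. This is an assumption-laden illustration: it splits an overlong line into several chunks, whereas the release notes do not say whether the collector splits or truncates, and the 1 MB constant is read here as 1024 * 1024 bytes.

```python
import io
from typing import BinaryIO, Iterator

MAX_LINE_BYTES = 1024 * 1024  # assumed meaning of the 1 MB default


def read_capped_lines(stream: BinaryIO,
                      max_bytes: int = MAX_LINE_BYTES) -> Iterator[bytes]:
    """Yield lines from stream, never returning more than max_bytes at once."""
    while True:
        # readline(n) stops at a newline or after n bytes, whichever is first,
        # so a single huge line cannot force an unbounded read.
        chunk = stream.readline(max_bytes)
        if not chunk:
            return
        yield chunk


# Tiny limit so the effect is visible on a small input.
sample = io.BytesIO(b"short\nxxxxxxxxxx\n")
print(list(read_capped_lines(sample, max_bytes=4)))
```

The point of the cap is that memory use per read stays bounded even when a log source emits a line without a newline for a long time.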

3.0 - 2018-02-07

Requires collectorfordocker version 3.0 or above

  • Added CPU Quota, CPU Shares, Throttled and Memory Limit Overlays on Container Dashboards.
  • Performance improvement on Dashboards by combining multiple charts using one common search.
  • New "Review/Privileged containers and enabled capabilities" dashboard to list all privileged containers and enabled security capabilities for containers.
  • Fixed a bug on the Process Dashboard: some charts did not filter by host.
  • Fixed: the "Overview: Process" dashboard did not use the Span token for timechart panels.
  • Fixed: "Top: Containers" showed incorrect memory usage (double the actual size).
  • Added alerts in the application that notify about outdated collector versions and expired collector licenses.
  • Hide Wait Read/Write IO panels when this data is not available.
  • The Process Dashboard shows VmRSS with RssAnon, RssFile, and RssShmem.

Collector updates:

  • Support for Splunk indexing acknowledgment.
  • HTTP Proxy support for License server and Splunk output.
  • Destination indices for different types of data (stats, logs, host logs, proc stats and events) can be set in the collector configuration.
  • Handling responses from HTTP Event Collector to skip invalid events (will be logged).
  • If a container is running but Docker does not provide metadata, the collector can wait for the metadata.
  • Collect security capabilities and uid/gid.
  • Support for custom labels, specified with collector configuration.
  • Support for partial logs without join rules.
  • Bug. Use local timezone by default for local syslog files.
  • Bug. Fix small memory leak on deleted containers.
  • Bug. When the collector was failing to send data to Splunk, it could not be stopped with terminate.

collectorfordocker 3.0.94.180730

  • Show the index name in the output when Splunk reports an incorrect index.

collectorfordocker 3.0.93

  • Fixed: Support for Docker running on CentOS (metadata is not attached to metrics).

collectorfordocker 3.0.91

  • Fixed: Messages "WARN ... proc.go:441: Unparsable line from /rootfs/proc/X/status" caused by newer Linux kernels that report an empty line in the proc file system.
  • Add HTTP read timeouts (handle gateway timeouts, 504).
  • Correctly parse HTTP Event Collector responses when one of several events fails to be indexed (for example, because of a wrong index).
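
The first fix above deals with empty lines in the proc status file. A tolerant reader can be sketched in Python as below; this is only an illustration of the idea, and the function name parse_proc_status is hypothetical:

```python
def parse_proc_status(text: str) -> dict[str, str]:
    """Parse 'Key:<tab>value' lines from a /proc/<pid>/status file."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # empty lines carry no data, so skip them silently
        key, colon, value = line.partition(":")
        if not colon:
            continue  # not a 'Key: value' line; a real tool might log it
        fields[key.strip()] = value.strip()
    return fields


# Made-up sample content with an empty line in the middle.
print(parse_proc_status("Name:\tbash\n\nVmRSS:\t    1234 kB\n"))
```

Skipping blank lines before splitting on the colon avoids reporting them as unparsable.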

Upgrade from version 2 to 3

2.1 - 2017-10-22

Requires collectorfordocker version 2.1.59.171210 or above

  • Implemented a Collectors dashboard to track the number of collectors, their versions and the licenses in use.
  • Fallback to the process IO statistics when blkio is not available.
  • Fixed IO statistics graphs, which showed the average when the sum should be used.
  • collector - Improved resilience to storage failures.
  • collector - License checks reporting.

2.0 - 2017-10-22

Requires collectorfordocker version 2.0.37.171023 or above

  • Better labels support in Dashboards. The collector has a breaking change: the label format changes from docker_labels_LABEL1=VALUE1 to docker_labels=[LABEL1=VALUE1,LABEL2=VALUE2].
  • Process level metrics.
  • Uptime for hosts and processes.
  • Field extraction and dashboard support for the Docker daemon (set up host log collection with the collector).
  • New Top dashboards let you monitor Hosts/Containers/Processes in real time.
  • Improved dashboards navigation.
  • Support for host logs.
  • Other bugs and improvements based on user feedback.
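
To make the label format change in this release concrete, here is a small Python sketch that converts fields in the old form into the new one. It is illustrative only: representing an event as a dictionary, modelling docker_labels as a list of "NAME=VALUE" strings, and the function name convert_labels are all assumptions made for the example.

```python
OLD_PREFIX = "docker_labels_"


def convert_labels(event: dict[str, str]) -> dict:
    """Fold docker_labels_NAME=VALUE fields into one docker_labels list."""
    converted: dict = {}
    labels: list[str] = []
    for key, value in event.items():
        if key.startswith(OLD_PREFIX):
            labels.append(f"{key[len(OLD_PREFIX):]}={value}")
        else:
            converted[key] = value
    if labels:  # leave the field out entirely when there are no labels
        converted["docker_labels"] = sorted(labels)
    return converted


old_event = {
    "container_name": "web",
    "docker_labels_LABEL1": "VALUE1",
    "docker_labels_LABEL2": "VALUE2",
}
print(convert_labels(old_event))
```

Searches written against the old per-label field names would need a similar adjustment to match on values inside the single docker_labels field.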
