Monitoring Kubernetes

Release history

26.04.1 - 2026-04-27

Supports collectorforkubernetes version 26.04.x and below

  • New: Advanced Kubernetes Events dashboards
  • Updated GPU Monitoring Dashboard to the latest version of the NVIDIA CLI tool
  • Bug fix: Collectord metrics dashboard does not apply filters to some of the panels
  • Bug fix: On the overview dashboard, the application can report some not-ready containers incorrectly

Collectord updates:

  • Updated Go runtime to 1.26.2
  • Updated SQLite Library to 3.53
  • FIPS-compliant image
  • Changed default API version for Docker to v1.44
  • Improved on-volume database for PVC volumes, preventing database corruption
  • When using index acknowledgment (ack), Collectord now uses exponential backoff
  • Collectord describe command now shows the source of each annotation
  • Collectord now allows overriding event outputs with collectord.io/events-output=splunk::foo
  • Collectord now can forward Prometheus metrics from Docker containers by using annotations
  • Bug fix: Collectord can panic on close when some output cannot post data to the destination
  • Bug fix: If Docker does not connect to the unix socket, it uses a file-based API that cannot stream events, but does so quietly
  • Bug fix: Collectord crashes when the Docker unix socket is set incorrectly
  • Bug fix: Global throughput pipeline can be too aggressive and start throttling before reaching the configured maximum
  • Bug fix: When using multiple outputs and splitting Kubernetes events into them, an unavailable output might block the whole pipeline
  • Bug fix: Collectord can stop reading from rotated log files on slow systems with quick file rotation
  • Bug fix: Fixed small memory leak in the acknowledgment database
  • Bug fix: Splunk output can crash with dedicatedClientPerIndex and an incorrect output for the index
  • Bug fix: When using weighted thread balancing for output, high-throughput event publishing could previously stall momentarily while weight tracking caught up
  • Bug fix: When using multi-output with a misconfigured output name, the failed output would permanently fall back to the default output for the lifetime of the process
  • Bug fix: Fixed file descriptor leak when log files rotate faster than the polling interval
  • Bug fix: Reduced duplicate log events that could be forwarded after a Collectord restart
  • Bug fix: Improved log forwarding throughput during temporary disk pressure
  • Bug fix: With cgroup v1, Collectord can report IO Wait as 0% for some containers
  • Bug fix: Duplicate events forwarded when container log pipelines are recreated under throughput throttling
  • Bug fix: Collectord Cluster Level Annotations (CRD Configuration) with force flag does not override pod-level annotations
  • Bug fix: After a user deletes CRD Configurations, they might still be cached in Collectord memory until the next restart
  • Bug fix: Duplicate events could occur for short-lived or slow-starting containers
  • Bug fix: If Collectord addon starts first on the node, it might try to write to a read-only database, fail, and restart
  • Bug fix: Collectord might keep watching deleted pods until it recreates the connection to the Pods watch stream
  • Bug fix: Collectord might send duplicate events when volume log files are closed and reopened during collection
  • Bug fix: Collectord keeps trying to read from stale NFS file handlers
  • Bug fix: Reduced duplicate log events that could be forwarded during container lifecycle changes
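
The 26.04.1 notes above mention that Collectord now uses exponential backoff with index acknowledgment (ack). As a rough illustration of the general technique only, the sketch below computes capped, doubling retry delays. The function name and the base, factor, cap, and attempts values are assumptions made for this example; they are not Collectord settings or its actual implementation.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=6):
    """Return the wait in seconds before each retry.

    Each delay is base * factor**n, limited to `cap` so that retries
    never wait longer than the cap. All defaults are hypothetical.
    """
    return [min(cap, base * factor ** n) for n in range(attempts)]


# Delays grow geometrically until they reach the cap.
delays = backoff_delays(attempts=8)
```

With these assumed defaults, the delay doubles on every retry and then stays at the cap, which keeps a struggling destination from being hit at a constant rate.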

25.10.3 - 2025-11-17

Collectord updates:

  • Update go to 1.25.4.
  • Bug fix: Collectord can crash when Swap is enabled on Kubernetes nodes.

25.10.2 - 2025-10-27

Supports collectorforkubernetes version 25.10.x and below

  • New versioning scheme YY.MM instead of semver.
  • Redesigned Audit (Overview) dashboard
  • Bug fix: Hosts dashboard might not show host logs.
  • Added [ui] supportedThemes light and dark for the application.

Collectord updates:

  • Upgrade golang to 1.25.2.
  • General improvements in watchImplementation=2, including warnings for misconfigurations.
  • Support for source configuration under input.files::XXX.
  • Support for loading tokens from Secrets for user defined Splunk outputs with CRD.
  • Collectord now attaches seed field for all Prometheus inputs, including those defined for services and pods via annotations.
  • Global sanitization now allows hashing of sensitive data.
  • Allow disabling global replace and hash pipes.
  • Collectord now creates internal watch requests to Kubernetes API server with randomized timeouts (~0.9-1.0 of configured) to spread the load.
  • Preconfigured time extraction for the audit logs.
  • Added ability to filter out events in input.kubernetes_watch::XXX by values in the JSON object, not only attached fields.
  • Published helm charts for Collectord.
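
The randomized watch timeouts mentioned in the 25.10.2 notes above (roughly 0.9 to 1.0 of the configured value) can be pictured with the minimal sketch below. The function name and parameters are hypothetical, and the uniform distribution is an assumption; the release note only states the approximate range.

```python
import random


def jittered_timeout(configured_seconds, low=0.9, high=1.0):
    """Scale a configured timeout by a random factor in [low, high].

    Spreading timeouts this way keeps many watchers from reconnecting
    to the API server at the same moment. The uniform distribution is
    an assumption made for this illustration.
    """
    return configured_seconds * random.uniform(low, high)


# Example: a 300-second timeout becomes a value between about 270 and 300.
timeout = jittered_timeout(300)
```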

5.24.445 - 2025-11-17

Collectord updates:

  • Update go to 1.24.10.
  • Bug fix: Collectord can crash when Swap is enabled on Kubernetes nodes.

5.24.444 - 2025-06-23

Collectord updates:

  • Bug fix (regression): Collectord might hang while making requests to the Kubernetes API server.

5.24.443 - 2025-06-10

Collectord updates:

  • Upgrade golang to 1.24.4.
  • Add the ability to configure idle connections for Splunk outputs (maxIdleConns, maxIdleConnsPerHost, idleConnTimeout).
  • Improvements for the Kubernetes Events input for large clusters.

5.24.442 - 2025-05-16

Collectord updates:

  • Upgrade golang to 1.24.3.
  • Bug fix: Collectord might not pick up files from volumes with subdirectories when recursive is set to false.

5.24.441 - 2025-05-05

Collectord updates:

  • Bug fix: In rare situations, Collectord might not pick up logs from a container when the pod uses an init container that runs for a long time.

5.24.440 - 2025-04-28

Supports collectorforkubernetes version 5.24.x and below

  • New alert: Cluster Warning: Node Condition
  • Bug fix: Workload dashboard might show a warning for the Events table about using a wildcard in the middle of the string

Collectord updates:

  • Upgrade golang to 1.24.2.
  • Upgrade SQLite to 3.48.0.
  • Added ability to hide process command line arguments.
  • Prometheus metrics can be forwarded to Splunk Metrics Index.
  • Allow configuring TLS Version and show TLS version used for outputs using the collectord verify command.
  • To make watchImplementation=2 compatible with previous version, attach the Kind and apiVersion to objects forwarded from the list calls.
  • Collectord verify command shows logging driver configured for Docker Daemon.
  • Added ability to parse unixtimestamp in the application logs with format @unixtimestamp
  • In case the kubernetes configuration for volumesRoot or container logs path is pointed to a symlink, collectord will use the real path of the symlink.
  • Ability to keep diag-X.tar.gz files in the /data folder in case streaming from stdout/stderr causes errors (Windows or macOS).
  • Add a limit for maximum files opened for a container (default is 10), for situations of constant restart of the container.
  • Significantly reduce the amount of writes to the acknowledgment database, and improve performance.
  • Change default limit for paginating requests from 100 to 500.
  • Add ability to lock files in PVC volumes to allow only one concurrent Collectord instance to access the file.
  • Bug fix: hash basicauth credentials in the diag output of Collectord.
  • Bug fix: fix misprint in the Collectord internal metrics requsts to requests.
  • Bug fix: collectord might report 0 as an internal metric for forwarded Kubernetes and Docker events.
  • Bug fix: rare panic in the rotated.(*rotateFileInput).setupCurrent.
  • Bug fix: application logs forwarding might report “failed to resolve relative path”.
  • Bug fix: application logs can pick up compressed files (.gz) in the directory.
  • Bug fix: with misconfigured License Server, verify and diag commands can panic.

5.23.432 - 2025-02-12

Collectord updates:

  • Upgrade SQLite to 3.47.2.
  • Upgrade golang to 1.23.6.
  • Bug fix: Collectord verify command can result in panic when Collectord uses License Server.

5.23.431 - 2024-11-18

Supports collectorforkubernetes version 5.23.x and below

  • Update application for Splunk Cloud compatibility

Collectord updates:

  • Upgrade SQLite to 3.47.0.
  • Upgrade golang to 1.23.3.

5.23.430 - 2024-10-28

Supports collectorforkubernetes version 5.23.x and below.

  • To better support installations with large number of nodes and containers, default behavior for most of the dashboards is to require pressing a Submit button after selecting filters.
  • Overview Dashboard - new table with Not Ready Containers.
  • Pod Dashboard - include container statuses table.
  • Audit Dashboard - include user agent, and update compatibility with latest audit formats.
  • Audit Dashboards - small performance improvement for the new installations.
  • Host dashboard - show node conditions table.
  • Host dashboard - show only external eth* interfaces in network stats.

Collectord updates:

  • Implement new and improved watch mechanism for Kubernetes resources to handle large clusters.
  • Change the default pipe join configuration to have max size of 1MB instead of 100KB.
  • Allow defining outputs for prometheus metrics defined with annotations.
  • When HTTP Server is enabled for Collectord, it writes every call to stdout; make it configurable.
  • Bug fix: Collectord did not respect proxyBasicAuth for the splunk output.
  • Bug fix: Collectord verify command can report incorrectly the status of containerd runtime.
  • Upgrade SQLite to 3.46.1.
  • Upgrade golang to 1.23.2.

5.22.422 - 2024-06-17

  • Bug fix: Fix issue with calculating values on Resource Quota dashboard.

Collectord updates:

  • Upgrade SQLite to 3.46.0.
  • Upgrade golang to 1.22.4.

5.22.421 - 2024-05-13

Collectord updates:

  • Allow spawning journald log reader in a separate process, to prevent corrupted logs from crashing the main process.
  • Upgrade golang to 1.22.3.

5.22.420 - 2024-04-22

Supports collectorforkubernetes version 5.22.x and below

  • Workload dashboard - add Pod OwnerKind and OwnerName, PriorityClass, and Pod Requests/Limits
  • Address too many data points in host and workload dashboard in network graphs
  • Additional CPU Metrics: CPU IOWait, Steal and Idle in Top Hosts dashboards.
  • Showing CPU IOWait in Host dashboard.
  • Alert Container CPU Throttled - exclude container with low CPU usage.
  • New dashboard Review->Disk Stats for the host.
  • Exclude virtual ethernet interfaces from host dashboard.
  • Support memory limits and requests expressed in milli-bytes.

Collectord updates:

  • Allow disabling IP address Lookup in net_socket_table input.
  • Better handling of zombie processes in proc_stats input.
  • Allow configuring user Splunk outputs using CRD SplunkOutput.
  • Allow blacklisting labels from forwarded metadata.
  • When onVolumeDatabase is used Collectord verifies that volume supports locking.
  • Add additional metrics CPU IOWait, Steal and Idle.
  • Monitoring disk stats for the host.
  • Add input disk_stats.
  • New diagnostic - CPU Vulnerabilities.
  • Improve check for the Kubernetes API endpoint in verify command.
  • Deprecate diagnostic for entropy.
  • Upgrade default API Version to 1.24 for Docker endpoints.
  • License Client - allow configuring the proxy.
  • Bug fix: ignore containers with completed status.
  • Bug fix: don’t include (init) containers with completed status in the Pod requests and limits.
  • Bug fix: if a container does not generate a lot of logs, some messages can get stuck in the queue while waiting for more messages.
  • Bug fix: Collectord describe command can crash if user fields are defined with annotations on the pod.
  • Upgrade golang to 1.22.2.
  • Upgrade sqlite3 to 3.45.3.

5.21.412 - 2024-01-08

Collectord updates:

  • Add libdl.so.2 library to the scratch image for compatibility with Aqua Security
  • Upgrade SQLite to 3.44.2
  • Upgrade Go language runtime to 1.21.5

5.21.411 - 2023-11-28

Collectord updates:

  • Bug fix: Collectord might send events without timestamps
  • Upgrade Go language runtime to 1.21.4

5.21.410 - 2023-10-16

Supports collectorforkubernetes version 5.21.x and below

  • Compatibility updates for the version 5.21 of Collectord
  • New Dashboard: Review -> CPU (Throttled, Limits, Requests)
  • Alert update: High amount of GRPC errors
  • Alert update: Container CPU Throttled
  • Network tables update: show UDP connections for Host, Workloads, Containers, and Pods
  • Network Connection Dashboard: allows filtering by namespaces
  • Show maximum and average number of Pods per cluster in Clusters (Allocations and usage) dashboard
  • Update Resource Quota dashboard to support comparing milli-cores and cores

Collectord updates:

  • Support for global replace configurations for Collectord, allowing to sanitize data before forwarding to Splunk
  • Support journald as logging driver for container logs
  • When both volatile and persistent journald destinations exist, Collectord will identify which has the most recent data
  • Support for configuring modify values for specific namespaces when streaming objects
  • Support for arrays in modify values for the streaming objects from Kubernetes/OpenShift API server
  • Allow sending to Splunk more precise timestamps for the events
  • Collectord can automatically refresh tokens when they are expired for API Server
  • Compatibility updates for latest versions of Kubernetes
  • Upgrade Go language runtime to 1.21.3
  • Upgrade sqlite3 library to 3.43.1
  • Upgrade libc and common base libraries to debian:bookworm
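
The global replace configurations introduced in 5.21.410 sanitize data before it is forwarded to Splunk. The sketch below illustrates the general idea of replacing or hashing matched values in a log line. The IP-address pattern, the placeholder, the function names, and the use of truncated SHA-256 are all assumptions chosen for this example; they do not describe Collectord's configuration syntax or its hashing algorithm.

```python
import hashlib
import re

# Hypothetical pattern: values that look like IPv4 addresses.
IP_PATTERN = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def replace_values(line, pattern=IP_PATTERN, placeholder="X.X.X.X"):
    """Replace every match with a fixed placeholder."""
    return pattern.sub(placeholder, line)


def hash_values(line, pattern=IP_PATTERN):
    """Replace every match with a short, deterministic digest.

    Hashing hides the original value but still lets events that share
    the same value be correlated.
    """
    def _digest(match):
        raw = match.group(0).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()[:16]

    return pattern.sub(_digest, line)


masked = replace_values("client 10.0.0.1 connected")
hashed = hash_values("client 10.0.0.1 connected")
```

Replacing discards the value entirely, while hashing keeps a stable token in its place; which one fits depends on whether the sanitized field still needs to be searchable.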

5.20.404 - 2025-07-03

Collectord updates:

  • Compatibility updates for the Elasticsearch/OpenSearch configurations.

5.20.403 - 2023-07-31

Collectord updates:

  • Improvements for working with NFS shares and closed file handlers.
  • Improvements for streaming Pods from Kubernetes API server.
  • Collectord reports when the Splunk HEC Collector does not reply with the correct response with 200 status code.
  • Upgrade go runtime to version 1.20.6.
  • Bug fix: Collectord might report invalid memory usage for the stopped containers.
  • Bug fix: If Collectord fails to initialize the on-volume database, it might crash the whole Collectord instance.

5.20.402 - 2023-06-06

Collectord updates:

  • Bug fix: onvolumedatabase annotation does not work when ignoreCSIMountFolderForDiscovery is enabled
  • Bug fix: Splunk output might send event_id field when includeEventID is not enabled
  • Allow configuring timeout-seconds for collecting diag

5.20.401 - 2023-05-22

Collectord updates:

  • Upgrade go runtime to version 1.20.4
  • Allow users to configure how many events Collectord can have in the output pipeline to lower memory footprint
  • Include iNode and DevID in the info.txt in diag
  • Bug fix: Collectord cannot collect performance metrics in diag
  • Bug fix: Collectord can start forwarding logs from an older file position than the one recorded in the acknowledgement database

5.20.400 - 2023-04-17

Supports collectorforkubernetes version 5.20.x and below

  • Show Pod conditions on the Pod dashboard
  • Bug fix: Pods dashboard filters out pods not on the host network.
  • Compatibility updates for the version 5.20 of Collectord

Collectord updates:

  • Multi-architecture images for amd64 and arm64
  • Allow sending logs to multiple Splunk HEC endpoints simultaneously
  • New annotation collectord.io/volume.{N}-logs-onvolumedatabase to keep acknowledgement information about forwarded logs on the volume
  • Allow including placeholder templates in the annotation collectord.io/volume.{N}-logs-glob
  • Support for new outputs (ElasticSearch and OpenSearch)
  • Collectord produces diag file without performance data, if flag --include-performance-profiles is not set
  • Use IMDSv2 for AWS metadata
  • Performance improvements for an acknowledgement database
  • Improved how long the acknowledgement database keeps data: Collectord refreshes the state if the file still exists on the disk
  • Upgrade Go language runtime to 1.20.3
  • Collectord verifies that only one Collectord instance can access the data folder, where Collectord stores its state
  • Remove automatic watching for Docker runtime on Kubernetes/OpenShift hosts
  • Add a verify step for Containerd runtime for the verify command
  • Add ability to send events with event_id, unique identifier for the messages generated from logs
  • Bug fix: Collectord might assign processes running outside of the containers on the host to the Collectord container
  • Bug fix: CPU-based license tries to connect to the license server, when running verify command
  • Bug fix: Collectord might not set a source to the log files for non-default splunk output

5.19.391 - 2023-03-07

Collectord updates:

  • Upgrade go runtime to 1.19.7
  • For CSI volumes, Collectord allows ignoring the “mount” subdirectory with the configuration ignoreCSIMountFolderForDiscovery under input.app_logs

5.19.390 - 2022-10-17

Supports collectorforkubernetes version 5.19.x and below

  • Update dashboards for latest changes in the metric names for API Server, Controller and Scheduler
  • Update Kubelet dashboard to support various container runtimes
  • Audit (users and namespaces) dashboard: show access to non-namespaces resources
  • Logs dashboard: show container and pod as separate filters
  • New alert for Collectord alarms for node diagnostics (reboot required, and entropy)
  • Bug fix: misprint in “Cluster Warning: container cpu is throttled” alert

Collectord updates:

  • Splunk output supports maximumMessageLength to truncate messages exceeding this size
  • Splunk output supports requireExplicitIndex to ignore all events that don’t have explicit index defined
  • Collectord monitors if node requires reboot
  • Kubernetes watch input now allows hashing or removing values from JSON before sending them to Splunk
  • Collectord now reads its own clusterrole and implements a gate that prevents it from sending requests to the API server for resources it does not have access to
  • Instead of using the automatic gate based on clusterrole, an admin can define the list of objects Collectord should use to load metadata for the Pods
  • Update configurations for latest versions of Kubernetes to support various CRI runtimes
  • Update configuration to use control-plane role instead of master (as the last one is deprecated)
  • Improved support for CSI volumes, automatically discover additional sub directory “mount”
  • Allow to force override annotations from cluster level configurations
  • Upgrade go runtime to 1.19.2
  • Beta: weighted splunk output algorithm when multiple threads used
  • Bug fix: if docker runtime is not installed, Collectord can clog the output with warnings
  • Bug fix: verify command can report an error with journald, when it properly works
  • Bug fix: Collectord can clog the output if cgroupv2 is used, and blkio is not enabled
  • Bug fix: Collectord can crash if default output.splunk is not configured, now it shows the error
  • Bug fix: If output is not defined for Kubernetes Watch input, it should use default output
  • Bug fix: if Kubernetes watch connection fails, Collectord can generate a lot of requests to API Server
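
The two Splunk output options added in 5.19.390, maximumMessageLength and requireExplicitIndex, can be pictured with the sketch below. It is only an illustration of the described behavior: the function name, the dictionary event shape, treating 0 as "no limit", and truncating by characters rather than bytes are assumptions, not Collectord's implementation.

```python
def prepare_event(event, maximum_message_length=0, require_explicit_index=False):
    """Apply the two output rules to a single event dictionary.

    Returns None when the event should be ignored, otherwise a
    (possibly truncated) copy of the event.
    """
    # requireExplicitIndex: ignore events without an explicit index.
    if require_explicit_index and not event.get("index"):
        return None

    message = event.get("event", "")
    # maximumMessageLength: truncate messages exceeding this size.
    # Assumption: 0 disables the limit and length is counted in characters.
    if maximum_message_length > 0 and len(message) > maximum_message_length:
        return dict(event, event=message[:maximum_message_length])
    return dict(event)


kept = prepare_event({"event": "abcdef", "index": "main"}, maximum_message_length=3)
dropped = prepare_event({"event": "abcdef"}, require_explicit_index=True)
```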

5.18.381 - 2022-05-17

Collectord updates:

  • Update go runtime to 1.17.11
  • When Splunk HEC is slow and cannot process the events, Collectord might hold on to the files in a PVC volume, preventing kubelet from stopping the application pod. Collectord now has a configuration for how long it can keep the file descriptors open after the pod is terminated.
  • Bug fix: When Splunk HEC is unavailable, Collectord can start closing dedicated Splunk outputs for Indexes
  • Bug fix: When Splunk HEC returns code 4xx, unrecognized by the format of Splunk HEC, Collectord might incorrectly skip the event
  • Bug fix: Collectord builds an incorrect path for the Kubernetes API service when watching for some objects, like gateway
  • Bug fix: Verify command does not respect cgroup v2

5.18.380 - 2022-04-19

Supports collectorforkubernetes version 5.18.x and below

  • Cluster filter on Events dashboard
  • Rewrite CPU throttled alert to make it less verbose
  • Memory usage now reports memory without caches and memory that can be freed.
  • Support cgroupv2

Collectord updates:

  • Support cgroupv2
  • New ability to specify the message field name for the logs extraction with annotations extractionMessageField
  • Collectord improves grace period for expired licenses allowing to bootstrap new nodes for 14 days
  • Support of journald database written with systemd library 247+
  • Upgrade go runtime to 1.17.9
  • Bug fix: cleanup the diag, exclude the real license key
  • Bug fix: collectord reports high CPU usage for just started containers or hosts
  • Bug fix: update pods/container labels when user updates them (prior restart was required)
  • Bug fix: set now as a date for container logs with corrupted log files instead of 0 timestamp
  • Bug fix: include the values of whitelists and blacklists in diag
  • Bug fix: verify command might incorrectly show that it cannot find container logs with CRIO runtime

5.17.370 - 2021-10-20

Supports collectorforkubernetes version 5.17.x and below

  • Show millicores/cores CPU usage instead of percents
  • New dashboard: Review - Resource Quotas
  • Review - Projects: filter by project name
  • Review - Clusters: filter by node label
  • Review - Clusters: include max and avg usage
  • Bug fix: storage dashboard might not render in some Splunk versions
  • Bug fix: Namespaces dashboard shows only one namespace label

Collectord updates:

  • Upgrade to Go 1.17.2
  • Support query in Prometheus URLs for metrics
  • Collectord now reports source and source type for the events with incorrect index
  • Support for licensing server
  • Support for CPU-based licenses
  • Allow to specify multiple values for blacklist and whitelist for host logs
  • Bug fix: Collectord clogs the output with WARN messages for stopped containers running with Containerd
  • Bug fix: Containers without requests set might show a 1-core request by default
  • Bug fix: Collectord clogs the output with WARN messages about closed Splunk outputs
  • Bug fix: parse commas in the timestamps for logs

5.16.363 - 2021-05-26

  • Bug fix: Put in parentheses source selection in macro_openshift_prometheus_metrics

Collectord updates:

  • Upgrade go runtime to 1.16.3
  • Bug fix: fix verbose logging for docker watcher with messages “failed to get next event”
  • Bug fix: NetworkPolicy cannot be watched, as Collectord does not convert it in plural form properly
  • Bug fix: Verify command fails on Containerd runtime
  • Bug fix: DefaultIdleConnTimeout is ignored for HTTP clients
  • Bug fix: Put in parentheses source selection in macro_kubernetes_prometheus_metrics

5.16.361 - 2021-03-16

Supports collectorforkubernetes version 5.16.x and below

  • Overview dashboard filters respect each other (show only namespaces from the selected cluster)
  • Bug fix: use correct units for Memory and Storage (MiB, MB, Mb)
  • Bug fix: compatibility with new format of Events from API server (FirstSeen, LastSeen, Source could be shown as null)
  • Bug fix: Collectord metrics request time shows the summary on the period, not the individual request times

Collectord updates:

  • ARM64 image
  • Allow removing managed fields from events (enabled with new configurations by default)
  • Upgrade to Go 1.16.2
  • Bug fix: precise time to Splunk HEC, sending with milliseconds instead of nanoseconds (which are incorrectly rounded by HEC)
  • Bug fix: first sample of the container can record above 100% of the CPU usage, as the values are pretty low
  • Bug fix: verify command does not respect glob patterns for Prometheus inputs (certs, tokens)
  • Bug fix: trim spaces in token value for Prometheus inputs

5.16.353 - 2021-02-11

Collectord updates:

  • Bug fix: collectord can report parse int errors on the stderr
  • Upgrade go runtime to 1.15.8

5.16.351 - 2021-01-04

Collectord updates:

  • Bug fix: host file inputs can raise a fatal error: concurrent map writes

5.16.350 - 2020-12-14

Supports collectorforkubernetes version 5.16.x and below

  • New dashboard: Collectord metrics
  • Compatibility for Kubernetes 1.20
  • Bug fix: broken link in Allocatable Resources dashboard

Collectord updates:

  • Annotations for collecting prometheus metrics: authorization keys and CAName for SSL certificates
  • Improvement for DNS resolutions of Splunk output FQDN
  • Export internal collectord metrics in Prometheus format
  • Forwarding internal collectord metrics to Splunk
  • Watch objects inputs can now hide management fields
  • In the diag include all open file descriptors
  • Upgrade go runtime to 1.14.13
  • Remove \0 symbol from the labels values in the prometheus metrics
  • Allow to filter host logs with blacklist and whitelist
  • Bug fix: less verbose warnings about not being able to load resources from API server
  • Bug fix: performance improvements for Ack DB
  • Bug fix: custom prometheus metrics forwarded by Collectord do not include cluster field or custom user fields
  • Bug fix: addon pod terminates faster
  • Bug fix: verify command trying to post to all outputs with all indexes specified in the configuration
  • Bug fix: crash in AckDB
  • Bug fix: input system stats does not recognize outputs specified for the host and cgroup
  • Bug fix: verify command always runs recursively for host logs, even when recursive is set to false

5.15.305 - 2021-01-04

Collectord updates:

  • Upgrade go runtime to 1.14.13
  • Bug fix: host file inputs can raise a fatal error: concurrent map writes

5.15.303 - 2020-08-12

Collectord updates:

  • Upgrade golang to 1.14.7 to fix the hang in runtime

5.15.301 - 2020-06-24

Collectord updates:

  • Bug fix: verify command broken for addon pod

5.15.300 - 2020-06-01

Supports collectorforkubernetes version 5.15.x and below

  • Events dashboard: filters depend on selection of cluster and node labels
  • Support for Kubernetes 1.18+
  • Improvement for alert “Cluster Warning: high number of errors to Kubernetes API” (only alert on 5xx errors)
  • Bug fix: node events aren’t visible in Events tab

Collectord updates:

  • Support for annotations to add custom user fields to data
  • Support for blacklisting and whitelisting Prometheus metrics (significantly reducing the indexing cost of data)
  • Verify command improvements - verify proper configurations for cgroup (memory/memory.use_hierarchy is 1)
  • Bug fix: fix bug in prometheus metrics parser, empty fields can be filled with previous fields
  • Bug fix: occasionally addon can report warnings about trying to delete expired keys from ack db
  • Bug fix: better handle of connections to metrics endpoints exported in Prometheus format
  • Bug fix: http connections improvements for when Splunk is unresponsive
  • Bug fix: broken diag

5.14.285 - 2020-08-12

Collectord updates:

  • Upgrade golang to 1.14.7 to fix the hang in runtime

5.14.284 - 2020-03-23

Collectord updates:

  • New annotation to configure whitelist pattern for log messages
  • Allow to override Kubernetes service URL
  • Bug fix: panic in output for addon
  • Bug fix: performance and memory usage improvement for ack db

5.14.280 - 2020-01-27

Supports collectorforkubernetes version 5.14.x and below

  • Logs dashboard: filters depend on selection
  • Overview dashboard: namespace counter for list of projects

Collectord updates:

  • Support templates in the index, source and sourcetype
  • Allow to exclude indexed fields when forwarding to Splunk
  • Support annotation for stats interval for containers
  • Support containerd runtime
  • Bug fix: verify command can show incorrect error about verifying journald input
  • Bug fix: index on namespace should set index for application logs
  • Bug fix: warning about not being able to retrieve node information

5.12.273 - 2019-11-18

Collectord updates:

  • Bug fix: panic in application logs discovering for PVC volumes

5.12.272 - 2019-11-08

Collectord updates:

  • Bug fix: in case when the rotated files are reusing FileID/DevID Collectord stops forwarding rotated files

5.12.271 - 2019-11-07

Supports collectorforkubernetes version 5.12.x and below

  • Improvements for the macros for backward compatibility

Collectord updates:

  • Bug fix: when an event pattern is used for joining multi-line events, an error raised by the input in the pipeline might not be shown.
  • Bug fix: reduce warnings failed to get the new event in pipeline - submitted
  • Stability improvements

5.12.270 - 2019-10-22

Supports collectorforkubernetes version 5.12.x and below

  • Compact metrics (pre-calculated on Collectord side)
  • Switched stats for host and cgroup in different macros
  • Use base macro for alerts
  • Improved command extraction for exec in Audit Logs
  • Add cluster name in the alert results

Collectord updates:

  • Watch namespaces and workloads for changes
  • Global configurations with Custom Resources and selectors
  • Describe command to see applied annotations for pods
  • Bug fix: panic when pipe join configuration is removed
  • Bug fix: panic when proc stats is enabled and cgroup stats is disabled
  • Bug fix: support ProxyBasicAuthorization for license server checks
  • Bug fix: Fix for collecting first sample (can show high CPU usage for first sample)
  • Bug fix: if list of URLs is used for Splunk output, the empty URL is still required
  • Beta: dynamic index, source and sourcetype names based on the metafields
  • Beta: cluster diagnostics with one rule: node entropy

5.11.266 - 2020-10-15

Collectord updates:

  • Upgrade golang to 1.14.10 to fix the hang in runtime

5.11.265 - 2020-06-24

Collectord updates:

  • Bug fix: memory improvement for large ackdb files

5.11.264 - 2019-11-08

Collectord updates:

  • Bug fix: in case when the rotated files are reusing FileID/DevID Collectord stops forwarding rotated files

5.11.261 - 2019-09-13

Collectord update:

  • Bug fix: improves discovery for the PVC volumes
  • Bug fix: delay loading for the PVC volumes
  • Bug fix: improves logging for the directory walker

5.11.260 - 2019-09-09

Supports collectorforkubernetes version 5.11.x and below

  • GPU Monitoring (NVIDIA)

Collectord updates:

  • Support for PVC volumes for application logs
  • Bug fix: small memory leak in addon
  • Bug fix: duplicate events when the pipeline is getting throttled
  • Bug fix: don’t use throttling for devnull output
  • Bug fix: better recovery for ack db corruption
  • Bug fix: crash on journald input initialization when ack db is corrupted
  • Bug fix: annotations joinmultiline requires joinpartial
  • Bug fix: configurations for stdout only with annotations can crash collectord
  • Set events = 50 by default for Splunk output batches

5.10.255 - 2019-11-20

Collectord updates:

  • Bug fix: better recovery for ack db corruption
  • Bug fix: crash on journald input initialization when ack db is corrupted

5.10.253 - 2019-07-31

Collectord update:

  • Bug fix: collectord can pick up compressed json logs (*.gz)
  • Bug fix: too verbose warnings from the docker watcher about retries

5.10.252 - 2019-07-24

Collectord update:

  • Support for configuring the thruput (general and with annotations for container logs)
  • Support for configuring too old or too new events (general and with annotations for container logs)

5.10.251 - 2019-06-20

Collectord update:

  • Ability to configure Acknowledgement database for collectord.

5.10.250 - 2019-06-18

Supports collectorforkubernetes version 5.10.x and below

  • Security dashboard: Access: access to host via ssh, sudo, exec commands, failed access
  • Security dashboard: Audit (users and namespaces)
  • Security dashboard: Network (traffic)
  • Security dashboard: Network (connections)
  • Security dashboard: Objects (pods) - review pods with host network, age of pods, image pull policy, attached host paths, security context and restart policies
  • Review dashboard: Clusters (allocations and usage)
  • Cluster field filters
  • Base macro for overriding macros for other macros

Collectord updates:

  • Support for volatile and persistent journald storage with default configuration
  • Updated YAML configuration to include most common resources
  • Better support for overriding the sourcetype, which no longer requires updating the Splunk macros
  • Bug fix: in rare cases collectord can panic when it fails to post to HEC
  • Bug fix: better support for Kubernetes 1.14 and CRI-O storage
  • Bug fix: space characters in index annotations can break the pipeline

5.9.244 - 2019-05-20

Collectord update:

  • Bug fix: support for CRI-O in Kubernetes 1.14

5.9.240 - 2019-05-14

Supports collectorforkubernetes version 5.9.x and below

  • Visual improvements on the graphs for the number of logs and events
  • New alerts for the CPU and Memory reservation

Collectord updates:

  • Support for multiple Splunk destinations (outputs)
  • Support subdomains for annotations (to deploy multiple collectord instances)
  • Support for streaming objects from Kubernetes API to Splunk
  • Bug fix: journald input keeps fd open to the rotated files
  • Bug fix: fix in the annotation parser for the interval annotations
  • Bug fix: fix splunk url selection configuration for multiple splunk URLs

5.8.231 - 2019-04-25

  • Bug fix: Collectord usage report shows trial licenses for all instances

5.8.230 - 2019-04-22

Supports collectorforkubernetes version 5.8.x and below

  • Use multiselect filters for most dashboards and filters with possibility to input custom filters.
  • Reduce dedup usage to improve performance on dashboards.
  • Add critical pod annotations for Kubernetes …1.13, and priority class for Kubernetes 1.14…
  • Fix: statefulset dashboard does not show data with filters.
  • Add graph of number of pods per namespace on Overview dashboard.

Collectord updates:

  • Bug fix: clogging collectord output with errors when incorrect index is used.
  • Bug fix: short-lived containers can result in duplicated logs.
  • Bug fix: clogging collectord output with warnings when kernel reports incorrect VmRss size.
  • Bug fix: annotations cannot override timestamp location for fields extraction.
  • Bug fix: verify command reports Journald input in incorrect place.
  • Better support for cgroup symlinks, automatically discover correct location.

5.7.220 - 2019-03-18

Supports collectorforkubernetes version 5.7.x and below

  • Review savedsearches/alerts to support indexing delay (start searches from 2 minutes behind) and run them at more randomized times.
  • Workload dashboard - change CPU (of host) in table to real CPU
  • Fixed single value memory panel on host dashboard (missed span)
  • Use SEGMENTATION=none for stats events to use less disk space (needs to be moved to indexers)

Collectord updates:

  • Support hostname formatting with environment variables in configuration
  • New rotated file logic uses fewer file descriptors and frees rotated files quicker
  • Allow to specify a default sampling value for container logs
  • Reimplemented shutdown sequence to stop collectord faster
  • Allow to override sampling percent with annotations
  • New Input: journald

5.6.213 - 2019-03-03

  • Collectord: Fix panic, when collectord does not have access to docker socket, and information about this container does not exist on the disk.

5.6.212 - 2019-02-19

Supports collectorforkubernetes version 5.6.x and below

  • New: Alert: high CPU usage on the host.
  • Fixed: Splunk usage dashboard - charts do not show the data when the indexes used aren’t searchable by default.
  • New: Support Dark theme.
  • New: Free text search in Logs dashboard.
  • New: Add auto-refresh options to the dashboard.
  • Fixed: Revisited CPU limits and requests for Pods and Containers.
  • New: add CPU Max, Memory Max and Project/Namespace labels to the Review-Namespaces dashboard.
  • Fixed: Show deleted events

Collectord updates:

  • Fixed: auto-recovery from the corrupted write-ahead-log in acknowledgment database.
  • New: support sampling (random and hash-based) for container/application and host logs.
  • New: when running multiple collectord on one host (with different output) - count that as one licensed host, change InstanceID format.
  • Fixed: when a container is scheduled with the remove flag, lock the file until collectord processes it completely.
  • Fixed: collectord reports rare warning about unparsable uint64 max value from proc filesystem.
  • Fixed: collectord reports rare warning about unparsable line from proc/io files.
  • New: allow to include annotations in the forwarding data.
  • Fixed: if collectord cannot access the API - report the warning less often
  • Fixed: do not report docker warnings for verify command, if there is no container scheduled outside of the Kubernetes.
  • New: splunk output - allow to limit the output batch by the number of events in payload.
  • Fixed: attach namespace labels to the forwarded logs.
  • Fixed: attach openshift_namespace field to the events.
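
The difference between the random and hash-based sampling introduced in this release can be illustrated with a minimal Python sketch. This is not Collectord's implementation: the function names, the use of SHA-256 and the 100-bucket scheme are assumptions made for illustration only.

```python
import hashlib
import random


def keep_random(percent, rng=random):
    """Keep roughly `percent` of events, each decided independently."""
    return rng.random() * 100 < percent


def keep_hashed(key, percent):
    """Keep an event when the hash of `key` lands in the first `percent` buckets.

    The same key always gets the same decision, so all events sharing a key
    (for example a user or request ID) are kept or dropped together.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < percent
```

Random sampling reduces volume evenly, while hash-based sampling keeps complete sets of events for the sampled keys.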

5.5.205 - 2019-01-25

  • Collectord fix: collectord could stop sending container file logs when the original file has been truncated (using the same Node ID as previously used log file).

5.5.203 - 2019-01-25

  • Collectord fix: collectord could send an empty X-Splunk-Request-Channel header to Splunk.

5.5.202 - 2019-01-24

Supports collectorforkubernetes version 5.5.202

  • New: Dashboard Review -> namespaces. Review allocations and requests for namespaces and pods.
  • Fixed: kubernetes_stats_cpu_request_percent - is divided by the number of CPU.

Collectord updates:

  • Fixed: Interval 0 in prometheus input can crash the collectord.
  • Fixed: When both glob and match are set for the application logs, the glob pattern can block the match pattern from finding the files in the volume.

5.4.201 - 2018-12-19

Supports collectorforkubernetes version 5.4.x and below

  • Fixed: Alerts for licenses issued with AWS Subscriptions

Collectord updates:

  • Fixed: Better handling rotated files (less open fd)
  • Fixed: Events input can hang in the err loop.

5.4 - 2018-12-17

Supports collectorforkubernetes version 5.x and below

  • New: CoreDNS dashboard.
  • New: CoreDNS alerts.
  • Improved: etcd metrics representation for bucket values.
  • Compatibility update for collectord 5.4.

Collectord updates:

  • New: Attach EC2 metadata fields
  • New: Basic Auth for Proxy (License Server and Splunk)
  • Fixed: Collectord verify reports CRI-O as unsupported runtime.
  • Fixed: Rare crash on Prometheus metrics definition.
  • Fixed: Better handling of acknowledgment database corruption.
  • Fixed: When handling incorrect indexes, collectord can send an index with an empty string, which Splunk recognizes as an incorrect index

5.3 - 2018-11-19

Supports collectorforkubernetes version 5.x and below

  • Fixed: Improved Workload dashboard. Allows to filter by namespace, see all Pods in a specific namespace, filter by workload label.
  • New: Alert for showing when Collectord reports errors in Processing pipelines (as an example if it failed to extract fields).
  • New: Alert for showing when Collectord reports warnings.
  • Fixed: Add node labels filter to Storage Dashboard and Control Plane Dashboards.
  • New: Alert on lag in the indexing of the data.
  • New: Splunk Usage (License usage, number of events) report under Setup.
  • Fixed: adjusted high amount of errors to Kubernetes API dashboard to make it less verbose.
  • Fixed: misprint in the search for showing alerts
  • Fixed: lookup with alerts caused very frequent replication activity on SHC
  • Fixed: changed search time for few alerts that cause false positives with indexing lag on large installations

Collectord updates:

  • Fixed: high memory usage with Gzip compression enabled (reduced memory usage).
  • New: Allow to disable pipe.join with annotations.
  • Fixed: With a high volume of logs (10,000 events per second), Collectord can read incomplete lines, which can break JSON logs.
  • Fixed: When collectord writes a Warning that it failed to post to Splunk, it will write a Success message after retry.
  • New: Allow to hash sensitive data with annotations.
  • Fixed: Group network socket tables to reduce the amount of forwarded data (4 times reducing the amount of data)
  • Fixed: Identify when glob and match pattern require recursive directory traversal.
  • Fixed: Make it possible to add annotations for specific containers inside the same Pod.
  • New: Annotation for complete disabling of the handling and forwarding logs for containers.
  • Fixed: Performance improvements for CRI-O logs.
  • Fixed: Collectord showed few Debug messages on start.
  • Fixed: Performance improvements for log forwarding (up to 35% in high amount of logs).
  • Fixed: reduce duplication of the Kubernetes events, forwarded to Splunk.
  • Fixed: Do not generate a WARN when API Server results in 404. Usually this is caused by the owner object being deleted.
  • Fixed: Failed to parse proc name from the stat file with unpaired parentheses.

5.2 - 2018-10-15

Supports collectorforkubernetes version 5.x and below

  • New: Review/Storage dashboard based on storage metrics and PVC metrics.
  • New: predefined alerts to help you monitor the health of the clusters and performance of the applications.
  • Fixed: Performance improvements

Collectord updates:

  • New: runtime storage metrics (usage, available, inodes)
  • New: image is built on top of SCRATCH image.
  • New: verify and diag commands for troubleshooting.
  • New: support /dev/null output for logs
  • New: override source/sourcetype and index based on regexp pattern for container logs.
  • Fixed: do not send empty docker_labels
  • New: support docker JSON tags and labels
  • Fixed: allowing a new license to unblock collectord with the expired license.
  • Fixed: Prometheus parser fails to parse metrics with labels that end with a comma.
  • Fixed: Performance improvements
  • New: Prometheus parser supports basic authentication
  • Fixed: Workaround for a bug in HTTP Event Collector, that can return an incorrect index of failed event
  • New: Prometheus autodiscover support host network
  • Fixed: remove node info and limit metadata from logs
  • Fixed: documentation / default configuration update - mount /etc/localtime to allow collectord to use host tz (when not UTC)
  • Fixed: documentation / default configuration update - use dnsPolicy: ClusterFirstWithHostNet for pods mounted on host network

5.1 - 2018-09-17

Supports collectorforkubernetes version 5.x and below

  • New: Network metrics (MB, Packets, Drops and Errors) for host and containers.
  • New: Network socket tables (list of ports that containers and hosts are listening on, connections to external resources).
  • New: Network review dashboard to see the list of connections to public services and within the private network.
  • Improvement: Replace python-based lookup with macro written with eval.
  • Improvement: Visual improvement for showing when the object was Last Seen (highlighting and showing minutes ago).
  • New: discovering Prometheus metrics in Pods with annotations.
  • New: attaching pod metadata to metrics collected from prometheus metrics exposed from pods.
  • Improvement: Changed source of proc stats to proc root filesystem, to keep minimum list of unique sources.
  • New: Support for Splunk multi-threads outputs (for forwarding more than 3000 events per second).
  • Improvement: Performance improvements for Prometheus parsing.
  • Improvement: Reduce amount of metrics forwarded with proc_stats by excluding system threads.
  • Improvement: Configuration for gzip compression.
  • Improvement: Calculate checksums for the first bytes of files, to better identify new files with a reused inode.
  • Bug: Process metrics could be collected 2 times.
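
The idea behind checksumming the first bytes of a file can be sketched in Python as follows. This is only an illustration, not Collectord's code: the function names, the CRC32 checksum and the 256-byte window are assumptions.

```python
import os
import zlib


def file_fingerprint(path, head_bytes=256):
    """Return (device, inode, crc32 of the first bytes) for `path`."""
    st = os.stat(path)
    with open(path, "rb") as f:
        head = f.read(head_bytes)
    return (st.st_dev, st.st_ino, zlib.crc32(head))


def is_same_file(known, current):
    """Treat a file as already known only when all three parts match."""
    return known == current
```

When a new file receives the inode of a deleted one, the device and inode parts match but the checksum of the leading bytes usually differs, so the file is read from the beginning instead of being resumed at a stale offset.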

5.0 - 2018-09-03

Supports collectorforkubernetes version 5.x and below

  • New dashboard: Events
  • Added events panel to the Workload and Pod dashboards.
  • Labels on Workload and Hosts dashboards.
  • Auto-discover and forward Application logs from host mounts or local volumes.
  • Annotations for containers to change per container configurations (index, source, join rules, replaces and more).
  • Escaping terminal sequences from container logs.
  • Redirecting logs to /dev/null for specific patterns.
  • Replace patterns in container and application logs (hiding sensitive or not important information).
  • Support for extracting fields from the container logs, including timestamps.
  • Include Memory and CPU limits for container lists.
  • Visual updates for the panels, highlighting high CPU and Memory usages
  • Filter cgroup stats, forward only container and host metrics.
  • Support for multiple Splunk HTTP Event Collector endpoints (support fail-over and load-balancing).
  • Handle HTTP Event Collector errors with the incorrect index. Multiple options to redirect to default index, drop or wait.
  • Add retry logic to license client to reduce amount of false positive warnings.
  • Add HTTP read timeouts (handle gateway timeouts, 504).
  • Fixed: fail to parse the latest line in the JSON log.
  • Better error handling incorrect configurations.
  • Deprecating Join rules in favour of annotations.
  • Support for HTTP Event Collector client certificates.
  • Support CRI-O runtime.
  • Fixed: limit directory walkers for depth (fixing issues when directory has a mount to itself)
  • Fixed: add a limit of the maximum line size that collectord can read at once (defaults to 1Mb).
  • Fixed: the acknowledgement database now stores NodeID, DevID and a parent folder identifier. That way, if a NodeID is reused right away, the file is identified as a new one if it is in a different location.
  • Change: docker_stream field has been renamed to stream for compatibility with other container runtime.
  • Change: prometheus metrics have a default sourcetype=kubernetes_prometheus (macro supports backward compatibility)

Upgrade from version 4 to 5

4.0.24 - 2018-05-05

Supports collectorforkubernetes version 4.x and below

  • New dashboard: Cluster/Audit
  • New dashboard: Cluster/Kubernetes API Server
  • New dashboard: Cluster/Kubelet
  • New dashboard: Cluster/etcd
  • New dashboard: Cluster/Scheduler
  • New dashboard: Cluster/Controller Manager.
  • Include image name when listing containers.
  • Added syslog component to the list of host logs.
  • Fixed: Include Daemon Set on Overview dashboard, list of namespaces.

Collectord updates (4.0.171):

  • Collecting metrics from Prometheus format.
  • Add HTTP read timeouts (handle gateway timeouts, 504).
  • Correctly parse HTTP Event Responses when one of few events fail to be indexed (as an example, wrong index).
  • Performance optimizations.
  • Optimize payloads for higher write throughput.
  • Fixed: reduce the number of calls to Kubernetes API Server.
  • Fixed: fail to parse the latest line in the JSON log.
  • Better error handling incorrect configurations.
  • Failed to parse memory limits (Failed to parse memory=000k for the container).
  • Collecting Kubernetes events from the cluster once by using collectord addon.

collectorforkubernetes 4.0.172

  • Fixed: Messages “WARN … proc.go:441: Unparsable line from /rootfs/proc/X/status” caused by new Linux kernel that reports empty line in proc file system.
  • Fixed: Incorrectly parsed Limits for the Kubernetes pods. 5m and 500m both resulted in 0.500.
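
For reference, Kubernetes expresses CPU limits in cores, with the m suffix meaning millicores, so 5m and 500m must produce different values. The following minimal Python sketch shows the expected conversion; it is not Collectord's parser, the function name is hypothetical, and only plain and m-suffixed values are handled.

```python
from decimal import Decimal


def parse_cpu_cores(quantity):
    """Convert a CPU quantity such as "500m" or "2" into a number of cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # Millicores: 1000m equals one core.
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)
```

With this conversion, 5m yields 0.005 cores and 500m yields 0.5 cores.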

collectorforkubernetes 4.0.173

  • Fixed: significant memory usage with events larger than 512Kb, caused by Splunk issue SPL-156315 (unable to parse events larger than 512Kb, regression in 7.x).

collectorforkubernetes 4.0.174.180730

  • Show the index name in the output, when Splunk reports incorrect index.

3.0.23 - 2018-02-17

Supports collectorforkubernetes version 3.x and below

  • Bug: Memory view on workflow dashboard had a max limit set to 100.
  • Bug: Events view on overview dashboard had a max limit set to 100.

3.0.22 - 2018-02-07

Supports collectorforkubernetes version 3.x and below

  • Added support for containers deployed without Kubernetes.
  • Added CPU Quota, CPU Shares, Throttled and Memory Limit and Request Overlays on Container and Pod Dashboards.
  • Indexing Kubernetes events in sourcetype kubernetes_events
  • Performance improvement on Dashboards by combining multiple charts using one common search.
  • New “Review/Allocatable Resources” dashboard to track limits and requests for CPU and Memory.
  • New “Review/Privileged containers and enabled capabilities” dashboard to list all privileged containers and enabled security capabilities for containers.
  • New Overview dashboard to easily navigate within the application.
  • New Aggregated metrics dashboard for specific Workload.
  • Fixed bug on Process Dashboard, some charts did not filter by host.
  • “Setup: Collectors” now supports collectorforkubernetes images distributed via private registries.
  • “Overview: Process” dashboard did not use Span token for timechart dashboards.
  • “Top: Containers” fixed incorrect memory usage (showed double size)
  • Added alerts in application for notification about outdated collectord versions and expired licenses for collectord.
  • Hide Wait Read/Write IO panels, when this data is not available.
  • In process Dashboard show VmRSS with RssAnon, RssFile, and RssShmem.

Collectord updates:

  • Support for Splunk indexing acknowledgment.
  • Watching for Kubernetes/OpenShift events.
  • HTTP Proxy support for License server and Splunk output.
  • Allow to configure destination indices for different types of data in collectord configuration (stats, logs, host logs, proc stats and events).
  • Handling responses from HTTP Event Collector to skip invalid events (will be logged).
  • If container is running, but Kubernetes does not provide metadata, allow to wait for metadata.
  • Collect security capabilities and uid/gid.
  • For Kubernetes/OpenShift environments recognize containers scheduled outside of Pods and load metadata directly from docker.
  • Support for custom labels, specified with collectord configuration.
  • Support OpenShift/Kubernetes annotations “collectord.io/…” to configure destination indices, sourcetypes and sources for pods, workloads and namespaces.
  • Support for partial logs without join rules.
  • Bug. Use local timezone by default for local syslog files.
  • Bug. Fix small memory leak on deleted containers.
  • Bug. When collectord is failing to send data to Splunk, it is impossible to stop collectord with terminate.

2.1 - 2017-10-22

Supports collectorforkubernetes version 2.1.59.x and below

  • Implemented collectors dashboard to track number of collectors, their versions and used licenses.
  • Fallback to the process IO statistics when blkio is not available.
  • Fix IO statistics graphs, which showed the average when the sum should be used.
  • Fields extraction support for nginx ingress 0.9 and above.
  • collectord* - Improved resistance for storage failures.
  • collectord* - License checks reporting.
  • collectord* - Better support for openshift environment (default configuration).

2.0 - 2017-10-22

Supports collectorforkubernetes version 2.0.37.x and below

  • Better labels support in Dashboards. Collectord has a breaking feature, replacing format for labels from kubernetes_node_labels_LABEL1=VALUE1 to kubernetes_node_labels=[LABEL1=VALUE1,LABEL2=VALUE2].
  • Process level metrics.
  • Uptime for hosts and processes.
  • Fields extraction for kubernetes controller manager and scheduler.
  • Fields extraction and support in dashboards for main kubernetes components (setup host logs collection with collectord).
  • New top-like dashboards allow monitoring Hosts/Pods/Containers/Processes in real-time.
  • Rewritten Kubernetes Objects Dashboards with support of Events and Labels.
  • Improved dashboards navigation.
  • Support for host logs.
  • Other bugs and improvements based on user feedback.
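
The breaking change in the label format can be pictured with a small Python sketch that folds old-style per-label fields into the single kubernetes_node_labels field. The helper name and the representation of the new field as a list of LABEL=VALUE strings are assumptions for illustration; how the field is actually encoded in forwarded events is not stated here.

```python
PREFIX = "kubernetes_node_labels_"


def merge_label_fields(event):
    """Fold kubernetes_node_labels_<LABEL>=<VALUE> fields into one field."""
    labels = []
    merged = {}
    for name, value in event.items():
        if name.startswith(PREFIX):
            labels.append(f"{name[len(PREFIX):]}={value}")
        else:
            merged[name] = value
    if labels:
        # Sorted so the result does not depend on field order.
        merged["kubernetes_node_labels"] = sorted(labels)
    return merged
```

Searches written against the old per-label field names need to be updated to match values inside the combined field.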