Monitoring Kubernetes - Release History
5.6.212 - 2019-02-19
Requires collectorforkubernetes version 5.6.212 or above
- New: Alert: high CPU usage on the host.
- Fixed: Splunk usage dashboard - charts do not show the data when the indexes in use aren't searchable by default.
- New: Support Dark theme.
- New: Free text search in Logs dashboard.
- New: Add auto-refresh options to the dashboard.
- Fixed: Revisited CPU limits and requests for Pods and Containers.
- New: add CPU Max, Memory Max and Project/Namespace labels to the Review-Namespaces dashboard.
- Fixed: Show deleted events
Collectord updates:
- Fixed: auto-recovery from the corrupted write-ahead-log in acknowledgment database.
- New: support sampling (random and hash-based) for container/application and host logs.
- New: when running multiple collectord instances on one host (with different outputs) - count them as one licensed host, change InstanceID format.
- Fixed: when a container is scheduled with the remove flag, lock the file until collectord has processed it completely.
- Fixed: collectord reports rare warning about unparsable uint64 max value from proc filesystem.
- Fixed: collectord reports rare warning about unparsable line from proc/io files.
- New: allow including annotations in the forwarded data.
- Fixed: if collectord cannot access the API - report the warning less often
- Fixed: do not report docker warnings for verify command, if there is no container scheduled outside of the Kubernetes.
- New: splunk output - allow limiting the output batch by the number of events in the payload.
- Fixed: attach namespace labels to the forwarded logs.
- Fixed: attach openshift_namespace field to the events.
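The two sampling modes mentioned in the Collectord updates above (random and hash-based) can be sketched as follows. This is an illustration only, not Collectord's implementation: the function names, the percentage parameter, and the choice of SHA-256 are assumptions. Random sampling keeps each event independently with a given probability, while hash-based sampling decides from a hash of a key (for example a user or request ID), so all events sharing that key are kept or dropped together.

```python
import hashlib
import random


def keep_random(percent: float, rng: random.Random) -> bool:
    """Random sampling: keep an event with probability percent/100."""
    return rng.random() * 100.0 < percent


def keep_by_hash(key: str, percent: float) -> bool:
    """Hash-based sampling: the decision depends only on the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Map the first 8 bytes of the digest to a bucket in [0, 10000).
    bucket = int.from_bytes(digest[:8], "big") % 10000
    return bucket < percent * 100


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    lines = [f"user={i % 5} msg={i}" for i in range(20)]
    # Random: roughly half of the lines, chosen independently.
    print([line for line in lines if keep_random(50, rng)])
    # Hash-based: every line of a given user is either kept or dropped.
    print([line for line in lines if keep_by_hash(line.split()[0], 40)])
```

Hash-based sampling is the usual choice when related events have to stay together, since the same key always yields the same decision.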
5.5.205 - 2019-01-25
- Collectord fix: collectord could stop sending container file logs when the original file has been truncated (using the same Node ID as previously used log file).
5.5.203 - 2019-01-25
- Collectord fix: collectord could send an empty `X-Splunk-Request-Channel` header to Splunk.
5.5.202 - 2019-01-24
Requires collectorforkubernetes version 5.5.202
- New: Dashboard Review -> namespaces. Review allocations and requests for namespaces and pods.
- Fixed: kubernetes_stats_cpu_request_percent - is divided by the number of CPU.
Collectord updates:
- Fixed: Interval 0 in prometheus input can crash the collectord.
- Fixed: When both glob and match are set for the application logs, the glob pattern can block the match pattern from finding the files in the volume.
5.4.201 - 2018-12-19
Requires collectorforkubernetes version 5.4.201 or above
- Fixed: Alerts for licenses issued with AWS Subscriptions
Collectord updates:
- Fixed: Better handling of rotated files (fewer open file descriptors)
- Fixed: Events input can hang in the err loop.
5.4 - 2018-12-17
Requires collectorforkubernetes version 5.4 or above
- New: CoreDNS dashboard.
- New: CoreDNS alerts.
- Improved: etcd metrics representation for bucket values.
- Compatibility update for collectord 5.4.
Collectord updates:
- New: Attach EC2 metadata fields
- New: Basic Auth for Proxy (License Server and Splunk)
- Fixed: Collectord verify reports CRI-O as unsupported runtime.
- Fixed: Rare crash on Prometheus metrics definition.
- Fixed: Better handling of acknowledgment database corruption.
- Fixed: When handling incorrect indexes, collectord could send an index with an empty string, which Splunk recognizes as an incorrect index
5.3 - 2018-11-19
Requires collectorforkubernetes version 5.3 or above
- Fixed: Improved Workload dashboard. Allows filtering by namespace, viewing all Pods in a specific namespace, and filtering by workload label.
- New: Alert for showing when Collectord reports errors in Processing pipelines (as an example if it failed to extract fields).
- New: Alert for showing when Collectord reports warnings.
- Fixed: Add node labels filter to Storage Dashboard and Control Plane Dashboards.
- New: Alert when there is a lag in indexing the data.
- New: Splunk Usage (License usage, number of events) report under Setup.
- Fixed: adjusted high amount of errors to Kubernetes API dashboard to make it less verbose.
- Fixed: misprint in the search for showing alerts
- Fixed: lookup with alerts caused very frequent replication activity on SHC
- Fixed: changed search time for a few alerts that caused false positives with indexing lag on large installations
Collectord updates:
- Fixed: high memory usage with Gzip compression enabled (reduced memory usage).
- New: Allow disabling pipe.join with annotations.
- Fixed: Under a high volume of logs (10,000 events per second), Collectord could read lines only partially, which could break JSON logs.
- Fixed: When collectord writes a Warning that it failed to post to Splunk, it will write a Success message after retry.
- New: Allow hashing sensitive data with annotations.
- Fixed: Group network socket tables to reduce the amount of forwarded data (a 4x reduction)
- Fixed: Identify when glob and match pattern require recursive directory traversal.
- Fixed: Make it possible to add annotations for specific containers inside the same Pod.
- New: Annotation to completely disable handling and forwarding of logs for containers.
- Fixed: Performance improvements for CRI-O logs.
- Fixed: Collectord showed a few Debug messages on start.
- Fixed: Performance improvements for log forwarding (up to 35% under a high volume of logs).
- Fixed: reduce duplication of the Kubernetes events, forwarded to Splunk.
- Fixed: Do not generate a WARN when the API Server responds with 404. Usually this is caused by the owner object being deleted.
- Fixed: Failed to parse proc name from the stat file with unpaired parentheses.
5.2 - 2018-10-15
Requires collectorforkubernetes version 5.2 or above
- New: Review/Storage dashboard based on storage metrics and PVC metrics.
- New: predefined alerts to help you monitor the health of the clusters and performance of the applications.
- Fixed: Performance improvements
Collector updates:
- New: runtime storage metrics (usage, available, inodes)
- New: image is built on top of the `SCRATCH` image.
- New: `verify` and `diag` commands for troubleshooting.
- New: support `/dev/null` output for logs
- New: override source/sourcetype and index based on a regexp pattern for container logs.
- Fixed: do not send empty docker_labels
- New: support docker JSON tags and labels
- Fixed: allowing a new license to unblock collector with the expired license.
- Fixed: Prometheus parser fails to parse metrics with labels that end with a comma.
- Fixed: Performance improvements
- New: Prometheus parser supports basic authentication
- Fixed: Workaround for a bug in HTTP Event Collector, that can return an incorrect index of failed event
- New: Prometheus autodiscover support host network
- Fixed: remove node info and limit metadata from logs
- Fixed: documentation / default configuration update - mount `/etc/localtime` to allow collector to use host tz (when not UTC)
- Fixed: documentation / default configuration update - use `dnsPolicy: ClusterFirstWithHostNet` for pods mounted on host network
5.1 - 2018-09-17
Requires collectorforkubernetes version 5.1 or above
- New: Network metrics (MB, Packets, Drops and Errors) for host and containers.
- New: Network socket tables (list of ports that containers and hosts are listening on, connections to external resources).
- New: Network review dashboard to see the list of connections to public services and within the private network.
- Improvement: Replace python-based lookup with macro written with eval.
- Improvement: Visual improvement for showing when the object was Last Seen (highlighting and showing minutes ago).
- New: discovering Prometheus metrics in Pods with annotations.
- New: attaching pod metadata to metrics collected from prometheus metrics exposed from pods.
- Improvement: Changed source of proc stats to proc root filesystem, to keep minimum list of unique sources.
- New: Support for Splunk multi-threads outputs (for forwarding more than 3000 events per second).
- Improvement: Performance improvements for Prometheus parsing.
- Improvement: Reduce amount of metrics forwarded with proc_stats by excluding system threads.
- Improvement: Configuration for gzip compression.
- Improvement: Calculate checksums for the first bytes of files, to better identify new files with a reused inode.
- Bug: Process metrics could be collected 2 times.
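The checksum-based file identification mentioned in the list above can be sketched like this. It is a minimal illustration under assumptions, not the collector's code: the 256-byte prefix length, CRC32 as the checksum, and the `FileIdentity` name are all hypothetical. The idea is that a recorded inode number alone is not enough, because file systems reuse inode numbers, so the identity also includes a checksum of the file's first bytes.

```python
import os
import tempfile
import zlib
from typing import NamedTuple


class FileIdentity(NamedTuple):
    inode: int
    prefix_crc: int


def identify(path: str, prefix_len: int = 256) -> FileIdentity:
    """Build an identity from the inode and a CRC32 of the first bytes."""
    with open(path, "rb") as f:
        prefix = f.read(prefix_len)
    return FileIdentity(os.stat(path).st_ino, zlib.crc32(prefix))


def is_same_file(known: FileIdentity, current: FileIdentity) -> bool:
    """A reused inode with different leading bytes counts as a new file."""
    return known == current


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        with open(path, "wb") as f:
            f.write(b"a" * 300)
        known = identify(path)
        # Rewriting in place keeps the inode but changes the leading bytes.
        with open(path, "wb") as f:
            f.write(b"b" + b"a" * 299)
        print(is_same_file(known, identify(path)))
```

One caveat of this sketch: a file shorter than the prefix length changes its checksum as it grows, so a real implementation would need to handle short files separately.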
5.0 - 2018-09-03
Requires collectorforkubernetes version 5.0 or above
- New dashboard: Events
- Added events panel to the Workload and Pod dashboards.
- Labels on Workload and Hosts dashboards.
- Auto-discover and forward Application logs from host mounts or local volumes.
- Annotations for containers to change per container configurations (index, source, join rules, replaces and more).
- Escaping terminal sequences from container logs.
- Redirecting logs to /dev/null for specific patterns.
- Replace patterns in container and application logs (hiding sensitive or not important information).
- Support for extracting fields from the container logs, including timestamps.
- Include Memory and CPU limits for container lists.
- Visual updates for the panels, highlighting high CPU and Memory usages
- Filter cgroup stats, forward only container and host metrics.
- Support for multiple Splunk HTTP Event Collector endpoints (support fail-over and load-balancing).
- Handle HTTP Event Collector errors with the incorrect index. Multiple options to redirect to default index, drop or wait.
- Add retry logic to license client to reduce amount of false positive warnings.
- Add HTTP read timeouts (handle gateway timeouts, 504).
- Fixed: fail to parse the latest line in the JSON log.
- Better error handling for incorrect configurations.
- Deprecating Join rules in favour of annotations.
- Support for HTTP Event Collector client certificates.
- Support CRI-O runtime.
- Fixed: limit directory walkers for depth (fixing issues when directory has a mount to itself)
- Fixed: add a limit of the maximum line size that collector can read at once (defaults to 1Mb).
- Fixed: the acknowledgement database now stores NodeID, DevID and a parent folder identifier. That way, if a NodeID is reused right away, the file is identified as a new one when it is in a different location.
- Change: `docker_stream` field has been renamed to `stream` for compatibility with other container runtimes.
- Change: prometheus metrics has default sourcetype=kubernetes_prometheus (macro supports backward compatibility)
4.0.24 - 2018-05-05
Requires collectorforkubernetes version 4.0 or above
- New dashboard: Cluster/Audit
- New dashboard: Cluster/Kubernetes API Server
- New dashboard: Cluster/Kubelet
- New dashboard: Cluster/etcd
- New dashboard: Cluster/Scheduler
- New dashboard: Cluster/Controller Manager.
- Include the image name when listing containers.
- Added syslog component to the list of host logs.
- Fixed: Include Daemon Set on Overview dashboard, list of namespaces.
Collector updates (4.0.171):
- Collecting metrics from Prometheus format.
- Add HTTP read timeouts (handle gateway timeouts, 504).
- Correctly parse HTTP Event Responses when one of few events fail to be indexed (as an example, wrong index).
- Performance optimizations.
- Optimize payloads for higher write throughput.
- Fixed: reduce the number of calls to Kubernetes API Server.
- Fixed: fail to parse the latest line in the JSON log.
- Better error handling for incorrect configurations.
- Failed to parse memory limits (Failed to parse memory=000k for the container).
- Collecting Kubernetes events from the cluster once by using collector addon.
collectorforkubernetes 4.0.172
- Fixed: Messages "WARN ... proc.go:441: Unparsable line from /rootfs/proc/X/status" caused by new Linux kernel that reports empty line in proc file system.
- Fixed: Incorrectly parsed Limits for the Kubernetes pods. `5m` and `500m` both resulted in `0.500`.
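To illustrate why `5m` and `500m` must not produce the same value: in Kubernetes resource quantities the `m` suffix means milli-units, so `5m` is 0.005 CPU and `500m` is 0.5 CPU. The sketch below handles only plain numbers and the `m` suffix. The function name is hypothetical, and this is not the collector's parser.

```python
from decimal import Decimal


def parse_cpu_quantity(value: str) -> Decimal:
    """Parse a CPU quantity such as "500m", "2", or "0.5" into CPU cores."""
    value = value.strip()
    if value.endswith("m"):
        # Milli-CPU: divide the numeric part by 1000.
        return Decimal(value[:-1]) / Decimal(1000)
    return Decimal(value)


if __name__ == "__main__":
    for raw in ("5m", "500m", "2"):
        print(raw, "->", parse_cpu_quantity(raw))
```

`Decimal` is used here instead of `float` so that small values such as `5m` are not distorted by rounding or fixed-width formatting.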
collectorforkubernetes 4.0.173
- Fixed: significant memory usage with events larger than 512Kb, caused by Splunk issue SPL-156315 (unable to parse events larger than 512Kb, a regression in 7.x).
collectorforkubernetes 4.0.174.180730
- Show the index name in the output, when Splunk reports incorrect index.
3.0.23 - 2018-02-17
Requires collectorforkubernetes version 3.0 or above
- Bug: Memory view on workflow dashboard had a max limit set to 100.
- Bug: Events view on overview dashboard had a max limit set to 100.
3.0.22 - 2018-02-07
Requires collectorforkubernetes version 3.0 or above
- Added support for containers deployed without Kubernetes.
- Added CPU Quota, CPU Shares, Throttled and Memory Limit and Request Overlays on Container and Pod Dashboards.
- Indexing Kubernetes events in sourcetype kubernetes_events
- Performance improvement on Dashboards by combining multiple charts using one common search.
- New "Review/Allocatable Resources" dashboard to track limits and requests for CPU and Memory.
- New "Review/Privileged containers and enabled capabilities" dashboard to list all privileged containers and enabled security capabilities for containers.
- New Overview dashboard to easily navigate within the application.
- New Aggregated metrics dashboard for specific Workload.
- Fixed bug on Process Dashboard, some charts did not filter by host.
- "Setup: Collectors" now supports collectorforkubernetes images distributed via private registries.
- "Overview: Process" dashboard did not use Span token for timechart dashboards.
- "Top: Containers" fixed incorrect memory usage (showed double size)
- Added alerts in application for notification about outdated collector versions and expired licenses for collector.
- Hide Wait Read/Write IO panels, when this data is not available.
- In process Dashboard show VmRSS with RssAnon, RssFile, and RssShmem.
Collector updates:
- Support for Splunk indexing acknowledgment.
- Watching for Kubernetes/OpenShift events.
- HTTP Proxy support for License server and Splunk output.
- Allow configuring destination indices for different types of data in collector configuration (stats, logs, host logs, proc stats and events).
- Handling responses from HTTP Event Collector to skip invalid events (will be logged).
- If a container is running but Kubernetes does not provide metadata, allow waiting for the metadata.
- Collect security capabilities and uid/gid.
- For Kubernetes/OpenShift environments recognize containers scheduled outside of Pods and load metadata directly from docker.
- Support for custom labels, specified with collector configuration.
- Support OpenShift/Kubernetes annotations "collectord.io/..." to configure destination indices, sourcetypes and sources for pods, workloads and namespaces.
- Support for partial logs without join rules.
- Bug. Use local timezone by default for local syslog files.
- Bug. Fix small memory leak on deleted containers.
- Bug. When collector is failing to send data to Splunk, impossible to stop collector with terminate.
2.1 - 2017-10-22
Requires collectorforkubernetes version 2.1.59.171210 or above
- Implemented collectors dashboard to track number of collectors, their versions and used licenses.
- Fallback to the process IO statistics when blkio is not available.
- Fix IO statistic graphs, showed average, when sum should be used.
- Fields extraction support for nginx ingress 0.9 and above.
- collector* - Improved resistance for storage failures.
- collector* - License checks reporting.
- collector* - Better support for openshift environment (default configuration).
2.0 - 2017-10-22
Requires collectorforkubernetes version 2.0.37.171023 or above
- Better labels support in Dashboards.
Collector has a breaking change: the format for labels changes from `kubernetes_node_labels_LABEL1=VALUE1` to `kubernetes_node_labels=[LABEL1=VALUE1,LABEL2=VALUE2]`.
- Process level metrics.
- Uptime for hosts and processes.
- Fields extraction for kubernetes controller manager and scheduler.
- Fields extraction and support in dashboards for main kubernetes components (setup host logs collection with collector).
- New top-like dashboards allow monitoring Hosts/Pods/Containers/Processes in real-time.
- Rewritten Kubernetes Objects Dashboards with support of Events and Labels.
- Improved dashboards navigation.
- Support for host logs.
- Other bugs and improvements based on user feedback.
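As an illustration of the label format change in 2.0, the sketch below converts fields in the old per-label form into the new single-field form. The label names follow the placeholders used in the release note. The function name and the assumption that fields arrive as a dictionary are hypothetical, and the exact serialization the collector emits may differ.

```python
PREFIX = "kubernetes_node_labels_"


def merge_node_labels(fields: dict) -> dict:
    """Fold kubernetes_node_labels_<NAME>=<VALUE> fields into one list field."""
    # Keep every field that is not an old-style per-label field.
    merged = {k: v for k, v in fields.items() if not k.startswith(PREFIX)}
    # Collect the old-style fields as NAME=VALUE strings, sorted by name.
    labels = [
        f"{key[len(PREFIX):]}={value}"
        for key, value in sorted(fields.items())
        if key.startswith(PREFIX)
    ]
    if labels:
        merged["kubernetes_node_labels"] = labels
    return merged


if __name__ == "__main__":
    old = {
        "host": "node-1",
        "kubernetes_node_labels_LABEL1": "VALUE1",
        "kubernetes_node_labels_LABEL2": "VALUE2",
    }
    print(merge_node_labels(old))
```

A single multi-value field keeps the set of field names stable, whereas the old form created a new field name for every label.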