Splunk field extraction for container logs
For container logs forwarded by Collectord, you can define field extraction rules that target image names, container names, or a combination of both.
All container logs have a source format that includes container ID, container name, image name, pod name, namespace, and stream.
/openshift/{openshift_container_id}/{openshift_container_name}/{openshift_image_name}/{openshift_pod_name}/{openshift_namespace}.{docker_stream}
Using this knowledge, you can create field extraction rules for a specific image or container. The source pattern supports glob-style matching:
a * wildcard matches within a single part of the path, and ... skips multiple parts of the path.
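To make the two wildcards concrete, here is a minimal Python sketch that approximates how a source pattern is matched. It assumes that * stays within one path segment and that ... can span several segments; the helper name source_pattern_to_regex and the sample source paths are hypothetical and only serve as illustration. Splunk's own matching may differ in details.

```python
import re


def source_pattern_to_regex(pattern: str) -> re.Pattern:
    """Approximate a props.conf source:: pattern with an anchored regex.

    Only `*` and `...` are translated; every other character is matched literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("...", i):
            parts.append(".*")       # `...` may span any number of path segments
            i += 3
        elif pattern[i] == "*":
            parts.append("[^/]*")    # `*` never crosses a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(parts) + "$")


# Hypothetical sources that follow the format described above.
nginx_source = "/openshift/0a1b2c/web/nginx:1.25/web-7d9f-abcde/shop.stdout"
redis_source = "/openshift/0a1b2c/cache/redis:7/cache-0/shop.stdout"

rx = source_pattern_to_regex("/openshift/.../nginx:*/*/*")
print(bool(rx.match(nginx_source)))
print(bool(rx.match(redis_source)))
```

In this sketch the pattern matches only the first source, because ... absorbs the container ID and container name and the remaining segments must line up with the nginx image, pod name, and namespace with stream.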
As an example, you can specify field extraction for nginx containers in props.conf. The stanza uses ... to skip
the container ID and container name, and wildcard characters for the image tag, the pod name, and the namespace
with Docker stream. This field extraction applies to all containers created from the nginx Docker image.
[source::/openshift/.../nginx:*/*/*]
EXTRACT-nginx-ingress-controller-http = ^(?P<remote_addr>[^ ]+)\s+\-\s+\[(?P<proxy_add_x_forwarded_for>[^\]]+)\]\s+\-\s+(?P<remote_user>[^ ]+)\s+\[(?P<time_local>[^\]]+)[^"\n]*"(?P<request>[^"]+)"\s+(?P<status>\d+)\s+(?P<body_bytes_sent>\d+)\s+"(?P<http_referer>[^"]+)"\s+"(?P<http_user_agent>[^"]+)"\s+(?P<request_length>\d+)\s+(?P<request_time>[^ ]+)\s+\[(?P<proxy_upstream_name>[^\]]+)]\s+(?P<upstream_addr>[^\s]+)\s+(?P<upstream_response_length>\d+)\s+(?P<upstream_response_time>[^\s]+)\s+(?P<upstream_status>\d+)$
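Before deploying an extraction like this, it can help to try the regular expression against a sample event outside Splunk. The sketch below does that with Python's re module, which accepts the same (?P<name>...) named-group syntax. The sample log line is made up for illustration and assumes the nginx ingress controller access log layout that the expression targets; it is not taken from a real deployment.

```python
import re

# The same expression as in the EXTRACT setting, split across lines for readability.
NGINX_INGRESS_RE = re.compile(
    r'^(?P<remote_addr>[^ ]+)\s+\-\s+\[(?P<proxy_add_x_forwarded_for>[^\]]+)\]'
    r'\s+\-\s+(?P<remote_user>[^ ]+)\s+\[(?P<time_local>[^\]]+)[^"\n]*'
    r'"(?P<request>[^"]+)"\s+(?P<status>\d+)\s+(?P<body_bytes_sent>\d+)'
    r'\s+"(?P<http_referer>[^"]+)"\s+"(?P<http_user_agent>[^"]+)"'
    r'\s+(?P<request_length>\d+)\s+(?P<request_time>[^ ]+)'
    r'\s+\[(?P<proxy_upstream_name>[^\]]+)]\s+(?P<upstream_addr>[^\s]+)'
    r'\s+(?P<upstream_response_length>\d+)\s+(?P<upstream_response_time>[^\s]+)'
    r'\s+(?P<upstream_status>\d+)$'
)

# Hypothetical access log line (assumed layout, invented values).
sample = (
    '10.1.2.3 - [10.1.2.3] - - [12/Mar/2024:10:15:32 +0000] '
    '"GET /healthz HTTP/1.1" 200 612 "-" "curl/8.4.0" 83 0.004 '
    '[shop-web-80] 10.128.0.15:8080 612 0.004 200'
)

match = NGINX_INGRESS_RE.match(sample)
fields = match.groupdict() if match else {}

# Print each extracted field, one per line.
for name, value in fields.items():
    print(f"{name} = {value}")
```

If the expression does not match your own log lines, fields stays empty, which usually means the container uses a different log format than the one assumed here.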
You can also override source and source type with annotations. See Splunk Indexes.