From 9f7ee5b53e5cd373298aed5bf8938a38cc03b89c Mon Sep 17 00:00:00 2001
From: ChrsMark
Date: Fri, 17 May 2024 10:07:17 +0300
Subject: [PATCH] review improvements

Signed-off-by: ChrsMark
---
 .../index.md | 27 +++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/content/en/blog/2024/otel-collector-container-log-parser/index.md b/content/en/blog/2024/otel-collector-container-log-parser/index.md
index bde20a571523..ecec2f893c07 100644
--- a/content/en/blog/2024/otel-collector-container-log-parser/index.md
+++ b/content/en/blog/2024/otel-collector-container-log-parser/index.md
@@ -19,8 +19,21 @@ Currently, the
 [filelog receiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/v0.100.0/receiver/filelogreceiver/README.md)
 is capable of parsing container logs from Kubernetes Pods, but it requires
 [extensive configuration](https://github.com/open-telemetry/opentelemetry-helm-charts/blob/aaa70bde1bf8bf15fc411282468ac6d2d07f772d/charts/opentelemetry-collector/templates/_config.tpl#L206-L282)
-to properly parse logs according to various container runtime formats. This
-configuration complexity can be mitigated by using the corresponding
+to properly parse logs according to various container runtime formats. The
+reason is that container logs can come in various known formats (according to
+the container runtime), and hence in order to properly parse them we need to
+perform a specific set of operations:
+
+1. detect the format of the incoming logs at runtime
+2. parse each format accordingly taking into account its format-specific
+   characteristics: define if it's json or plain text and take into account the
+   timestamp format
+3. extract known metadata relying on predefined patterns.
+
+Such an advanced sequence of operations can be handled by chaining the proper
+[stanza](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/pkg/stanza)
+operators together. The end result is rather complex, and this configuration
+complexity can be mitigated by using the corresponding
 [helm chart preset](https://github.com/open-telemetry/opentelemetry-helm-charts/tree/main/charts/opentelemetry-collector#configuration-for-kubernetes-container-logs).
 However, despite having the preset, it can still be challenging for users to
 maintain and troubleshoot such advanced configurations.
@@ -43,10 +56,15 @@ First of all we need to quickly recall the different container log formats that
 we can meet out there:
 
 - Docker container logs:
+
   `{"log":"INFO: This is a docker log line","stream":"stdout","time":"2024-03-30T08:31:20.545192187Z"}`
+
 - cri-o logs:
+
   `2024-04-13T07:59:37.505201169-05:00 stdout F This is a cri-o log line!`
+
 - Containerd logs:
+
   `2024-04-22T10:27:25.813799277Z stdout F This is an awesome containerd log line!`
 
 We can notice that cri-o and containerd log formats are quite similar (both
@@ -162,6 +180,11 @@ respective
 [GitHub issue](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/31959)
 and its related/linked PR.
 
+Last but not least, we should mention that with the example of the specific
+container parser we can notice the room for improvement that exists and how we
+could optimize further for popular technologies with known log formats in the
+future.
+
 ## Conclusion: container logs parsing is now easier with filelog receiver
 
 Eager to learn more about the container parser? Visit the official