
FluentD errors on a short living pod and seems to lose logging at the error event #3348

Closed
Mosibi opened this issue Apr 23, 2021 · 3 comments

Comments

@Mosibi
Mosibi commented Apr 23, 2021

Describe the bug
We test our logging framework after each nightly cluster installation, and we often see that logs for some pods are missing. When that happens, it usually follows this error message:

2021-04-23 03:35:55 +0000 [warn]: stat() for /var/log/containers/es-query-nwbx8_test_es-query-4c8788baaa3785e06488d809eeee67a9ab459d9534059a76b402a807651747bc.log failed with ENOENT. Drop tail watcher for now.
2021-04-23 03:35:55.936433049 +0000 fluent.warn: {"message":"stat() for /var/log/containers/es-query-nwbx8_test_es-query-4c8788baaa3785e06488d809eeee67a9ab459d9534059a76b402a807651747bc.log failed with ENOENT. Drop tail watcher for now."}
2021-04-23 03:35:55 +0000 [error]: [in_tail_container_logs] Unexpected error raised. Stopping the timer. title=:in_tail_refresh_watchers error_class=NoMethodError error="undefined method `each_value' for #<Fluent::Plugin::TailInput::TargetInfo:0x00007f7ebd3385b8>\nDid you mean?  each_slice"
2021-04-23 03:35:55.936735140 +0000 fluent.error: {"title":"in_tail_refresh_watchers","error":"#<NoMethodError: undefined method `each_value' for #<Fluent::Plugin::TailInput::TargetInfo:0x00007f7ebd3385b8>\nDid you mean?  each_slice>","message":"[in_tail_container_logs] Unexpected error raised. Stopping the timer. title=:in_tail_refresh_watchers error_class=NoMethodError error=\"undefined method `each_value' for #<Fluent::Plugin::TailInput::TargetInfo:0x00007f7ebd3385b8>\\nDid you mean?  each_slice\""}
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:428:in `stop_watchers'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:422:in `rescue in block in start_watchers'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:416:in `block in start_watchers'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:396:in `each_value'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:396:in `start_watchers'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin/in_tail.rb:359:in `refresh_watchers'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin_helper/timer.rb:80:in `on_timer'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/cool.io-1.7.0/lib/cool.io/loop.rb:88:in `run_once'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/cool.io-1.7.0/lib/cool.io/loop.rb:88:in `run'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin_helper/event_loop.rb:93:in `block in start'
  2021-04-23 03:35:55 +0000 [error]: /usr/local/bundle/gems/fluentd-1.12.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2021-04-23 03:35:55 +0000 [error]: [in_tail_container_logs] Timer detached. title=:in_tail_refresh_watchers
2021-04-23 03:35:55.938884400 +0000 fluent.error: {"title":"in_tail_refresh_watchers","message":"[in_tail_container_logs] Timer detached. title=:in_tail_refresh_watchers"}

When this happens, logs from a job/pod that ran before this event are not present in Elasticsearch, so it seems that FluentD is still processing those logs and stops when this error occurs.
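The stack trace points at `stop_watchers` (`in_tail.rb:428`) calling `each_value` on a single `TargetInfo` where a Hash of `path => TargetInfo` was apparently expected. A minimal stand-in sketch of that failure mode — using a simplified, hypothetical `TargetInfo` and `stop_watchers`, not fluentd's actual code:

```ruby
# Hypothetical stand-in for Fluent::Plugin::TailInput::TargetInfo
# (the real class in fluentd 1.12.x is also a small path/inode struct).
TargetInfo = Struct.new(:path, :ino)

# Simplified sketch of stop_watchers: it iterates a Hash of
# path => TargetInfo via each_value.
def stop_watchers(targets)
  stopped = []
  targets.each_value { |t| stopped << t.path }  # raises NoMethodError if given a bare TargetInfo
  stopped
end

info = TargetInfo.new("/var/log/containers/app.log", 42)

begin
  stop_watchers(info)                  # bug: a bare TargetInfo, not a Hash
rescue NoMethodError => e
  puts e.message                       # undefined method `each_value' for #<struct TargetInfo ...>
end

stop_watchers({ info.path => info })   # expected shape: Hash keyed by path
```

Because the `NoMethodError` escapes the `in_tail_refresh_watchers` timer callback, fluentd detaches the timer ("Timer detached"), so no new container log files are picked up afterwards — which would explain the missing logs.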

To Reproduce
Run a job that generates logging, and when that job is done, run a second job (pod) that checks whether the generated logging is present in Elasticsearch.

Expected behavior
No loss of logging events

Your Environment

# fluentd --version
fluentd 1.12.2

The container is running within kubernetes, running on RHEL 8.2

@ashie
Member
ashie commented Apr 23, 2021

This seems to be the same issue as #3327 and should be fixed by v1.12.3.
I released it just a while ago.
Please try it.

@ashie ashie closed this as completed Apr 23, 2021
@ashie
Member

ashie commented Apr 23, 2021

> I've released it just a while ago.

The Docker image isn't released yet, though.

@Mosibi
Author

Mosibi commented Apr 23, 2021

@ashie Thanks, I've just built our own version of the container, and tonight we will run a full test with it.
