If the fluentd aggregator is down then client fluentd does not start #1280

Closed
rolele opened this issue Oct 17, 2016 · 1 comment

Comments

rolele commented Oct 17, 2016

I am using the latest fluentd Docker image.
I have pretty much the same infrastructure as this, without SSL.

If the aggregator is down, I cannot start my fluentd clients:

[root@manager0 vagrant]# docker logs 3edae51d2704
2016-10-17 10:58:39 +0000 [info]: reading config file path="/fluentd/etc/fluent.conf"
2016-10-17 10:58:39 +0000 [info]: starting fluentd-0.12.29
2016-10-17 10:58:39 +0000 [info]: gem 'fluentd' version '0.12.29'
2016-10-17 10:58:39 +0000 [info]: adding match pattern="*.*" type="copy"
2016-10-17 10:58:39 +0000 [error]: unexpected error error="getaddrinfo: Name does not resolve"
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:504:in `getaddrinfo'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:504:in `resolve_dns!'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:490:in `resolved_host'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:462:in `initialize'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:146:in `new'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:146:in `block in configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:121:in `each'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_forward.rb:121:in `configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_copy.rb:48:in `block in configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_copy.rb:39:in `each'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/plugin/out_copy.rb:39:in `configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/agent.rb:133:in `add_match'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/agent.rb:64:in `block in configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/agent.rb:57:in `each'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/agent.rb:57:in `configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/root_agent.rb:86:in `configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/engine.rb:129:in `configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/engine.rb:103:in `run_configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:489:in `run_configure'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:160:in `block in start'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:366:in `main_process'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:339:in `block in supervise'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:338:in `fork'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:338:in `supervise'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/supervisor.rb:156:in `start'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/lib/fluent/command/fluentd.rb:173:in `<top (required)>'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
  2016-10-17 10:58:39 +0000 [error]: /usr/lib/ruby/gems/2.3.0/gems/fluentd-0.12.29/bin/fluentd:5:in `<top (required)>'
  2016-10-17 10:58:39 +0000 [error]: /usr/bin/fluentd:23:in `load'
  2016-10-17 10:58:39 +0000 [error]: /usr/bin/fluentd:23:in `<main>'
2016-10-17 10:58:39 +0000 [info]: process finished code=256
2016-10-17 10:58:39 +0000 [warn]: process died within 1 second. exit.

The fluentd client configuration:

<source>
  type forward
  port 24224
  bind 0.0.0.0
</source>

<match *.*>
  type copy
  <store>
    type forward
    <server>
      host {{fluentd_aggregator_name}}
      port 24225
    </server>
  </store>
  <store>
    type stdout
  </store>
</match>

The problem happens with or without the copy plugin.
What I expect is that fluentd starts normally even if it cannot connect to the endpoint:

  • the output to stdout still works
  • the connection to my aggregator is retried periodically and starts working when possible

Instead, fluentd exited with code 256.

Am I doing something wrong?
Exiting fluentd because one endpoint is down is a little too risky for production.

repeatedly (Member) commented

> What I am expecting is that fluentd start, the output to stdout still happen and the connection to my aggregator is retried periodically.

out_forward retries the buffer flush when an error happens during the plugin running phase.
In your case, the error happens during the configuration phase, not the running phase: the stack trace shows resolve_dns! being called from configure.
This behaviour is for safety, because Fluentd can't judge what the actual error is at that point, and it is hard to recover from.
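
To make the distinction concrete, here is a rough sketch of resolving a host name at construction time (the configuration phase) versus on each use (the running phase). It is not fluentd's actual code: `EagerNode`, `LazyNode`, `SYSTEM_RESOLVER`, the `down` lambda, and the host name `aggregator.example` are hypothetical names chosen for illustration. Only `Socket.getaddrinfo` is a real Ruby call, the same one that appears at the top of the stack trace above.

```ruby
require 'socket'

# Hypothetical stand-in for a <server> entry that resolves DNS immediately.
class EagerNode
  attr_reader :addr

  def initialize(host, resolver)
    # A DNS failure here surfaces while the configuration is being loaded,
    # before any retry logic is running.
    @addr = resolver.call(host)
  end
end

# Hypothetical variant that defers resolution until the address is needed.
class LazyNode
  def initialize(host, resolver)
    @host = host
    @resolver = resolver
  end

  # A DNS failure here happens at flush time, where it can be rescued
  # and the flush retried later.
  def addr
    @resolver.call(@host)
  end
end

# Real lookup via the standard library; element 3 of each entry is the address.
SYSTEM_RESOLVER = lambda do |host|
  Socket.getaddrinfo(host, nil, nil, Socket::SOCK_STREAM).first[3]
end

# Fake resolver simulating an aggregator whose name cannot be resolved.
down = ->(_host) { raise SocketError, 'getaddrinfo: Name does not resolve' }

begin
  EagerNode.new('aggregator.example', down)
rescue SocketError => e
  puts "configuration phase failed: #{e.message}"
end

node = LazyNode.new('aggregator.example', down) # construction succeeds
begin
  node.addr
rescue SocketError => e
  puts "flush failed, can be retried later: #{e.message}"
end
```

The resolver is passed in as a lambda so the failure can be simulated without touching the network; passing `SYSTEM_RESOLVER` instead would perform a real lookup.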

We have one specific patch for out_forward, but it is not merged yet: #735
