LogStash::Pipeline when collecting metrics in the pipeline global metric populates the different metrics #7724

Open
jsvd opened this issue Jul 18, 2017 · 10 comments


@jsvd
Member

jsvd commented Jul 18, 2017

Examples:

https://travis-ci.org/elastic/logstash/jobs/254799125#L7518-L7544

https://logstash-ci.elastic.co/job/elastic+logstash+master+multijob-unix-compatibility/os=fedora/26/console

Failures:

  1) LogStash::Pipeline when collecting metrics in the pipeline global metric populates the different metrics
     Failure/Error: end.to be_truthy

       expected: truthy value
            got: false
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/handler.rb:13:in `block in handle_matcher'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/handler.rb:10:in `block in handle_matcher'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/handler.rb:9:in `handle_matcher'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/target.rb:30:in `block in to'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/target.rb:44:in `block in with_wait'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait.rb:28:in `with_wait'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/target.rb:44:in `with_wait'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait/target.rb:30:in `to'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/logstash-core/spec/logstash/pipeline_spec.rb:805:in `block in (root)'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait.rb:46:in `block in /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/vendor/bundle/jruby/2.3.0/gems/rspec-wait-0.0.9/lib/rspec/wait.rb'
     # /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/lib/bootstrap/rspec.rb:13:in `<main>'

Finished in 6 minutes 39 seconds (files took 12.53 seconds to load)
3051 examples, 1 failure, 3 pending

Failed examples:

rspec /var/lib/jenkins/workspace/elastic+logstash+master+multijob-unix-compatibility/os/fedora/logstash-core/spec/logstash/pipeline_spec.rb:816 # LogStash::Pipeline when collecting metrics in the pipeline global metric populates the different metrics

Randomized with seed 27058
@jsvd
Member Author

jsvd commented Jul 18, 2017

@jakelandis since we had a few recent changes in the metrics subsystem, can I ask you to take a look at this?

@jakelandis
Contributor

@jsvd - Taking a look.

@jakelandis
Contributor

I could not reproduce locally, and the failures appear intermittent on our build systems.

It seems that when it does fail, the failure comes from the first test that runs. I will add retry logic to these tests.

For example:
GOOD:

when collecting metrics in the pipeline
  global metric
    populates the different metrics
  pipelines
    populates the pipelines core metrics
    populates the filter metrics
    populates the output metrics
    populates the name of the output plugin
    populates the name of the filter plugin

BAD:

 when collecting metrics in the pipeline
   pipelines
     populates the name of the output plugin (FAILED - 1)
     populates the output metrics
     populates the name of the filter plugin
     populates the filter metrics
     populates the pipelines core metrics
     when dlq is enabled
       should show dlq stats
     when dlq is disabled
       should show not show any dlq stats
   global metric
     populates the different metrics

ANOTHER BAD:

when collecting metrics in the pipeline
	global metric
	  populates the different metrics (FAILED - 1)
	pipelines
	  populates the filter metrics
	  populates the name of the filter plugin
	  populates the output metrics
	  populates the pipelines core metrics
	  populates the name of the output plugin

@jsvd
Member Author

jsvd commented Jul 18, 2017

Looking at the line where the error happens, pipeline_spec.rb:805, the failure comes from the before :each block, so I don't think retrying the tests themselves will help here.
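For readers following along: the backtrace goes through rspec-wait and ends in "expected: truthy value, got: false", which is what a polling wait reports when its condition never becomes true before the timeout. Below is a minimal plain-Ruby sketch of that pattern. `wait_until` and `FakePipeline` are hypothetical names made up for illustration; they are not the rspec-wait API or Logstash code.

```ruby
# Hypothetical helper (not rspec-wait): polls the block until it returns a
# truthy value or the timeout elapses. Returns true on success, false on timeout.
def wait_until(timeout: 1.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Hypothetical stand-in for a pipeline that becomes ready on a background thread.
class FakePipeline
  def initialize(startup_delay)
    @ready = false
    @thread = Thread.new do
      sleep startup_delay
      @ready = true
    end
  end

  def ready?
    @ready
  end
end

fast = FakePipeline.new(0.1) # ready well inside the timeout
slow = FakePipeline.new(2.0) # still starting when the wait gives up

fast_ok = wait_until(timeout: 1.0) { fast.ready? }
slow_ok = wait_until(timeout: 0.3) { slow.ready? }

puts "fast pipeline ready: #{fast_ok}"
puts "slow pipeline ready: #{slow_ok}"
```

When a wait like this sits in setup code, a timeout fails the example before its body runs, whatever the body asserts.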

@jakelandis
Contributor

@jsvd - Thanks! I missed that. Explains why the tests still intermittently fail even with the retry.

@colinsurprenant
Contributor

I am reopening this: it reappeared in the 6.0 build. Was the fix backported to 6?

https://logstash-ci.elastic.co/job/elastic+logstash+6.0+multijob-unix-compatibility/os=amazon/26/console

@jakelandis
Contributor

jakelandis commented Aug 17, 2017

@colinsurprenant - It looks like a25d329 was missed for 6.0 and 6.x.

I will see if Jarvis can merge #7728 to 6.0 and 6.x

EDIT: Nope ... will manually backpatch.

@jakelandis
Contributor

@colinsurprenant - I was wrong; it is in 6.0/6.x. It appears this change went in before the 6.0 feature freeze, so there was no need to backport.

The fix was just adding more retries, so it was not much of an actual fix.

@colinsurprenant
Contributor

suggestions? ignore? more retries?

@jsvd
Member Author

jsvd commented Feb 4, 2022

This is still happening occasionally even with the Java Pipeline, preceded by an error indicating that workers failed to initialize:

19:25:14 [2022-02-03T19:25:14,146][ERROR][logstash.javapipeline ] Pipeline error {:pipeline_id=>"main", :exception=>#<RuntimeError: Some worker(s) were not correctly initialized>, :backtrace=>["/opt/logstash/logstash-core/lib/logstash/java_pipeline.rb:288:in `start_workers'", "/opt/logstash/logstash-core/lib/logstash/java_pipeline.rb:189:in `run'", "/opt/logstash/logstash-core/lib/logstash/java_pipeline.rb:141:in `block in start'"], "pipeline.sources"=>[], :thread=>"#<Thread:0x67965a80@/opt/logstash/logstash-core/spec/logstash/java_pipeline_spec.rb:1148 run>"}

19:30:14       1) LogStash::JavaPipeline when collecting metrics in the pipeline global metric populates the different metrics
19:30:14          Failure/Error: Unable to find matching line from backtrace
19:30:14          
19:30:14            expected: truthy value
19:30:14                 got: false
19:30:14 
19:30:14     Finished in 10 minutes 8 seconds (files took 11.8 seconds to load)
19:30:14     3056 examples, 1 failure, 21 pending
19:30:14 
19:30:14     Failed examples:
19:30:14 
19:30:14     rspec ./logstash-core/spec/logstash/java_pipeline_spec.rb:1167 # LogStash::JavaPipeline when collecting metrics in the pipeline global metric populates the different metrics
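If the pipeline thread has already exited with the "Some worker(s) were not correctly initialized" error, a readiness wait in the spec can presumably only time out. One possible direction is to have the wait notice that the thread is gone and report that instead. This is a plain-Ruby sketch of the idea, not a patch against the spec; `wait_for_ready` and the threads below are hypothetical.

```ruby
# Hypothetical helper: polls the block like a normal readiness wait, but stops
# early when the background thread is no longer alive.
# Returns :ready, :thread_died or :timed_out.
def wait_for_ready(thread, timeout: 1.0, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return :ready if yield
    return :thread_died unless thread.alive?
    return :timed_out if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep interval
  end
end

# Simulates a pipeline thread whose startup fails: the error is recorded
# (the real pipeline logs it) and the thread then exits.
errors = Queue.new
failed_start = Thread.new do
  begin
    raise "Some worker(s) were not correctly initialized"
  rescue RuntimeError => e
    errors << e.message
  end
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
failed_result = wait_for_ready(failed_start, timeout: 5.0) { false }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# Simulates a healthy pipeline thread: it flips the flag and keeps running.
flag = false
healthy = Thread.new do
  flag = true
  sleep 2
end
healthy_result = wait_for_ready(healthy, timeout: 1.0) { flag }

puts "failed start: #{failed_result} after #{elapsed.round(2)}s"
puts "healthy start: #{healthy_result}"
```

With a check like this, the failure message would point at the dead thread rather than at a generic "expected: truthy value".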
