
[filebeat] Wrong "gc_overhead" parsed by Elasticsearch module #9513

Closed
tsouza opened this issue Dec 12, 2018 · 5 comments


tsouza commented Dec 12, 2018

Version: 6.5.3

The Elasticsearch Filebeat module will parse the following GC log message:

[2018-12-07T16:10:41,612][WARN ][o.e.m.j.JvmGcMonitorService] [node1] [gc][238436] overhead, spent [650ms] collecting in the last [1s]

Into a document that has:

{
  "elasticsearch": {
    "server": {
      "component": "o.e.m.j.JvmGcMonitorService",
      "gc_overhead": "238436"
    },
    "node": {
      "name": "node1"
    }
  }
}

There are a couple of issues here:

  1. The value of the elasticsearch.server.gc_overhead field comes from [gc][238436] overhead. According to https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/monitor/jvm/JvmGcMonitorService.java#L309, the number 238436 is just a sequential counter that carries no meaning for GC timings. It is not the GC overhead.
  2. The module does not collect the two important time components present in this log message: spent [650ms] collecting in the last [1s]
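To illustrate what a correct parse of this line would look like, here is a minimal Python sketch. The regex and group names are assumptions for illustration only; they mirror the shape of the log line above, not the module's actual grok pattern.

```python
import re

# Hypothetical regex mirroring the GC log line quoted above.
# Note: the number in [gc][238436] is a sequence counter, NOT the GC overhead.
GC_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\[(?P<level>\s*\w+\s*)\]"
    r"\[(?P<component>[^\]]+)\]\s*\[(?P<node>[^\]]+)\]\s*"
    r"\[gc\]\[(?P<gc_seq>\d+)\] overhead, "
    r"spent \[(?P<collection_duration>[^\]]+)\] "
    r"collecting in the last \[(?P<observation_duration>[^\]]+)\]"
)

line = ("[2018-12-07T16:10:41,612][WARN ][o.e.m.j.JvmGcMonitorService] [node1] "
        "[gc][238436] overhead, spent [650ms] collecting in the last [1s]")

m = GC_LINE.match(line)
print(m.group("gc_seq"))                # 238436 (sequence number, not overhead)
print(m.group("collection_duration"))   # 650ms
print(m.group("observation_duration"))  # 1s
```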
@BobBlank12 commented

This was created in filebeat version 6.4.3

@ph ph added the Team:Integrations Label for the Integrations team label Dec 12, 2018
@elasticmachine (Collaborator) commented

Pinging @elastic/infrastructure

@alvarolobato alvarolobato added Feature:Stack Monitoring and removed Team:Integrations Label for the Integrations team labels Dec 17, 2018
@elasticmachine (Collaborator) commented

Pinging @elastic/stack-monitoring

@ycombinator ycombinator self-assigned this Dec 17, 2018
@ycombinator (Contributor) commented

@tsouza Thanks for reporting this. I'm working on a fix now.

When looking at the parsed document for spent [650ms] collecting in the last [1s], what field names would make most sense to you for the 650ms and 1s values?

@ycombinator (Contributor) commented

@tsouza Regarding my question in the previous comment, let's continue the conversation on the PR (#9603) where I made up some strawman field names.

ycombinator added a commit that referenced this issue Dec 27, 2018
Resolves #9513.

This PR:
* removes the incorrectly-parsed `gc_overhead` field. Turns out what we were parsing was actually an insignificant sequential number, not GC overhead,
* parses out a new `gc.collection_duration` field, e.g. `1.2s`, which is the time spent performing GC, and
* parses out a new `gc.observation_duration` field, e.g. `1.8s`, which is the overall time over which GC collection was observed

It also splits up the long grok expression in the ingest pipeline into smaller patterns and references those patterns, hopefully for easier readability.
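Splitting a long grok expression can be done with the grok processor's `pattern_definitions` option. The sketch below is a hypothetical fragment only; the pattern name `ES_GC_OVERHEAD` and the field paths are assumptions, and the actual pipeline in #9603 may differ.

```json
{
  "grok": {
    "field": "message",
    "pattern_definitions": {
      "ES_GC_OVERHEAD": "\\[gc\\]\\[%{NUMBER}\\] overhead, spent \\[%{DATA:elasticsearch.server.gc.collection_duration}\\] collecting in the last \\[%{DATA:elasticsearch.server.gc.observation_duration}\\]"
    },
    "patterns": [
      "%{ES_GC_OVERHEAD}"
    ]
  }
}
```

Note that the sequence number in `[gc][238436]` is matched but deliberately not captured into a field, since it holds no meaning for GC timings.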