Current Behavior
Graylog has this neat feature where it can "stream" logs in real time, with a specified refresh interval:
If this is used with a query that returns a lot of results, a lot of search contexts are opened on each of the nodes being queried. If the refresh interval is very short, these contexts might not be cleared before the query is repeated, putting a lot of stress on the nodes.
We ran a query that returned 6M results every two seconds, and this was the effect on our ES cluster of 35 data nodes:
This correlates with massive CPU spikes across all of the data nodes:
Expected Behavior
The "streaming" of logs should not cause performance degradation on ES's side.
Possible Solution
Graylog should limit the number of search contexts that it opens if the previous ones were not closed.
It would be great if Graylog would allow disabling / overriding the option to "stream" logs from the config file.
Alternatively, the option could be disabled from the Search Configuration settings in the UI, or the available refresh intervals could be changed from there (this is currently possible for Surrounding Timeranges and Relative Timeranges).
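To illustrate the first suggestion, here is a minimal sketch of a refresh guard that skips a refresh tick while the previous query is still in flight. This is not Graylog code; `RefreshGuard`, `max_in_flight` and the `search` callable are hypothetical names used only for illustration.

```python
import threading


class RefreshGuard:
    """Hypothetical helper: allows at most `max_in_flight` refresh queries at once."""

    def __init__(self, max_in_flight=1):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self.skipped = 0  # number of refresh ticks dropped because a query was still running

    def run(self, search):
        # Non-blocking acquire: if an earlier refresh still holds the slot,
        # drop this tick instead of opening more search contexts.
        if not self._slots.acquire(blocking=False):
            self.skipped += 1
            return None
        try:
            return search()
        finally:
            self._slots.release()
```

With something like this in the refresh path, a 2-second interval against a query that takes longer than 2 seconds would simply skip ticks rather than pile new search contexts on top of unfinished ones.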
Steps to Reproduce
Create a "fat" index set containing at least 6M log lines and attach a stream to it.
Run a query that returns all of the 6M log lines every 2 seconds.
Observe the number of open search contexts and the resource usage on Elasticsearch.
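One way to observe the open search contexts in the last step is the Elasticsearch nodes stats API, which reports `open_contexts` per node under `indices.search`. The sketch below assumes the cluster is reachable without authentication at `http://localhost:9200`; the function names are made up for this example.

```python
import json
from urllib.request import urlopen


def open_contexts_by_node(stats):
    """Map each node name to its open search context count from a nodes stats response."""
    return {
        node.get("name", node_id): node["indices"]["search"]["open_contexts"]
        for node_id, node in stats["nodes"].items()
    }


def fetch_node_search_stats(base_url="http://localhost:9200"):
    """Fetch search stats for all nodes (base_url is an assumption, adjust as needed)."""
    with urlopen(f"{base_url}/_nodes/stats/indices/search") as resp:
        return json.load(resp)
```

Calling `open_contexts_by_node(fetch_node_search_stats())` every few seconds while the streaming query is running should show whether the counts keep climbing between refreshes.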
Context
Graylog seems to be causing damage to ES if there's a pattern of high usage and abusive queries.
There's no way to mitigate this unless we upgrade Elasticsearch, but even then we'll only be able to limit the number of open search contexts, which means ES will return errors once that limit is reached.
Your Environment
Graylog Version: 2.5.0
Elasticsearch Version: 6.3.0
MongoDB Version: 4.0.6
Operating System: server - Amazon Linux, client - macOS Mojave
Browser version: Google Chrome | 72.0.3626.121 (Official Build) (64-bit)
tanasegabriel changed the title from "RefreshState can cause performance degradation on ES data nodes" to "RefreshStore can cause performance degradation on ES data nodes" on Mar 15, 2019
Elastic acknowledged the open search contexts problem and added a soft limit for the maximum number of open search contexts. However, this was only released in Elasticsearch 6.6.0.