Add data streams telemetry device #1296

b-deam · 2021-06-30T01:40:05Z

With this commit we add a data streams telemetry device that regularly
samples the count and store size of all data streams within a cluster.

Closes #1161
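
For readers unfamiliar with what gets sampled, here is a minimal sketch of turning one data stream stats response into metrics documents. The response field names follow Elasticsearch's data stream stats API (`GET /_data_stream/_stats`). The function name, the `name` key, and the `_all` marker are hypothetical and only illustrate the idea; this is not necessarily how the device in this PR lays out its documents.

```python
from typing import Any, Dict, List


def data_stream_stats_to_docs(stats: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten one stats response into one cluster-wide doc plus one doc per data stream."""
    docs = [
        {
            "name": "data-stream-stats",  # hypothetical metric name
            "data_stream": "_all",  # hypothetical marker for cluster-wide totals
            "data_stream_count": stats["data_stream_count"],
            "backing_indices": stats["backing_indices"],
            "total_store_size_bytes": stats["total_store_size_bytes"],
        }
    ]
    for ds in stats.get("data_streams", []):
        docs.append(
            {
                "name": "data-stream-stats",
                "data_stream": ds["data_stream"],
                "backing_indices": ds["backing_indices"],
                "store_size_bytes": ds["store_size_bytes"],
            }
        )
    return docs


# Made-up sample values, shaped like a data stream stats response.
sample_response = {
    "data_stream_count": 2,
    "backing_indices": 5,
    "total_store_size_bytes": 7268,
    "data_streams": [
        {"data_stream": "my-data-stream", "backing_indices": 3, "store_size_bytes": 3772},
        {"data_stream": "my-data-stream-two", "backing_indices": 2, "store_size_bytes": 3496},
    ],
}

docs = data_stream_stats_to_docs(sample_response)
```

A sampling device would call something like this once per sample interval and hand the resulting documents to the metrics store.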

Tested locally with this test track (which uses my Weatherbeat data), invoked with these commands for both in-memory and external metrics stores:

# This should fail due to major version < 7
esrally race --distribution-version=6.3.0 --car="4gheap,x-pack-security" --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes 

# This should fail due to minor version < 7.9.0
esrally race --distribution-version=7.8.0 --car="4gheap,x-pack-security" --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

# This should fail due to OSS distribution 
esrally race --distribution-version=7.9.0 --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

esrally race --distribution-version=7.9.0 --car="4gheap,x-pack-security" --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

esrally race --distribution-version=7.10.0 --car="4gheap,x-pack-security" --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

esrally race --distribution-version=7.11.0 --car="4gheap,x-pack-security" --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

# Version 7.11+ no longer needs x-pack explicitly defined due to no OSS build distribution
esrally race --distribution-version=7.11.0 --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

esrally race --distribution-version=7.12.0 --track=rally/tracks/weatherbeat --track-params=rally/tracks/weatherbeat/params.json --client-options="timeout:180" --telemetry data-stream-stats --telemetry-params="data-stream-stats-sample-interval: 1" --kill-running-processes

# Tested with params file
esrally race --distribution-version=7.11.0 --car="4gheap,x-pack-security" --track=/Users/bradleydeam/perf/onboarding/rally/tracks/weatherbeat --track-params=/Users/bradleydeam/perf/onboarding/rally/tracks/weatherbeat/params.json --client-options="timeout:180,use_ssl:true,verify_certs:false,basic_auth_user:'rally',basic_auth_password:'rally-password'" --telemetry data-stream-stats --telemetry-params=/Users/bradleydeam/perf/onboarding/rally/tracks/weatherbeat/telemetry-params.json --kill-running-processes

Also tested with telemetry-params.json:

{
    "data-stream-stats-sample-interval": 5
}
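
The version and flavor constraints exercised by the commands above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function names and messages are hypothetical, and the flavor strings assume the `build_flavor` values (`default`, `oss`) that Elasticsearch reports in its info response.

```python
from typing import Optional, Tuple

# Data streams were introduced in Elasticsearch 7.9.0.
MIN_VERSION = (7, 9, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'major.minor.patch', ignoring a suffix such as '-SNAPSHOT'."""
    core = version.split("-", 1)[0]
    major, minor, patch = (int(part) for part in core.split("."))
    return major, minor, patch


def unsupported_reason(version: str, build_flavor: str) -> Optional[str]:
    """Return why the device cannot be used, or None if it is supported."""
    if parse_version(version) < MIN_VERSION:
        return f"data streams require Elasticsearch 7.9.0 or later, got {version}"
    if build_flavor == "oss":
        return "data streams are not available in the OSS distribution"
    return None
```

Tuple comparison covers both the major and the minor version check in one step, so 6.3.0 and 7.8.0 are rejected by the same branch.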

pquentin · 2021-06-30T13:17:27Z

Sorry for the conflict with the black/isort pull request. I fixed the conflicts in my fork, see pquentin@518f2e7 (I squashed all your commits into a single one).

With this commit we add a data streams telemetry device that regularly samples the count and store size of all data streams within a cluster. Closes elastic#1161

danielmitterdorfer

Thanks for this. I left some comments but I think it makes sense that @gingerwizard has a look at this from a functional perspective.

esrally/telemetry.py

danielmitterdorfer

Thanks for iterating! The changes look good to me but let's wait for feedback from @gingerwizard for a more user-focussed perspective.

gingerwizard · 2021-07-05T16:20:15Z

Functionally this is fine for a first pass but due to limitations is probably not going to be used for replacing existing custom data stream collection yet.

Areas for thought:

1. I assume store_size_bytes includes replicas? If so, I'd like to understand the primary store cost if possible. I appreciate that the mapping could change across indices over the lifetime of the data stream, which complicates this and leads to (2). As a first pass, maybe report a replica and a primary count? The ratio can in turn be used to calculate the primary size.
2. You note the number of indices but not the size of each. I wonder whether a doc per index of the data stream would make more sense. This would make visualizing a little more challenging unless you add a common key that denotes the collection per unit time per data stream/index for aggregating.
3. The only other statistic that would be useful is the min and max of the date within the data stream (and each index). This is low priority, however.
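
The ratio idea in the first point can be sketched as simple arithmetic. This assumes that store_size_bytes does include replicas (which is only assumed above) and that every shard copy is roughly the same size; the function name and parameters are hypothetical.

```python
def estimate_primary_store_size(
    total_store_size_bytes: int, primary_shards: int, replica_shards: int
) -> float:
    """Estimate the primary-only store size from the total and the shard counts."""
    if primary_shards <= 0 or replica_shards < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    # Scale the total by the fraction of shard copies that are primaries.
    return total_store_size_bytes * primary_shards / (primary_shards + replica_shards)


# Made-up numbers: 3 primaries, each with one replica copy, 6000 bytes in total.
estimate = estimate_primary_store_size(6000, primary_shards=3, replica_shards=3)
```

With no replicas the estimate equals the total, so the same formula covers single-copy benchmark setups.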

gingerwizard

see comment. LGTM as first pass.

danielmitterdorfer · 2021-07-06T07:18:08Z

> Functionally this is fine for a first pass but due to limitations is probably not going to be used for replacing existing custom data stream collection yet.

Our goal should be to reduce custom functionality as much as possible. If this PR is not there yet, we should iterate with the goal of being able to replace custom solutions. I'm not in favor of merging something that does not address this.

> The store_size_bytes includes replicas I assume? If so, id like to understand the primary store cost if possible.

I don't know whether it's possible to derive it, but assuming it is, isn't it sufficient to have one property for the total store size (incl. replicas) and one without? I also don't understand why we would want a per-index view of a data stream. As this is a sampling telemetry device, that would also significantly increase the number of metrics documents, which we should be very cautious about.

> I appreciate that the mapping could potentially change across indices, for the lifetime of the datastream

Can you elaborate on the reasons why the mapping would change in a benchmark? Can we make the simplifying assumption that it does not change?

danielmitterdorfer · 2021-07-13T13:57:12Z

We had an offline conversation about this and it makes sense to merge this as is as a first step. Based on our experience we can refine this iteratively.

b-deam self-assigned this Jun 30, 2021

b-deam added the :Telemetry (Telemetry Devices that gather additional metrics) and enhancement (Improves the status quo) labels Jun 30, 2021

b-deam requested a review from danielmitterdorfer June 30, 2021 01:40

b-deam added this to the 2.3.0 milestone Jun 30, 2021

b-deam force-pushed the datastreams-stats branch from 555afc0 to 0530607 Compare June 30, 2021 03:36

b-deam added 6 commits July 1, 2021 10:19

Add data streams telemetry device

ea8292a

With this commit we add a data streams telemetry device that regularly samples the count and store size of all data streams within a cluster. Closes elastic#1161

Add test and check for distribution build flavor

5a35533

Fix docs formatting

e725d02

Add extra field to data stream stats metrics doc

99068d5

Fix unit tests

862ecfc

Fix linting

2cbc6dd

b-deam force-pushed the datastreams-stats branch from 0530607 to 2cbc6dd Compare July 1, 2021 01:03

danielmitterdorfer reviewed Jul 1, 2021

danielmitterdorfer requested a review from gingerwizard July 1, 2021 11:43

b-deam added 3 commits July 5, 2021 09:32

Merge remote-tracking branch 'upstream/master' into datastreams-stats

1092d4a

Address Daniel's comments

9e02ca5

Refactor and fix linting

3fa9b30

danielmitterdorfer approved these changes Jul 5, 2021

gingerwizard reviewed Jul 5, 2021

b-deam merged commit 9b6eb28 into elastic:master Jul 13, 2021

michaelbaamonde mentioned this pull request Oct 12, 2021

Ensure that data streams are fully supported #1359

Open
