-
Notifications
You must be signed in to change notification settings - Fork 314
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add data streams telemetry device #1296
Conversation
555afc0
to
0530607
Compare
Sorry for the conflict with the black/isort pull request. I fixed the conflicts in my fork, see pquentin@518f2e7 (I squashed all your commits into a single one). |
With this commit we add a data streams telemetry device that regularly samples the count and store size of all data streams within a cluster. Closes elastic#1161
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I left some comments but I think it make sense that @gingerwizard has a look at this from a functional perspective.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for iterating! The changes look good to me but let's wait for feedback from @gingerwizard for a more user-focussed perspective.
Functionally this is fine for a first pass but due to limitations is probably not going to be used for replacing existing custom data stream collection yet. Areas for thought:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment. LGTM as first pass.
Our goal should be to reduce custom functionality as much as possible. If this PR is not there yet, we should iterate with the goal of being able to replace custom solutions. I'm not in favor of merging something that does not address this.
I don't know whether it's possible to derive it, but assuming it's possible, isn't it sufficient to have one property for the total store size (incl. replicas) and one without? I also don't understand why we would want a per index view on a data stream? As this is a sampling telemetry device, this also leads to a significant increase in the number of metrics documents which we should be very cautious about.
Can you elaborate the reasons why the mapping would change in a benchmark? Can we make the simplifying assumption that it does not change? |
We had an offline conversation about this and it makes sense to merge this as is as a first step. Based on our experience we can refine this iteratively. |
With this commit we add a data streams telemetry device that regularly
samples the count and store size of all data streams within a cluster.
Closes #1161
Tested locally with this test track (that uses my Weatherbeat data, invoked with these commands for both in-memory and external metrics stores:
Also tested with
telemetry-params.json
: