Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hot shard metrics #4365
Hot shard metrics #4365
Changes from 10 commits
0045fb7
ee44c22
68766b7
bfc7002
82abfa4
f38f142
c037475
9543764
859950e
e283bfb
c43f1a1
dc5eb2a
217cb96
bf205a4
b8293dc
27ec4c5
df989eb
6aa2e7f
f960396
3c409ce
fe2955d
f412f60
3caed40
d64ff92
e91a0ee
db4db37
c256fb1
98f2b66
f3dfa3d
fc62af7
33448c8
5a94697
e179473
e0ce388
e48fcec
d3b20f1
c42a921
b6fd862
0bd90b0
0911a11
df5f722
937c0a9
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd want to use a ring buffer instead of a linked list. The code is a little more complicated (unless we find a library) but the memory usage patterns will be much nicer. How many of these do we expect to exist at once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by a ring buffer? My understanding is that a circular linked list is an example of a ring buffer. Are you looking for something that can do dynamic resizing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think David is referring to use an array to implement the ring instead of using a linked-list, that way we can scan a block of memory instead of chasing pointers.
Also, I think unless we have a good idea about the cap on
maxBufferSize
, it might be tricky to get this right to not have overflows. An alternative approach that I can think of is to have a ring buffer where each item represents a unit of time. The size of the buffer will be our window size (in unit of time); e.g., if we want the moving average to calculate the last 60 seconds, we can have a buffer of size 60 where each item in the buffer represents the average for one second with asum
andcount
and we can keep track of the aggregated average. This can be a bit tricky to implement though since it requires a bit of book-keeping when recording and calculating the average. We can leave this for later, only if this becomes a problem.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of completeness, I wrote an array-based implementation and in our bench test it preformed nearly the same (about 7ns/op slower than ring-based). If you want to see it for yourself, it's in commit
98f2b66b9b7a711c427d5f55de82bc55d135ddfc
I see what you're saying about the contiguous memory of an array being preferable, but I'm not convinced this optimization is necessary at this point.At the moment we are only using this to track averages for persistence latency and persistence error rate, so I do not expect to have a large number of them in memory at once. And for those metrics, we do not need to guarantee the perfect
maxBufferSize
to fit all observations in our time window; it just needs to be sufficiently large that we get an accurate picture of persistence health.For now I'm going to leave it as is, unless there are strong feelings to the contrary. If we need to track other kinds of averages in the future, it may be worth it to revisit this and store elements based on time as Saman suggests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right, it all depends on the pattern of how the data structure is accessed. In the benchmark for instance, since we are calling
Average
after eachRecord
which means there is only 1 element of array that might be expired after callingAverage
. So there is little benefit since there is no contagious scanning of the elements. If you change the benchmark to callAverage
every 10th times, array has 20% lower overhead. (45 ns/op vs 60 ns/op in my experiments).So depending on what we expect the call pattern to be, this might or might not add much benefits. There is also the overhead of synchronization (mutex) which can mask the benefits when this is being updated concurrently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I mean an array with head and tail indexes. Specifically something that can't do dynamic resizing, unlike the linked list.
I think this is one of those situations where microbenchmarks are misleading: if the whole thing fits in L1 and you're using it from a single thread, then chasing pointers is mostly free and everything will perform about the same. When either or both of those aren't true then the performance will start to diverge.
In general we have an io-bound service so you can say cpu optimizations don't matter, which is true up to a point. If we have just a few of these, then I agree. I was assuming we'd have one per namespace or more, where it'd be a bigger concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay I have been convinced; I swapped the linked list implementation for the array one. Seems there is no reason not to since we will most likely get better performance.