Onyx optimizations #88
Conversation
`isCollectionKey(mapping.key)` and `isSafeEvictionKey(mapping.key)` are not working correctly in this case, since neither `onyxKeys` nor `safeEvictionKeys` are set yet. This also helps other logic running during init, as connections wait a bit and allow the rest of the app to initialize.
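For context, a minimal sketch of the deferred-connection idea (names like `pendingConnections` and `applyConnection` are illustrative, not the actual Onyx internals):

```js
// Sketch only: queue connections made before init and flush them afterwards.
let isInitialized = false;
const pendingConnections = [];

function applyConnection(mapping) {
    // Safe to use isCollectionKey / isSafeEvictionKey here, because init has
    // run and onyxKeys / safeEvictionKeys are set by now.
}

function connect(mapping) {
    if (!isInitialized) {
        pendingConnections.push(mapping); // wait until init has finished
        return;
    }
    applyConnection(mapping);
}

function init(options) {
    // ... configure onyxKeys, safeEvictionKeys, initialKeyStates ...
    isInitialized = true;
    pendingConnections.forEach(applyConnection);
    pendingConnections.length = 0;
}
```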
This covers both `set` and `merge`
…AccessedKeys It seems `recentlyAccessedKeys` was not allowed to contain non-safe-eviction keys because of the `evictStorageAndRetry` logic, but that logic can be updated to filter out "unsafe" keys. Then `recentlyAccessedKeys` can do what it was meant to do and be a list of accessed keys sorted in access order. This allows cleaning dated keys from cache. Keys are added to the recent access list during retrieval instead of during connect, and all read requests happen through that place. Added a configurable limit of 150 keys in cache, which looks like a sane value from observations during chat browsing. Dated items with active connections aren't removed from cache, for better optimization. The behavior can be switched off by providing `maxCachedKeysCount=0`; then the cache is handled as before -> remove keys that we're no longer connected to.
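Roughly, the cleanup described above could look like this sketch (the names `recentKeys`, `activeConnections`, and `cleanCache` are illustrative, not the PR's actual code):

```js
// Sketch only: keep keys in access order and evict the oldest unconnected ones.
const maxCachedKeysCount = 150;      // 0 falls back to the previous behaviour (not shown)
const cache = new Map();             // key -> cached value
const recentKeys = [];               // accessed keys, oldest first
const activeConnections = new Set(); // keys still used by a subscriber

function markRecentlyAccessed(key) {
    const index = recentKeys.indexOf(key);
    if (index !== -1) {
        recentKeys.splice(index, 1);
    }
    recentKeys.push(key); // most recently accessed goes last
}

function cleanCache() {
    let excess = recentKeys.length - maxCachedKeysCount;
    if (maxCachedKeysCount === 0 || excess <= 0) {
        return;
    }
    for (let i = 0; i < recentKeys.length && excess > 0; i++) {
        const key = recentKeys[i];
        if (activeConnections.has(key)) {
            continue; // dated but still connected items stay cached
        }
        cache.delete(key);
        recentKeys.splice(i, 1);
        i -= 1;
        excess -= 1;
    }
}
```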
The reading times do improve, but the overall performance gets worse. It seems too much data is batched at once; when Onyx sends it to the subscribers there's a brief period where the app is unresponsive.
Instead of resolving ahead of time with `Promise.resolve()`, capture the write promise. This way the promise resolves once - when the queue for this key finishes. The main benefit here is that the true timing of the merge is captured.
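A rough illustration of capturing the write promise rather than returning `Promise.resolve()` up front (`mergeQueue` and `persistChanges` are stand-in names, not the actual implementation):

```js
// Sketch only: the returned promise settles when the key's queue is flushed.
const mergeQueue = {}; // key -> {changes, promise}

function persistChanges(key, batchedChanges) {
    // Stand-in for the real storage write of the batched changes.
    return Promise.resolve();
}

function merge(key, changes) {
    if (mergeQueue[key]) {
        // A write is already queued for this key: batch the change and reuse
        // the in-flight promise so callers see the true completion time.
        mergeQueue[key].changes.push(changes);
        return mergeQueue[key].promise;
    }

    const entry = {changes: [changes]};
    entry.promise = Promise.resolve()
        .then(() => persistChanges(key, entry.changes))
        .finally(() => delete mergeQueue[key]);
    mergeQueue[key] = entry;
    return entry.promise;
}
```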
The variable `defaultKeyStates` is already preserved in memory for the life of the application; there's no need to keep it in hard storage as well. This not only saves us a write - instead we do a multiGet and merge any data from storage into our defaults. This way we've prefilled the cache with some storage data that is immediately needed.
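Something along these lines, assuming a `storage.multiGet` that resolves to `[key, value]` pairs of JSON strings (as AsyncStorage does); the key names and merge details are illustrative, not the exact PR code:

```js
// Sketch only: read stored data for the default keys and merge it over the
// in-memory defaults, prefilling the cache without writing anything to disk.
const defaultKeyStates = {
    session: {},
    network: {isOffline: false},
};
const cache = new Map();

function initializeDefaults(storage) {
    const defaultKeys = Object.keys(defaultKeyStates);

    // One multiGet instead of one write per default key.
    return storage.multiGet(defaultKeys).then((pairs) => {
        pairs.forEach(([key, storedJson]) => {
            const storedValue = storedJson ? JSON.parse(storedJson) : {};
            cache.set(key, {...defaultKeyStates[key], ...storedValue});
        });
    });
}
```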
There's not much to be seen in a visual comparison. I expected to see the LHN much faster, but it seems Onyx is not the main bottleneck ATM.
SBS Before/After, Android physical device: Before_After_SbS.mp4
Overall I can feel the app is a bit smoother. cc @quinthar
Here's the E.cash PR where you can build and try out the changes: Expensify/App#4040
```diff
- if (isCollectionKey(key) || !isSafeEvictionKey(key)) {
+ if (isCollectionKey(key)) {
```
This was updated to allow keeping more items in the access list.
To preserve the same functionality the check was moved here: Line 481
```diff
- const keyForRemoval = _.find(recentlyAccessedKeys, key => !evictionBlocklist[key]);
+ const keyForRemoval = _.find(
+     recentlyAccessedKeys,
+     key => !evictionBlocklist[key] && isSafeEvictionKey(key),
```
This check, `isSafeEvictionKey`, was moved here so that `recentlyAccessedKeys` can actually hold all recently accessed keys.
Hey @kidroca, Thanks for the changes + metrics! Super exciting stuff! Before diving into any reviews I'd like to focus on one change at a time to make them easier to evaluate. It seems like there's a mix of refactoring, cleanup, and performance stuff going on - which is great to see, but makes it difficult to get a sense for the impact of each individual change + also harder still to test and track any regressions if they pop up. Let's create PRs for each of these individual changes, review each one in isolation, and discuss anything that requires further explanation in that PR. It would also be good to prioritize the PRs based on which ones have the largest impacts on performance so we can get those out sooner than later. Specific Notes / Questions:
Sounds like these two could maybe be moved into a separate PR that intends to clean up the timing stuff. Also sounds like this would be a lower priority since there is no direct impact on performance.
Can you elaborate on breathing space? Is that another way of saying there was a performance improvement associated with this?
Slightly confused about this. If we have a value in the cache why would we assume that we don't need to make a write? To clarify, are you proposing we remove the ability to call
How is this related to performance? Seems like maybe it will help with memory consumption? I think this would be a good change to evaluate separate from the other changes so we can look at impacts on memory. But let me know if there's some other benefit here.
This would also be good to see as a separate PR. I'm not sure what the default behavior of this is now, but surprised that it is taking up a significant amount of time since, as you mentioned, there are only 4 keys. |
These are things that would help anyone making benchmarks and using the
Yes, this carries performance improvements
Quite a few times we're trying to "overwrite" something that didn't change; the most notable one is what we do in Lines 515 to 517 in 86be759
Even if this happens via other means like `mergeCollection` - if cache has a value, it's always the most recent, up-to-date value. So when we try to write a value, but we have a cached key and the exact same value in cache, we can skip the write: c5a4ba6
I've tested with a log in the condition here, and very often we try to write the same value for `modal`, `network`, `iou`, `session`
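As a sketch of that condition (using underscore's `_.isEqual` as a stand-in for whatever comparison the commit actually uses, and `storage` as an abstract storage layer):

```js
import _ from 'underscore';

const cache = new Map();

// Sketch only: cache mirrors what's on disk, so an identical value means
// nothing changed and the write can be skipped.
function set(key, value, storage) {
    if (cache.has(key) && _.isEqual(cache.get(key), value)) {
        return Promise.resolve();
    }
    cache.set(key, value);
    return storage.setItem(key, JSON.stringify(value));
}
```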
This helps on a few fronts
I've disabled clearing cache entirely and it improved speeds; then I came up with this LRU update, which gives pretty much the same performance while still clearing cache when it can.
Yes, I was surprised too, but it was a repeating pattern from the numerous benches that I did (Android) - these 4 keys would start fetching at 1.5sec after launch and resolve at the 3rd or 4th second after init - ~2sec to retrieve them.
I'm ready to start opening PRs, just need to know whether I should start with the
Good point, let's start with that then!
Ok, I'm not too sure I get what we want to do here yet, but it sounds like it should be its own PR and we can evaluate the impact on app start time?
Nice. I agree it doesn't make sense to write data that hasn't changed. The part I was missing from this proposal is how we should tell the difference between the data that has changed and the data that has not. Looking at the commit you shared, it seems like we'd use
This makes sense to me, but it seems like there might be some cases where we know this data has changed and are going to perform an additional check on it now? e.g. merging some large JSON blob would now require that we perform this equality check on it in addition to saving it to storage.
Ok so sounds like we can maybe start with that first optimization and then add this one in after?
Nice. It sounds like maybe this improvement can be combined with the deferred connection logic since they are both focused on app start time? Let me know if you are good with this game plan...
PR1 - Add the new metrics improvements so that we can use them in benchmarking the rest of the changes
Yes that works
The check happens really fast - we're dealing with objects available in memory, and we're not computing the full diff between the objects; the check is over as soon as the first difference is found.
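For illustration, an early-exit comparison looks roughly like this (not the exact helper used in the change):

```js
// Sketch only: bail out on the first difference instead of computing a full diff.
function isIdentical(a, b) {
    if (a === b) {
        return true;
    }
    if (typeof a !== 'object' || a === null || typeof b !== 'object' || b === null) {
        return false;
    }
    const aKeys = Object.keys(a);
    if (aKeys.length !== Object.keys(b).length) {
        return false;
    }
    // `every` stops iterating as soon as one pair of values differs.
    return aKeys.every(key => isIdentical(a[key], b[key]));
}
```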
Sounds great, thanks
cc @marcaaron @tgolen
Details
This PR contains a few optimizations and refactors:

- Added an `Onyx.printMetrics` function. You can use it as a base to compare before/after results. It prints a general table on the console which is easy to copy and put in excel.
- Connections can be made before `Onyx.init` is called. This should not be allowed, as the `initialKeyStates` are applied during init. Added a deferred task that queues incoming connections and only makes them after init; this also gives the rest of the app init some breathing space.
- Updated `set` to check with cache and skip writing to storage. Cache reflects what's already on disc; if we have the exact value in cache it means it didn't change and we don't need to make a write.
- Added a `cleanCache` function. The function removes dated items from cache; if an item is still being used by a component it's kept in cache. By default it will try to keep up to ~150 keys in cache, which can grow a bit more due to active connections to old items.
- `merge` was refactored so that its promises are resolved when the queue for a key is done. This helps with measuring the function more accurately.
- Updated `defaultKeyStates` handling. Merging these keys one by one was taking a significant amount of time, even though they are only 4. Instead of writing them to storage, we read data from storage and merge it along with the defaults into the cache. Then we never allow the `defaultKeys` to be removed from cache.

Related Issues
Expensify/App#2667
Automated Tests
Added and updated tests related to changes in cache handling.
The rest of the changes are covered by the existing tests
Linked PRs
TBD
Benchmark results
There are overall improvements that can be seen in the stats here
https://docs.google.com/spreadsheets/d/1kHRvFL1ITbr4p9TzT_nsmtvCY4KoPwKCXp84FTCsGzk/edit?usp=sharing
Android
iOS