Improvements for Onyx connect #63
Awesome summary and issue! Thanks for this! Heads up, I think we are trying to avoid a memory proxy for as long as we can. There have been concerns in the past about difficulty in keeping things in sync. But I have thought about this before. I'd be curious to see some basic benchmarks from before and after applying a
I'd bet on it.

> Let's imagine for a moment that all of the Onyx data is in memory and can be accessed instantly

@marcaaron and I both had the chance to see the report ID arriving much faster when it was provided by react-navigation; the prop from Onyx would only arrive in one of the later component updates. Let me simplify what was going on before the updates to the RecordsScreen:

So initially I thought that
You are keeping things in sync anyway, e.g. Remote Data -> App State -> Rendered content, with the Local Data Layer syncing ⏬/⏫ underneath. It doesn't matter whether the Local Data Layer is in memory or not; it still needs to be kept in sync.
Regarding benchmarks, I could do the following: I can open a branch (an Onyx fork) and apply some metric logging/collection without changing anything else. Then we can use the current expensify.cash app to benchmark things by pointing it at the forked package of Onyx.
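A sketch of what that metric collection could look like, assuming we can wrap whatever storage provider Onyx calls into. `withTiming` and the `metrics` array are made-up names, and `storage` below is a stub standing in for AsyncStorage so the snippet runs anywhere:

```javascript
// Hypothetical metric collection: wrap a storage call so every invocation
// records its name and duration without changing behavior.
const metrics = [];

function withTiming(name, fn) {
    return (...args) => {
        const start = Date.now();
        return Promise.resolve(fn(...args)).then((result) => {
            metrics.push({name, duration: Date.now() - start});
            return result;
        });
    };
}

// Stub standing in for AsyncStorage: resolves after a simulated disk delay
const storage = {
    getItem: key => new Promise(resolve => setTimeout(() => resolve(`value-of-${key}`), 10)),
};
storage.getItem = withTiming('getItem', storage.getItem);

storage.getItem('session').then((value) => {
    console.log(value);           // value-of-session
    console.log(metrics[0].name); // getItem
});
```

Pointing the app at such a fork would let us collect before/after numbers without touching the app code itself.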
Hi! Coming in here to weigh in on this issue. My biggest concern is that there is no defined problem. The only problem that looks to be defined here is:
This is very ambiguous, and it means there are lots of solutions and ideas being discussed which have no clear value; it's completely unknown whether they solve this problem. I think all the improvements make sense theoretically, but we should only make changes when there is a clear problem.
Maybe this is the real problem? Let's find out if it is. I like the idea of doing some benchmarking on different platforms to see what the problem really is before skipping ahead to discussing possible solutions.
Ok, I think I see what's going on here. I pinpointed this 0.5-1s delay when I was working on Expensify/App#2221 (comment). See what a huge delay there was when you switched to a different conversation:

Android.Report.Switch.mp4

Luckily we can get the report ID from the navigation, and this happens instantly between steps (3-4) above. And this is how the issue was solved. At that time I added a timestamp on each prop change; the change coming from Onyx would arrive about 0.5-1s after the change coming from the navigation route props. This was way out of scope, so I just posted a comment about it. I've also brought it up in other discussions: https://expensify.slack.com/archives/C01GTK53T8Q/p1617734370399900 I can't track it down ATM, but it was mentioned there were startup issues on Android where you'd see a blank screen for a long time and then suddenly the App would pick up. I think this is related, but I haven't tested it, as it would have taken me a lot of time. The App seems to start a lot slower on Android, and tracing the Onyx.connect code was quicker. Initially I thought there was a delay due to reading data - there certainly is - but the above issue was actually a result of waiting to write data and then informing listeners of the change. And then it turns out
Keeping a dictionary (a store) in memory would definitely make updates faster, but I don't think it would be the responsibility of AsyncStorage. There are many cases where this would be unnecessary overhead:
This means it probably won't be built into AsyncStorage; it might be a wrapper that they or someone else provides. Fixing the Android performance inside AsyncStorage to match iOS might amount to a rewrite, or it might simply come down to device specs. Improving how Onyx works seems easier.
FYI, added some discussion here -- these feel like sort of the same discussion, so perhaps we should merge these two issues?
In particular I want to cross-post this comment:
I've addressed the benchmark reasons in the other thread. I'll add some details on the performance side here.

Let's imagine a scenario like retrieving session information (auth token, user email, etc.). When caching is at the native level, and particularly the SQLite level:
All these layers would cost extra memory and processing. It would not be worth it as a caching solution for a key/value pair that takes less than a kilobyte to store in JS. When caching is at the JS level, and particularly in Onyx:
So adding the cache as early as possible has the most benefits - it again saves executing unnecessary operations, while the data takes the same amount of space. You could also compress the string, with up to a 5:1 compression ratio, though when you render it on screen it would take the full size plus the zipped size it originally came from.
To clarify, AsyncStorage already uses SQLite on Android, but its implementation is very inefficient. I'm just suggesting we contribute an optimization to AsyncStorage on Android to:
It wouldn't involve "bundling" SQLite, just changing how AsyncStorage currently uses it. It wouldn't change the external interface of AsyncStorage; it would just hopefully make it faster and more efficient for everyone. Does that make sense? Can you see a reason why we wouldn't do this optimization to AsyncStorage, whether or not we add a cache inside of Onyx?
I know SQLite is used internally and won't be bundled. My points are regarding caching (memory mapping) and still stand:
None of the storage solutions, on either iOS or Android, implement any kind of cache themselves, and this might be related to point 3. Regarding "Use prepared statements" and "Implement multiGet() with a single query": this makes sense to me and is probably worth pursuing, but I do think Onyx would benefit much more from its own caching - so much so that these would be negligible. Regarding the point about many small queries, this one is official, coming from sqlite.org: "Many Small Queries Are Efficient In SQLite": https://sqlite.org/np1queryprob.html
Now this might not be entirely true on a mobile device, but it can be researched further. You'd think that I work for AsyncStorage, but I don't 😉
To clarify, I'm not suggesting adding overhead to AsyncStorage; I'm suggesting we optimize it to be faster for all users. I don't see a single downside for any use case of using prepared statements; it's the "right way" to use SQLite for this kind of thing, and AsyncStorage just isn't doing it right. Nobody benefits from AsyncStorage being slow, even if I agree that React Native is just not widely used and those who do use it aren't performance sensitive.
To clarify, this isn't adding a RAM cache -- it's just enabling AsyncStorage to use the existing file cache as a RAM cache. Without memory mapping, SQLite is forced to make a bunch more copies of data; enabling memory mapping just reduces the number of copies required. It doesn't increase the amount of memory used, it reduces it. I'm not aware of a single disadvantage to anybody from using memory mapping. If it were already written this way, I can't think of any scenario in which you'd actively choose to disable the memory map. Again, nobody actively benefits from AsyncStorage being slow. That said, the way you enable the memory map is by executing an SQL
I had a chance to dig a bit more into the iOS side of things. We've focused a lot on Android, but it turns out iOS is faster simply because it already uses an in-memory cache. The iOS storage implementation is a lot simpler than Android's: everything is stored in a text file and read as a string as needed. To improve performance, a 2MB cache is used in `RCTGetCache`:

```objc
static NSCache *RCTGetCache()
{
  // We want all instances to share the same cache since they will be reading/writing the same
  // files.
  static NSCache *cache;
  static dispatch_once_t onceToken;
  dispatch_once(&onceToken, ^{
    cache = [NSCache new];
    cache.totalCostLimit = 2 * 1024 * 1024; // 2MB
  });
  return cache;
}
```

Then, when a key is read in `_getValueForKey`, if there's a value in the cache it's returned directly:

```objc
- (NSString *)_getValueForKey:(NSString *)key errorOut:(NSDictionary **)errorOut
{
  NSString *value =
      _manifest[key]; // nil means missing, null means there may be a data file, else: NSString
  if (value == (id)kCFNull) {
    // read from cache first
    value = [RCTGetCache() objectForKey:key];
    if (!value) { // No cached value -> read from file
      NSString *filePath = [self _filePathForKey:key];
      value = RCTReadFile(filePath, key, errorOut);
    }
  }
  return value;
}
```

Otherwise data is read from file via `RCTReadFile`.

Basically iOS is doing what I suggested for Onyx, with the difference that when caching is done in the Onyx layer, values won't be cached as strings, and there would be no overhead of constantly parsing strings to objects - which costs more execution time and more memory: once for the raw string value and once for the string parsed to an object. Also, there's a fixed limit of 2MB; though it should be plenty most of the time, there might be a slowdown if Onyx starts using more than 2MB of memory.

The same solution can be implemented for Android, if you're interested in addressing it at that level. As I've already said, the fastest approach (for react-native) would be to handle this at the JS level and prevent any expensive operations right from the start - crossing the bridge, deeper native operations, stringification... I'm surprised the AsyncStorage maintainers haven't thought of that themselves, or maybe they did and have a reason not to implement the cache at the JS level, where it can be reused by both iOS and Android. Would you like to open a ticket on that matter on their repo?
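The string-cache vs. object-cache point can be illustrated in plain JavaScript. This is a sketch; the key name and payload are made up for the illustration:

```javascript
// Caching the raw string (what the iOS NSCache layer does) vs. caching the
// parsed object (what a JS-level cache in Onyx could do).
const stored = JSON.stringify({reportID: 42, participants: ['a@example.com', 'b@example.com']});

// String cache: every read pays for JSON.parse and allocates a fresh object
const stringCache = new Map([['report_42', stored]]);
const readA = JSON.parse(stringCache.get('report_42'));
const readB = JSON.parse(stringCache.get('report_42'));
console.log(readA === readB); // false - two parses, two copies in memory

// Object cache: parse once, every read returns the same reference
const objectCache = new Map([['report_42', JSON.parse(stored)]]);
const readC = objectCache.get('report_42');
const readD = objectCache.get('report_42');
console.log(readC === readD); // true - no re-parsing, one shared copy
```

This is why caching above the native bridge avoids both the repeated parse time and the duplicated string-plus-object memory described above.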
Wow, very interesting find, thanks! That's pretty compelling. Some questions:
I agree with 1), though it might be more cost effective to just address this in JS:

- covers both iOS and Android and any future native platforms
- does not have to cross the native bridge for cached usages

This could still be a part of AsyncStorage, and it would be nice if it could be configurable - use / don't use caching. Otherwise, if the cache is at the native level, all those separate get-key calls would still flood the native bridge.

Regarding 2), it's primarily due to i. "is often called with the same key simultaneously many times", but there are other reasons too:

- Making many key retrievals at once (mostly during init, but not solely there) instead of get all, get from cache, or batching
- Always having to parse retrieved data - even when we just did - costs more CPU time and memory. A cache in Onyx means you have a direct referential link to an object instead of a serialized string; a cache in AsyncStorage means you're still dealing with strings first
- Onyx code becomes simpler - for example, with merge we don't have to wait to retrieve the value, then merge data into it, then save it again. Every write operation updates app state instantly and then a transaction is just queued to update storage. Since writes are queued, you can save even more native bridge crossings, or at least optimize so the crossing happens during idle time - but you cannot have a write queue without a cache in Onyx
To me, this sounds like two completely separate issues. I would love for us to contribute to AsyncStorage to get near-identical performance across all platforms (after all, it's SUPPOSED to be cross-platform). I also don't think the caching should be configurable (because it's not configurable now with iOS and it seems to be fine). We can still make optimizations in Onyx if we want to (when it becomes a problem).

While I agree that it is more cost-effective in the short term to just address this by updating Onyx to overcome deficiencies in AsyncStorage, I don't think that is a good enough rationale to skip putting a more long-term and proper fix in place. Also, the cost isn't an issue for us, as we will pay for the updates to AsyncStorage.
I thought it was clear that this can still be addressed at the JS level - not in Onyx but in AsyncStorage:

- covers both iOS and Android and any future native platforms
- does not have to cross the native bridge for cached usages

This could still be a part of AsyncStorage.

> I also don't think the caching should be configurable (because it's not configurable now with iOS and it seems to be fine)

My reasoning is that people who have their own memory cache, like with redux and Apollo, should be able to opt out of this:

- In slack the suggested cache size is 10MB. I want to be able to decide whether I need this or not
Ah, OK. Thanks for clarifying those points.
@kidroca given this:
Don't we already have an in-memory cache of this data, then, in the form of the DOM? It would seem that if we want to avoid re-reading a value that is already connected, we should simply detect when someone tries to connect to a key that is already connected, and then copy the state object that we already have in memory. We already have an in-memory cache in Onyx, and it's called the DOM. Can't we just reuse that?
No, you just have it in memory.
I guess by DOM you mean the React virtual DOM. That's private, and even if you find a way to access it, it's subject to change without notice. So the next best thing is to store a reference to the stuff you put in that VDOM. I think that's what you mean by
And how do you suggest we manage which keys are already connected? Right now you don't have a single state object in memory, but each individual connection/component with its own private state. You can try to find one of the connections for the key you're interested in and retrieve the value, but that's messy, so you'll need to implement a place that will be the single source of truth. Remember the Dictionary from above? What if it stored the most recent value for connected keys -> now you have a cache.
Do you mean "copy" as in creating a duplicate each time a key is retrieved, instead of reusing through a referential value?
I would agree with that if it were only about making Onyx 5% or 10% faster, but what I'm talking about are improvements in both simplicity and efficiency that would make Onyx 100 times faster, no matter how the underlying storage layer works.
An in-memory cache is simple to implement (and still takes the same amount of memory you already use, as pointed out by @quinthar). You can still search for a way around it, but in the meantime see how simple it would be to update Onyx:

```js
const cache = new Map();
const keyRetrievalTasks = new Map();
const writeQueue = [];

/**
 * Returns a Promise that resolves with the requested key value
 */
function get(key) {
    // When we have a cached value, quickly resolve with that
    if (cache.has(key)) return Promise.resolve(cache.get(key));

    // When we're already retrieving the data from disk, hook into the same call
    if (keyRetrievalTasks.has(key)) return keyRetrievalTasks.get(key);

    // Capture a promise so others seeking the same key can get it at the same time
    const task = getFromStorage(key); // reads the key from AsyncStorage (not shown)
    keyRetrievalTasks.set(key, task);

    task
        // Fill cache after reading from disk
        .then(data => cache.set(key, data))
        // Remove the task after it completes
        .finally(() => keyRetrievalTasks.delete(key));

    return task;
}
```

The cache logic is just this:

```js
// When we have a cached value, quickly resolve with that
if (cache.has(key)) return Promise.resolve(cache.get(key));
// ...
cache.set(key, data);
```

The rest is an optimization for when the same key is retrieved multiple times while we're already reading from disk - something that needs addressing in any case.

Could you implement the same by reading data from the React tree/state? Probably, but:

I won't delve into the
I have made some extensive tests capturing only the performance of Onyx.get.

**Current Onyx.get performance**

Tested on emulators and physical devices. Sample for switching chats on a physical Android device:

**Onyx.get performance after suggested changes**

Just the modification to

The same test as above:

**The changes I've made to capture these metrics**

Changes to
Wow!!! Those are some insane results!
What's suggested in slack is a similar improvement, but there are some complications you might run into with that approach.

**Keeping a map of connections by key**

This is fine, though each key will hold a *list* of connections. The instances in the list would all have the same value; they are just different connections to the same key by different components. So you just need the first element of that list...

**Retrieving the value from `withOnyxInstance.state`**

This is the real problem with this solution. `state` is a component state. When you use it inside the `get()` method, you have no guarantee something is not about to call `setState` and update it. So you'll need to cover this with additional logic that cannot be put inside the `get()` method.

- by storing a ref as early as possible, my solution always has access to the latest value without extra effort

**Multiple calls for the same value would still initiate multiple disk reads for the same key**

When no existing connection has the needed value, like during init, and multiple components are requesting this value, several separate disk reads are initiated. They might happen in parallel, but they still cause n times the bridge flooding at the most intensive time for the app.
Peter, thank you for all this data and information. It's going to be super helpful as we decide the best path forward!
Right now Onyx works fine on iOS. On Android, though, not so much. We can blame it on AsyncStorage, but there are ways to improve the usage of AsyncStorage.

### Querying AsyncStorage

- `withOnyx` receives a key -> val mapping
- `this.connectMappingToOnyx(mapping, propertyName)` -> new connection via `Onyx.connect`
- `Onyx.connect` resolves the `matchingKeys`, and each of the `matchingKeys` makes a separate call to `AsyncStorage.getItem`: `get(key) -> AsyncStorage.getItem(key)`
- Any error from `AsyncStorage.getItem` is swallowed, and end users of the Onyx library won't be able to intercept and handle errors themselves.

Now all of the above happens for a single `withOnyx` usage, e.g. one component wrapped with Onyx. In the simplest scenario, each prop requested from Onyx would make a separate call to `AsyncStorage.getItem`. So let's say 10 components use data from Onyx and each reads 2 props from Onyx. That's a total of 20 `getItem` calls. In practice the calls would be a lot more, and this happens during startup.
### What happens under the hood

Every `getItem` call makes a new SQL query. The rest is just proving that statement - you can skip to Improvements if you trust me 😉

AsyncStorage for Android uses SQLite internally. Tracing this down, you can see that every `getItem` call makes a new SQL query.

### Improving
#### Using `multiGet`

The easiest improvement would be to use the optimization provided by AsyncStorage - `AsyncStorage.multiGet`. At least for the code here:

react-native-onyx/lib/Onyx.js, lines 320 to 324 in 121c521
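For reference, `AsyncStorage.multiGet` resolves with an array of `[key, value]` pairs. A sketch of the shape, using an in-memory stub standing in for AsyncStorage so the snippet runs anywhere (the keys and values are examples):

```javascript
// One multiGet round trip replaces N separate getItem calls.
// backingStore and AsyncStorageStub are stand-ins for the real module.
const backingStore = new Map([
    ['session', '{"authToken":"..."}'],
    ['currentURL', '"/r/1"'],
]);
const AsyncStorageStub = {
    // Mimics AsyncStorage.multiGet: resolves with [key, value] pairs,
    // with null for missing keys
    multiGet: keys => Promise.resolve(keys.map(key => [key, backingStore.get(key) ?? null])),
};

AsyncStorageStub.multiGet(['session', 'currentURL', 'missing']).then((pairs) => {
    // One bridge crossing instead of three getItem calls
    for (const [key, value] of pairs) {
        console.log(key, value);
    }
});
```

Collapsing the per-key `getItem` calls in `Onyx.connect` into one `multiGet` call per connection would already cut the bridge traffic substantially.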
#### Batch `Onyx.connect` keys

The next best thing: don't call AsyncStorage as soon as `Onyx.connect` is called. Wait for a brief time, or debounce the execution of AsyncStorage to run after all the calls to `Onyx.connect`. Make a batch, and each time `Onyx.connect` is called add to that batch; after the debounce period expires (let's say 5ms), fetch all the keys using `multiGet` and respond to the listeners.

When all components are initially mounting (app startup), this should batch all the necessary keys and make just one query to AsyncStorage. It would also remove duplicate queries when multiple components use the same Onyx key.
#### In Memory Proxy

The next next best thing: don't always read from file storage; make an "in memory proxy" and serve recently accessed keys from it. Let's say there are 5 components subscribing to `CURRENTLY_VIEWED_REPORTID`; each of them would retrieve the value from file storage. Instead, only the initial call should read from file and return the value to the rest.
#### Note on Onyx.set

Onyx.set does not resolve optimistically; it first waits to persist to file. So Onyx.set would not immediately update listeners, but only after the write, and only if it's successful: `AsyncStorage.setItem` -> `keyChanged(key, val)`.

You could make the change immediately. If it happens to fail, you can retry in the background or handle it some other way. E.g. when the value of `CURRENTLY_VIEWED_REPORTID` updates, let listeners know right away no matter what.
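An optimistic `set` along those lines could look roughly like this. This is a sketch: the subscriber registry is simplified to one callback list per key, and everything except the `setItem(key, value)` shape is a made-up name:

```javascript
// Optimistic set: notify subscribers immediately, queue the disk write after.
const subscribers = new Map(); // key -> list of callbacks
const storage = {setItem: (key, value) => Promise.resolve()}; // stand-in for AsyncStorage.setItem

function connect(key, callback) {
    if (!subscribers.has(key)) subscribers.set(key, []);
    subscribers.get(key).push(callback);
}

function set(key, value) {
    // 1. Update listeners right away - the UI does not wait for the disk
    (subscribers.get(key) || []).forEach(callback => callback(value));

    // 2. Queue the write; on failure, retry in the background or reconcile later
    return storage.setItem(key, JSON.stringify(value))
        .catch(() => { /* retry / handle some other way */ });
}

connect('CURRENTLY_VIEWED_REPORTID', id => console.log('listener got', id));
set('CURRENTLY_VIEWED_REPORTID', 42); // prints "listener got 42" synchronously
```

The trade-off is that listeners can briefly observe a value that later fails to persist, which is why a retry or reconciliation step belongs next to the queued write.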