Improve performance of Blockstore.AllKeysChan() #3376
Comments
All of my proposed changes have been merged. With some additional work I was able to improve the time even more:
(Note that due to a different load on my system the numbers are slightly different from the previous run.)
@kevina so the final result will be from roughly 3000ms to 205ms?
@whyrusleeping Yes, more or less. To get a really accurate number I would need to rebuild the initial binary and rerun the benchmark code for the original and final versions several times to normalize the effect of the load on my system. So the speedup is well over an order of magnitude, somewhere between 14 and 15 times.
@kevina thats awesome :)
@whyrusleeping Thanks. Once ipfs/go-datastore#56 is merged and the dependencies are updated, then I will consider this issue done and we can close this. Not using a channel for datastore queries and other possible improvements can be dealt with separately.
Do you think that batching keys in the call (sending slices instead of keys) would help?
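A minimal sketch of what the batching suggestion above could look like, purely for illustration: AllKeysChanBatched is not an existing go-ipfs API, and since it wraps the current per-key channel it only shows the shape of a batched interface; a real gain would require the datastore query itself to produce slices.

```go
package bench

import (
	"context"

	cid "github.com/ipfs/go-cid"
	blockstore "github.com/ipfs/go-ipfs-blockstore" // import path has moved over the years
)

// AllKeysChanBatched is a hypothetical batched variant of AllKeysChan: it
// groups keys into slices so the consumer pays one channel receive per
// batchSize keys instead of one per key. Built here on top of the existing
// per-key channel, so it demonstrates the API shape, not the speedup.
func AllKeysChanBatched(ctx context.Context, bs blockstore.Blockstore, batchSize int) (<-chan []cid.Cid, error) {
	in, err := bs.AllKeysChan(ctx)
	if err != nil {
		return nil, err
	}
	out := make(chan []cid.Cid, 1)
	go func() {
		defer close(out)
		batch := make([]cid.Cid, 0, batchSize)
		for c := range in {
			batch = append(batch, c)
			if len(batch) == batchSize {
				select {
				case out <- batch:
				case <-ctx.Done():
					return
				}
				batch = make([]cid.Cid, 0, batchSize)
			}
		}
		if len(batch) > 0 {
			select {
			case out <- batch:
			case <-ctx.Done():
			}
		}
	}()
	return out, nil
}
```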
All my changes are now in. I consider this done. Closing. |
@whyrusleeping asked me to look into improving the performance of Blockstore.AllKeysChan. I was able to improve performance by around an order of magnitude by implementing a series of fairly simple fixes (a sketch of the measurement loop follows the results below).
Initial results for around 100,000 small (1000 byte) blocks: 3095.821125 ± 98.089091 ms
After fixing channel buffer size in namespace datastore (ipfs/go-datastore#52): 1428.792103 ± 37.367283 ms
After removing expensive debug reporting in the blockstore (#3378): 529.585905 ± 18.348316 ms
After using the optimized Walk in flatfs (ipfs/go-ds-flatfs#3): 355.931681 ± 7.984393 ms
After avoiding unnecessary work in the conversion to Cid (#3379): 357.190799 ± 9.451660 ms
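The benchmark harness behind these numbers is not included in the issue; a minimal sketch of the kind of measurement loop involved, assuming an already populated blockstore, might look like this:

```go
package bench

import (
	"context"
	"time"

	blockstore "github.com/ipfs/go-ipfs-blockstore" // import path has moved over the years
)

// timeAllKeys drains AllKeysChan once and reports how many keys were seen and
// how long the full scan took. The real harness used for the numbers above is
// not shown in the issue; this is only the shape of such a measurement.
func timeAllKeys(ctx context.Context, bs blockstore.Blockstore) (int, time.Duration, error) {
	start := time.Now()
	ch, err := bs.AllKeysChan(ctx)
	if err != nil {
		return 0, 0, err
	}
	n := 0
	for range ch { // each receive is one cid.Cid; we only count them here
		n++
	}
	return n, time.Since(start), nil
}
```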
When measuring real time, that is about all the improvement I can get on a dual-core system (with hyper-threading providing 4 virtual cores). (I am also getting 265% CPU time on the last two runs, so it could also be that hyper-threading is slowing things down in this case.) So I reran the last two tests while pinning the process to a single CPU (using taskset on Linux).
After using the optimized Walk in flatfs (ipfs/go-ds-flatfs#3): 520.779686 ± 2.332627 ms
After avoiding unnecessary work in the conversion to Cid (#3379): 494.916185 ± 9.780639 ms
So the last optimization has a small but measurable improvement on a single core system.
Possible future improvements
The changes behind the performance boosts so far were self-contained and should be uncontroversial. Although not part of this issue, further improvement is likely possible. Here are some additional results that measure the theoretical performance gains still available, but these will likely involve some API changes (a sketch of the direct-query experiment follows the results below).
Single CPU Results:
Results from configuring the blockstore to query the flatfs datastore directly: 295.838959 ± 4.161743 ms
Results from querying the Datastore directly and converting the key to a Cid: 255.059754 ± 5.210151 ms
Results from querying the Datastore directly with no conversion to Cid: 151.604514 ± 1.888088 ms
This seems to imply that the conversion to Cid takes up a lot of CPU time, but see below.
Multi CPU Results:
Results from configuring the blockstore to query the flatfs datastore directly: 208.996977 ± 10.234978 ms
Results from querying the Datastore directly and converting the key to a Cid: 190.639903 ± 10.407054 ms
Results from querying the Datastore directly with no conversion to Cid: 185.266868 ± 8.534578 ms
The conversion to Cid on a multi-core system may take a lot of CPU time, but it does not affect the overall runtime thanks to the use of goroutines.
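For reference, a rough sketch of the "query the Datastore directly" experiments, assuming the older context-free go-datastore Query signature; keyToCid is a hypothetical stand-in for the real key-to-Cid conversion helper, and the base32 key layout it assumes is illustrative rather than canonical:

```go
package bench

import (
	"encoding/base32"
	"strings"

	cid "github.com/ipfs/go-cid"
	ds "github.com/ipfs/go-datastore"
	dsq "github.com/ipfs/go-datastore/query"
)

// keysDirect enumerates every key under /blocks straight from the datastore,
// bypassing the Blockstore wrapper. With convert=false it measures the raw
// enumeration cost; with convert=true it also pays for the key-to-Cid step.
func keysDirect(dstore ds.Datastore, convert bool) (int, error) {
	res, err := dstore.Query(dsq.Query{Prefix: "/blocks", KeysOnly: true})
	if err != nil {
		return 0, err
	}
	defer res.Close()

	n := 0
	for r := range res.Next() {
		if r.Error != nil {
			return n, r.Error
		}
		if convert {
			if _, err := keyToCid(r.Key); err != nil {
				return n, err
			}
		}
		n++
	}
	return n, nil
}

// keyToCid is a hypothetical helper: it assumes keys look like
// "/blocks/<base32-of-cid-bytes>" and casts the decoded bytes to a Cid.
// The real conversion in go-ipfs lives in its datastore helper code.
func keyToCid(key string) (cid.Cid, error) {
	name := strings.TrimPrefix(key, "/blocks/")
	raw, err := base32.StdEncoding.WithPadding(base32.NoPadding).DecodeString(name)
	if err != nil {
		return cid.Cid{}, err
	}
	return cid.Cast(raw)
}
```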