[chainstate] store large MARF tries in a separate flat file #3059

Closed
jcnelson opened this issue Feb 21, 2022 · 2 comments

@jcnelson
Member

Once a blob is larger than the SQLite page size (4096 bytes by default), loading data from it starts to become expensive. Per SQLite's own benchmarks [1], once the blob size exceeds 100 KB, reading from a file is faster no matter what page size is used. This suggests to me that the TrieFileStorage system should maintain a separate flat file for storing tries that exceed 100 KB. For bigger blocks, that is most of them: as of right now, 31,428 out of 54,400 tries in the vm/clarity/marf.sqlite file (almost 60%) exceed 100 KB. The SQLite DB would then store each trie blob's offset and length within this flat file instead of the blob itself, and a struct implementing Read + Seek would provide access to it.

[1] https://www.sqlite.org/intern-v-extern-blob.html
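
To make the proposal concrete, here is a minimal sketch of what such a Read + Seek wrapper might look like. This is not the actual TrieFileStorage code: the name `TrieBlobReader` and its fields are hypothetical, and `offset` and `length` stand in for the values the SQLite row would hold. The idea is only that all reads and seeks are translated into blob-relative coordinates and clamped so they never run into a neighboring trie.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical view over one trie blob stored inside a larger flat file.
/// `offset` and `length` are assumed to come from the SQLite row for this trie.
pub struct TrieBlobReader<F: Read + Seek> {
    inner: F,
    offset: u64, // where the blob starts within the flat file
    length: u64, // size of the blob in bytes
    pos: u64,    // cursor, relative to the start of the blob
}

impl<F: Read + Seek> TrieBlobReader<F> {
    pub fn new(inner: F, offset: u64, length: u64) -> Self {
        TrieBlobReader { inner, offset, length, pos: 0 }
    }
}

impl<F: Read + Seek> Read for TrieBlobReader<F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Clamp the request so it never reads past the end of this blob.
        let remaining = self.length.saturating_sub(self.pos);
        let want = (buf.len() as u64).min(remaining) as usize;
        if want == 0 {
            return Ok(0);
        }
        // Position the underlying file at the absolute location of the cursor.
        self.inner.seek(SeekFrom::Start(self.offset + self.pos))?;
        let n = self.inner.read(&mut buf[..want])?;
        self.pos += n as u64;
        Ok(n)
    }
}

impl<F: Read + Seek> Seek for TrieBlobReader<F> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Compute the new blob-relative position in a wider integer type
        // so that negative or oversized results can be detected.
        let target: i128 = match from {
            SeekFrom::Start(p) => p as i128,
            SeekFrom::End(d) => self.length as i128 + d as i128,
            SeekFrom::Current(d) => self.pos as i128 + d as i128,
        };
        self.pos = u64::try_from(target).map_err(|_| {
            io::Error::new(io::ErrorKind::InvalidInput, "seek outside of blob range")
        })?;
        Ok(self.pos)
    }
}
```

Wrapping a `std::fs::File` (or a `std::io::Cursor` in tests) with the stored offset and length would let existing code that expects a `Read + Seek` source work unchanged, whether the trie lives in the flat file or elsewhere.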

@jcnelson jcnelson self-assigned this Feb 21, 2022
@jcnelson
Member Author

In a test with 16,384 MARF inserts over 32 blocks, reading nodes is over an order of magnitude faster for tries stored in a flat file:

Total nodes read: 2,527,929
Total time spent reading nodes stored in external file: 7,667,938,853 ns
Total time spent reading nodes stored in SQLite blobs: 109,299,929,610 ns

It takes just over 3 microseconds to read a node if it's in an external file. It takes about 43 microseconds if it's in a SQLite blob.

Also, this is with opt-level = 3.
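
For anyone checking the arithmetic, the per-node figures follow from dividing each total by the node count. The sketch below assumes both totals cover the same 2,527,929 node reads, which is how the numbers above read; the function and constant names are only illustrative.

```rust
/// Totals copied from the measurements above.
pub const NODES_READ: u64 = 2_527_929;
pub const EXTERNAL_FILE_NS: u64 = 7_667_938_853;
pub const SQLITE_BLOB_NS: u64 = 109_299_929_610;

/// Average nanoseconds spent per node read.
pub fn avg_ns_per_node(total_ns: u64, nodes: u64) -> f64 {
    total_ns as f64 / nodes as f64
}

/// How many times slower the SQLite blob path is than the external file path.
pub fn slowdown_factor() -> f64 {
    SQLITE_BLOB_NS as f64 / EXTERNAL_FILE_NS as f64
}
```

This gives roughly 3,033 ns per node for the external file and roughly 43,237 ns per node for SQLite blobs, a ratio of a little over 14x.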

@blockstack-devops
Contributor

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@stacks-network stacks-network locked as resolved and limited conversation to collaborators Nov 11, 2024