[chainstate] store large MARF tries in a separate flat file #3059

Closed
jcnelson opened this issue Feb 21, 2022 · 1 comment
@jcnelson
Member

Once a blob grows larger than the SQLite page size (4096 bytes by default), loading data from it starts to become expensive. Per SQLite's own benchmarks [1], once the blob size exceeds 100 KB, reading from a file is faster no matter what page size is used. This suggests to me that the TrieFileStorage system should maintain a separate flat file for storing tries that exceed 100 KB. That covers most of the tries for bigger blocks: as of right now, 31,428 out of 54,400 tries in the vm/clarity/marf.sqlite file (about 58%) exceed 100 KB. Instead of holding the blob itself, the SQLite DB would store the trie blob's offset and length within the flat file, and the storage system would provide a struct implementing Read + Seek for accessing it.
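
To make the offset-and-length idea concrete, here is a minimal sketch of what such a Read + Seek wrapper could look like. It is not the actual TrieFileStorage code: the name `BlobWindow` and its fields are hypothetical, and an in-memory `Cursor` stands in for the flat file so the example runs on its own.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical view over one trie blob stored at `offset..offset + length`
/// inside a larger flat file. All names here are illustrative assumptions.
pub struct BlobWindow<F: Read + Seek> {
    inner: F,    // the flat file (or any Read + Seek source)
    offset: u64, // where the blob starts in the flat file
    length: u64, // blob size in bytes
    pos: u64,    // current position, relative to the start of the blob
}

impl<F: Read + Seek> BlobWindow<F> {
    pub fn new(inner: F, offset: u64, length: u64) -> Self {
        BlobWindow { inner, offset, length, pos: 0 }
    }
}

impl<F: Read + Seek> Read for BlobWindow<F> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Never read past the end of this blob into a neighboring one.
        let remaining = self.length.saturating_sub(self.pos);
        if remaining == 0 || buf.is_empty() {
            return Ok(0);
        }
        let max = std::cmp::min(remaining, buf.len() as u64) as usize;
        self.inner.seek(SeekFrom::Start(self.offset + self.pos))?;
        let n = self.inner.read(&mut buf[..max])?;
        self.pos += n as u64;
        Ok(n)
    }
}

impl<F: Read + Seek> Seek for BlobWindow<F> {
    fn seek(&mut self, from: SeekFrom) -> io::Result<u64> {
        // Resolve the target relative to the blob, not the whole file.
        let target: i128 = match from {
            SeekFrom::Start(n) => n as i128,
            SeekFrom::End(d) => self.length as i128 + d as i128,
            SeekFrom::Current(d) => self.pos as i128 + d as i128,
        };
        let pos = u64::try_from(target).map_err(|_| {
            io::Error::new(io::ErrorKind::InvalidInput, "seek outside of blob range")
        })?;
        self.pos = pos;
        Ok(pos)
    }
}

fn main() -> io::Result<()> {
    // Stand-in for the flat file: 4 bytes of padding, a 10-byte blob, more padding.
    let file = Cursor::new(b"AAAAhello trieBBBB".to_vec());
    // The (offset, length) pair is what the SQLite row would hold.
    let mut blob = BlobWindow::new(file, 4, 10);

    let mut out = Vec::new();
    blob.read_to_end(&mut out)?;
    assert_eq!(out, b"hello trie".to_vec());

    // Seeking is relative to the blob, so node pointers can stay blob-local.
    blob.seek(SeekFrom::Start(6))?;
    let mut tail = String::new();
    blob.read_to_string(&mut tail)?;
    assert_eq!(tail, "trie");
    Ok(())
}
```

Keeping positions blob-relative means code that already walks a trie through a Read + Seek handle would not need to know whether the bytes come from a SQLite blob or from the flat file.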

[1] https://www.sqlite.org/intern-v-extern-blob.html

@jcnelson jcnelson self-assigned this Feb 21, 2022
@jcnelson
Member Author

In a test with 16,384 MARF inserts over 32 blocks, reading nodes from tries stored in a flat file is over an order of magnitude faster than reading them from SQLite blobs:

Total nodes read: 2,527,929
Total time spent reading nodes stored in external file: 7,667,938,853 ns
Total time spent reading nodes stored in SQLite blobs: 109,299,929,610 ns

It takes just over 3 microseconds to read a node if it's in an external file. It takes about 43 microseconds if it's in a SQLite blob.

Also, this is with opt-level = 3.
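
For anyone who wants to check the per-node figures, they follow directly from the totals above. This is a small sketch of the arithmetic only, using a hypothetical helper `per_node_ns`, and assuming the node count of 2,527,929 applies to both measurements:

```rust
/// Hypothetical helper: average time per node read, in nanoseconds.
fn per_node_ns(total_ns: u64, nodes: u64) -> f64 {
    total_ns as f64 / nodes as f64
}

fn main() {
    let nodes: u64 = 2_527_929;
    let file_ns = per_node_ns(7_667_938_853, nodes); // external flat file
    let blob_ns = per_node_ns(109_299_929_610, nodes); // SQLite blobs

    println!("external file: {:.1} ns per node", file_ns);
    println!("SQLite blob:   {:.1} ns per node", blob_ns);
    println!("slowdown:      {:.1}x", blob_ns / file_ns);
}
```

The averages come out to roughly 3.03 microseconds per node for the external file and 43.2 microseconds per node for SQLite blobs, a ratio of about 14x.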
