MARF Performance Improvements #3044

Merged: 83 commits merged into develop on Apr 13, 2022
Conversation

@jcnelson (Member) commented Feb 10, 2022

This PR significantly improves the read and write paths for MARFs, by a factor of between 10x and 200x depending on how many paths are inserted into each trie. It fixes #3059, #3042, and #3041, and does some work towards fixing #3014.

On the write path, this code adds the ability to defer calculating a trie's node hashes until the trie is ready to be dumped to disk. This leads to a significant performance improvement because an intermediate node's hash is now calculated at most once, whereas before it could be calculated up to as many times as there were leaf inserts. Over 90% of the write path's wall-clock time was spent calculating node hashes that would later be overwritten by subsequent hash calculations.
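To make the deferral concrete, here is a minimal sketch of the idea using hypothetical, simplified types (Node, DeferredTrie, and toy_hash are illustrative stand-ins, not the actual MARF structures): inserts only clear cached hashes, and every node is hashed exactly once, bottom-up, when the trie is dumped.

```rust
/// Hypothetical node type for illustration -- not the real TrieNode.  Each
/// node keeps a cached hash that is cleared whenever the node changes, and
/// recomputed only at dump time.
struct Node {
    children: Vec<usize>,   // indices of child nodes in the arena below
    data: Vec<u8>,          // node payload
    hash: Option<[u8; 32]>, // cached hash; None while deferred
}

struct DeferredTrie {
    nodes: Vec<Node>, // simple arena of nodes; index 0 is the root
}

impl DeferredTrie {
    /// Insert-time work never computes hashes: it only clears the cached
    /// hash on the nodes it modifies.
    fn touch(&mut self, idx: usize) {
        self.nodes[idx].hash = None;
    }

    /// Compute each node's hash at most once, children first.  Called only
    /// when the trie is ready to be dumped to disk; returns the root hash
    /// when invoked with the root index.
    fn hash_node(&mut self, idx: usize) -> [u8; 32] {
        if let Some(h) = self.nodes[idx].hash {
            return h;
        }
        let children = self.nodes[idx].children.clone();
        let mut preimage = self.nodes[idx].data.clone();
        for child in children {
            preimage.extend_from_slice(&self.hash_node(child));
        }
        let h = toy_hash(&preimage);
        self.nodes[idx].hash = Some(h);
        h
    }
}

/// Stand-in for the real hash function (the MARF uses a SHA-2 family hash);
/// any 32-byte digest works for the sketch.
fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] ^= *b;
    }
    out
}
```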

On the read path, this code does two things:

  • It adds a caching layer for nodes, so that miners and node operators with lots of RAM can opt to keep some or all of the trie nodes in memory. The current cache strategies (to be expanded in later PRs) are noop (the default; does nothing), everything (caches all nodes once they are loaded), and node256 (caches all TrieNode256 nodes once they are loaded); see the cache-strategy sketch after this list. The new cache layer also subsumes the caches for holding block hashes.

  • It allows the MARF owner to opt to store tries in a separate flat file, called a .blobs file. Before, the tries were stored as sqlite blobs in the database. However, for large tries, such as those produced by the Clarity VM, the blob paging logic in sqlite significantly degraded read-path performance when walking tries. Storing large tries (i.e. tries bigger than one page) in a flat file removes sqlite from the trie node read path altogether, and yields up to a 14x improvement. In addition, this PR sets the sqlite page size to 32k (we were using the default of 4k), which is big enough to hold the other MARF tries from the sortition DB and headers DB in a single sqlite page; see the page-size sketch after this list.
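A rough sketch of the cache-strategy selection described in the first bullet. The strategy names (noop, everything, node256) come from this PR's description, but the surrounding types (NodePtr, CachedNode, CacheStrategy) are simplified stand-ins, not the actual cache implementation:

```rust
use std::collections::HashMap;

/// Simplified stand-ins for a trie node pointer and an in-RAM node.
type NodePtr = u64;

#[derive(Clone)]
enum NodeKind {
    Node256, // wide interior node
    Other,   // everything else (leaves, narrower interior nodes)
}

#[derive(Clone)]
struct CachedNode {
    kind: NodeKind,
    bytes: Vec<u8>,
}

/// The three strategies named in the PR description.
enum CacheStrategy {
    /// Default: cache nothing.
    Noop,
    /// Cache every node once it has been loaded from disk.
    Everything(HashMap<NodePtr, CachedNode>),
    /// Cache only TrieNode256 nodes, the widest and most expensive to reload.
    Node256(HashMap<NodePtr, CachedNode>),
}

impl CacheStrategy {
    fn store(&mut self, ptr: NodePtr, node: CachedNode) {
        match self {
            CacheStrategy::Noop => {}
            CacheStrategy::Everything(map) => {
                map.insert(ptr, node);
            }
            CacheStrategy::Node256(map) => {
                if matches!(node.kind, NodeKind::Node256) {
                    map.insert(ptr, node);
                }
            }
        }
    }

    fn load(&self, ptr: &NodePtr) -> Option<&CachedNode> {
        match self {
            CacheStrategy::Noop => None,
            CacheStrategy::Everything(map) | CacheStrategy::Node256(map) => map.get(ptr),
        }
    }
}
```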
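The 32k page size mentioned in the second bullet is a standard SQLite setting. Assuming the database is opened through the rusqlite crate and is still empty (PRAGMA page_size only takes effect before the first tables are written, or after a VACUUM), it could be applied like this:

```rust
use rusqlite::{Connection, Result};

/// Open (or create) a database and request 32 KB pages.  This must run
/// before any tables are created for the new page size to take effect.
fn open_with_32k_pages(path: &str) -> Result<Connection> {
    let conn = Connection::open(path)?;
    conn.execute_batch("PRAGMA page_size = 32768;")?;
    Ok(conn)
}
```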

In addition, this PR introduces a simple profiler for measuring how long certain hot paths take. New measurements can be added incrementally.
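As an illustration of the kind of measurement such a profiler enables (the HotPathTimer type below is hypothetical, not the API this PR adds), a scope-based timer around std::time::Instant is enough to attribute wall-clock time to a named hot path:

```rust
use std::time::Instant;

/// Hypothetical scope timer: records when it was created and reports the
/// elapsed wall-clock time when it goes out of scope.
struct HotPathTimer {
    label: &'static str,
    start: Instant,
}

impl HotPathTimer {
    fn begin(label: &'static str) -> Self {
        HotPathTimer { label, start: Instant::now() }
    }
}

impl Drop for HotPathTimer {
    fn drop(&mut self) {
        eprintln!("{}: {:?}", self.label, self.start.elapsed());
    }
}

fn dump_trie_to_disk() {
    let _timer = HotPathTimer::begin("dump_trie_to_disk");
    // ... hot path being measured ...
}
```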

Without this PR, a node running master takes about 3 days to boot from genesis. With this PR, it takes about 28 hours.

Don't let the diff line count intimidate you. A lot of it comes from unit test code that had to be indented by one level so that each unit test could be run with different MARFOpenOpts values.

…are used when storing key/value pairs; also, refactor all unit tests to use different variations of MARFOpenOpts
…re ready to dump a TrieRAM to disk, and to cache nodes from disk-backed tries. This considerably speeds up the write path for the MARF. A TrieRAM, now folded into a LastExtended enum, can now be in either a read/write state or a sealed state; in the latter case, only reads and commit/abort are permitted, and the act of converting a read/write TrieRAM to a sealed TrieRAM calculates all node hashes and returns the MARF root hash.
…ng mode. Also, refactor all tests to use different hashing and node caching modes.
… of opening the block, storing key/value pairs, and sealing the block in put_indexed_all()
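The read/write versus sealed split described in the second commit message above can be pictured as a small state machine. The types below (RwTrieRam, SealedTrieRam, LastExtendedState) are simplified illustrations of that idea, not the actual implementation:

```rust
/// Simplified stand-ins; the real TrieRAM also tracks block identifiers and
/// per-node hashes.
struct RwTrieRam {
    nodes: Vec<Vec<u8>>, // serialized nodes accumulated while the trie is open
}

struct SealedTrieRam {
    nodes: Vec<Vec<u8>>,
    root_hash: [u8; 32],
}

/// Sketch of the two states: a trie being built is read/write; sealing it
/// computes every node hash exactly once and fixes the root hash.  A sealed
/// trie only supports reads and commit/abort.
enum LastExtendedState {
    ReadWrite(RwTrieRam),
    Sealed(SealedTrieRam),
}

impl RwTrieRam {
    /// Consume the read/write trie, hash all nodes, and return the sealed
    /// form together with the MARF root hash.
    fn seal(self) -> (SealedTrieRam, [u8; 32]) {
        let root_hash = compute_all_node_hashes(&self.nodes);
        (SealedTrieRam { nodes: self.nodes, root_hash }, root_hash)
    }
}

/// Placeholder for the bottom-up hash pass over every node in the trie.
fn compute_all_node_hashes(nodes: &[Vec<u8>]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for node in nodes {
        for (i, b) in node.iter().enumerate() {
            out[i % 32] ^= *b;
        }
    }
    out
}
```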
@jcnelson marked this pull request as draft February 10, 2022 19:15
codecov bot commented Feb 10, 2022

Codecov Report

Merging #3044 (9ae7bb1) into develop (9a5e390) will decrease coverage by 0.09%.
The diff coverage is 79.51%.

@@             Coverage Diff             @@
##           develop    #3044      +/-   ##
===========================================
- Coverage    83.69%   83.60%   -0.10%     
===========================================
  Files          248      259      +11     
  Lines       197466   200544    +3078     
===========================================
+ Hits        165276   167670    +2394     
- Misses       32190    32874     +684     
Impacted Files Coverage Δ
src/chainstate/stacks/index/mod.rs 54.06% <0.00%> (-23.99%) ⬇️
src/chainstate/stacks/index/node.rs 85.66% <ø> (-12.18%) ⬇️
src/chainstate/stacks/index/test/node.rs 100.00% <ø> (ø)
src/main.rs 0.11% <0.00%> (-0.01%) ⬇️
stacks-common/src/types/mod.rs 93.22% <ø> (ø)
src/chainstate/stacks/index/proofs.rs 68.86% <14.28%> (-4.79%) ⬇️
src/chainstate/stacks/index/test/marf.rs 44.36% <44.36%> (ø)
testnet/stacks-node/src/tests/neon_integrations.rs 81.51% <50.00%> (-6.77%) ⬇️
src/chainstate/stacks/miner.rs 92.96% <53.84%> (-0.14%) ⬇️
testnet/stacks-node/src/node.rs 83.19% <61.53%> (+<0.01%) ⬆️
... and 82 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 280d574...9ae7bb1.

@gregorycoppola (Contributor)

Nice result. Will look. Thanks for sending.

Is there a command line to run the experiment?

@gregorycoppola (Contributor)

@jcnelson

I know how to run an experiment, I realized. You don't have to worry about sending it on any time frame.

Curious to know how you did it but I can also use the "mempool analyzer". (Except I think my current mempool state is not up to date.)

@kantai changed the title from "Feat/marf node cache" to "Draft: MARF Caching" on Feb 11, 2022
@kantai (Member) left a comment

This all looks great to me, just a few comments and questions.

One high-level comment is around cache control via environment variables. It would be much better for that to be controllable via the stacks-node::config module -- almost all of the configuration variables are currently controlled there, and avoiding adding more environment variable switched behaviors would be wise.
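For illustration only, the suggestion could look something like the sketch below if the cache setting moved into the config struct; the field name and accepted values are invented here and are not the actual stacks-node::config schema:

```rust
/// Hypothetical excerpt of a node config section.  The field name is made up
/// for illustration and is not the real stacks-node::config schema.
#[derive(Clone, Debug, Default)]
pub struct NodeConfigExcerpt {
    /// One of "noop", "everything", or "node256"; None means the default
    /// (no caching).
    pub marf_cache_strategy: Option<String>,
}

impl NodeConfigExcerpt {
    /// Validate the configured strategy at config-load time instead of
    /// reading an environment variable at the point of use.
    pub fn cache_strategy(&self) -> Result<String, String> {
        let s = self
            .marf_cache_strategy
            .clone()
            .unwrap_or_else(|| "noop".to_string());
        if matches!(s.as_str(), "noop" | "everything" | "node256") {
            Ok(s)
        } else {
            Err(format!("unknown MARF cache strategy: {}", s))
        }
    }
}
```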

@gregorycoppola (Contributor) left a comment

Seems like this is working so LGTM.

Thanks for adding all this new functionality!

@jcnelson requested a review from kantai April 13, 2022 17:14
@kantai (Member) left a comment

LGTM! Excited to see these improvements in the node.

@jcnelson merged commit 9dd09ab into develop on Apr 13, 2022