Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix/memoize-block-commit-descendancy #3048

Merged
merged 34 commits into from
Apr 9, 2022
Merged

fix/memoize-block-commit-descendancy #3048

merged 34 commits into from
Apr 9, 2022

Conversation

jcnelson
Copy link
Member

@jcnelson jcnelson commented Feb 14, 2022

This PR addresses #3045 by adding a new field parent_sortition_id to the block_commits table in the sortition DB, which shall be equal to the sortition ID of the parent block-commit. This field is then used in SortitionHandleTx::descended_from to scan the list of block-commit ancestors to see if the winning block at a given height descends from a PoX anchor block.

Using this field is much, much faster than querying the sortition DB MARF for each block height in a reward cycle. In fact, the former behavior is the limiting factor in booting up a node with #3044 applied.

NOTE: this changes the sortition DB schema in an incompatible way. The release that incorporates this PR as-is will need a chainstate version number increment.

…tition_id is the sortition ID of the parent block commit. Use this field to *quickly* search the ancestor block-commits in SortitionHandleTx::descended_from() to see if a block-commit descends from an anchor block (the MARF is no longer queried)
@codecov
Copy link

codecov bot commented Feb 14, 2022

Codecov Report

Merging #3048 (9a5e390) into develop (659b2c4) will increase coverage by 0.09%.
The diff coverage is 94.88%.

@@             Coverage Diff             @@
##           develop    #3048      +/-   ##
===========================================
+ Coverage    83.60%   83.69%   +0.09%     
===========================================
  Files          248      248              
  Lines       196992   197466     +474     
===========================================
+ Hits        164701   165276     +575     
+ Misses       32291    32190     -101     
Impacted Files Coverage Δ
src/chainstate/stacks/db/blocks.rs 89.04% <ø> (ø)
src/net/download.rs 41.17% <0.00%> (-0.11%) ⬇️
testnet/stacks-node/src/neon_node.rs 82.18% <50.00%> (-0.12%) ⬇️
testnet/stacks-node/src/run_loop/neon.rs 80.93% <60.00%> (+0.30%) ⬆️
src/net/relay.rs 32.99% <73.58%> (+0.24%) ⬆️
src/net/p2p.rs 58.48% <80.00%> (-0.05%) ⬇️
src/chainstate/burn/db/sortdb.rs 95.48% <98.78%> (+0.32%) ⬆️
src/util_lib/db.rs 81.80% <100.00%> (-1.17%) ⬇️
src/net/atlas/download.rs 82.38% <0.00%> (-1.29%) ⬇️
... and 24 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9f60ce1...9a5e390. Read the comment docs.

@jcnelson
Copy link
Member Author

Feedback: don't change the chainstate schema in an incompatible way; instead, re-calculate parent_sortition_id for each block_commit row on boot-up with a chainstate that does not yet have it.

Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, just had a few superficial comments. Also, please update the CHANGELOG.md

reduce DB contention between relayer and chains coordinator threads
Copy link
Contributor

@gregorycoppola gregorycoppola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like it's a win! Thanks. Two questions about the PR description:

  1. Is this really true "This PR addresses [chainstate] memoize block-commit descendancy information for SortitionHandleTx::descended_from #3045 by adding a new field parent_sortition_id to the block_commits table in the sortition DB"? Seems like you are adding a new table block_commit_parents.
  2. Would be nice if we could quantify "Using this field is much, much faster", but can skip this if it's a huge hassle for some reason.

@@ -2398,8 +2435,8 @@ impl SortitionDB {
pub fn is_db_version_supported_in_epoch(epoch: StacksEpochId, version: &str) -> bool {
match epoch {
StacksEpochId::Epoch10 => false,
StacksEpochId::Epoch20 => (version == "1" || version == "2"),
StacksEpochId::Epoch2_05 => version == "2",
StacksEpochId::Epoch20 => (version == "1" || version == "2" || version == "3"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarification: does epoch20 support version 3?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. The node passes through epoch 2.0 when it's booting. So, if you stop and restart a node while it's still in epoch 2.0, this check will be run and it should pass (since the sortition DB will have the latest schema).

Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but this should have an entry in the changelog that at least mentions that the update will apply a schema migrations (with a rough time estimate), so that users can expect the delay when they restart their nodes.

@jcnelson
Copy link
Member Author

Is this really true "This PR addresses #3045 by adding a new field parent_sortition_id to the block_commits table in the sortition DB"? Seems like you are adding a new table block_commit_parents.

This is no longer true, per feedback given here: #3048 (comment). In order to keep the sortition DB compatible with existing sortition DBs, a new table was added instead.

Would be nice if we could quantify "Using this field is much, much faster", but can skip this if it's a huge hassle for some reason.

Processing a sortition is about 10x faster now, when considering later sortitions (where the sortition DB MARF is bigger). It takes about 0.3s to process a sortition with this PR, and it took about 3s without it.

@gregorycoppola
Copy link
Contributor

Is this really true "This PR addresses #3045 by adding a new field parent_sortition_id to the block_commits table in the sortition DB"? Seems like you are adding a new table block_commit_parents.

This is no longer true, per feedback given here: #3048 (comment). In order to keep the sortition DB compatible with existing sortition DBs, a new table was added instead.
Can you update the description before checking in so we aren't confused later?

Would be nice if we could quantify "Using this field is much, much faster", but can skip this if it's a huge hassle for some reason.

Processing a sortition is about 10x faster now, when considering later sortitions (where the sortition DB MARF is bigger). It takes about 0.3s to process a sortition with this PR, and it took about 3s without it.

:)!

Copy link
Contributor

@gregorycoppola gregorycoppola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcnelson please update the PR description to match the change!

CHANGELOG.md Outdated
Comment on lines 11 to 12
- Sortition processing performance has been improved by about an order of
magnitude, by avoiding a slew of expensive database reads (#3048).
magnitude, by avoiding a slew of expensive database reads (#3045).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should mention (and warn) about the schema migration. Restarting a node, using the an existing chainstate directory, with this change will require more time than other updates have in the past, so users should get some kind of warning in the changelog (either in this entry, or at the top level content for the release).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@jcnelson
Copy link
Member Author

I noticed a couple bugs in the implementation when testing this on a live migration:

  • The migration of the canonical sortition history takes 3-4 hours. There needs to be a bigger warning about this.
  • The migration logic panics for non-canonical sortition histories (i.e. ones that aren't on the canonical PoX fork). This shouldn't happen.

@jcnelson
Copy link
Member Author

Don't bother doing a migration; just fall back to the old code if the block_commit_parents table doesn't exist or is filled with NULL data.

…ead, use it opportunistically, and if it doesn't yield any data, do the expensive MARF read
Copy link
Member

@kantai kantai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@jcnelson jcnelson merged commit 280d574 into develop Apr 9, 2022
jcnelson added a commit that referenced this pull request Apr 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants