Design Change and Optimizations for 1-Second Finality #3722

Open
rlan35 opened this issue May 21, 2021 · 0 comments

Background

Block finality is the point at which reverting a block becomes impossible or prohibitively expensive. Faster block finality means faster transaction finality and a better user experience. Harmony’s block and transaction finality is currently 2 seconds, among the fastest of the industry-leading blockchains. The goal is to bring block finality down to 1 second.

Types of Consensus

Block finality is mostly determined by the type of consensus mechanism the blockchain uses. Every blockchain has its own consensus design, with different properties for block time and block finality. Most consensus mechanisms fall into three categories:

Heaviest Chain Consensus

PoW, the most famous consensus mechanism, made Bitcoin and the whole blockchain industry possible. PoW is a heaviest-chain consensus: the canonical chain is the one with the largest value of a specific metric, such as chain length, cumulative hashing power or stake, or number of signatures.

Heaviest-chain consensus has a high uptime guarantee: as long as there are operational nodes in the network, new blocks can always be produced. The drawback is that chain forks can happen. It therefore cannot provide instant block finality, because a certain number of new blocks must be added on top of a block before its chain can be confidently picked as canonical among all forks.
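
The fork-choice rule described above can be sketched in a few lines of Go. This is only an illustration: `ChainTip`, its `Weight` field, and `HeaviestTip` are hypothetical names, and the weight stands in for whichever metric a given chain uses (length, cumulative work, stake, and so on).

```go
package main

// ChainTip is a hypothetical summary of one fork: an identifier plus the
// cumulative weight (for example total work or stake) behind it.
type ChainTip struct {
	ID     string
	Weight uint64
}

// HeaviestTip returns the tip with the largest cumulative weight.
// Ties keep the earliest candidate; ok is false for an empty input.
func HeaviestTip(tips []ChainTip) (best ChainTip, ok bool) {
	for i, t := range tips {
		if i == 0 || t.Weight > best.Weight {
			best = t
		}
	}
	return best, len(tips) > 0
}
```

Because a competing fork can overtake the current best tip whenever it gains weight, the result of such a rule is only probabilistically stable, which is why finality here is not instant.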

BFT-based Consensus

Byzantine fault tolerance (BFT) algorithms are another approach to reaching consensus in a blockchain. In BFT-based consensus, a predetermined set of validators signs each new block, and the block is finalized immediately once more than 2/3 of the validators have signed. Forks are not possible under this rule, because two conflicting blocks cannot both collect more than 2/3 of the signatures unless more than 1/3 of the validators sign both.

BFT-based consensus provides instant finality for blocks and transactions, but at the cost of potential network downtime: if too few validators sign, no block can be finalized.
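
As a rough sketch of the 2/3 rule and the downtime risk it implies, the check can be written with integer arithmetic. `HasQuorum` and `MaxFaulty` are illustrative names, not part of any existing codebase, and this version counts validators equally rather than by voting power.

```go
package main

// HasQuorum reports whether signed out of total validators is strictly more
// than 2/3. Integer arithmetic avoids floating point rounding.
func HasQuorum(signed, total int) bool {
	if total <= 0 || signed < 0 {
		return false
	}
	return 3*signed > 2*total
}

// MaxFaulty returns the largest f such that total >= 3f+1, the usual bound
// on faulty validators that a BFT protocol tolerates.
func MaxFaulty(total int) int {
	if total < 1 {
		return 0
	}
	return (total - 1) / 3
}
```

With 100 validators, `HasQuorum(67, 100)` holds, but if 34 validators are offline the remaining 66 cannot finalize a block, which is the downtime risk mentioned above.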

Hybrid Consensus

Hybrid consensus is heaviest-chain consensus equipped with a BFT-based finality gadget. It can therefore provide faster finality without sacrificing uptime.

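|               | Heaviest-Chain Consensus | BFT Consensus         | Hybrid Consensus               |
|---------------|--------------------------|-----------------------|--------------------------------|
| Forks         | forks expected           | forks not expected    | forks expected within N blocks |
| Finality Type | Probabilistic            | Absolute every block  | Absolute every N blocks        |
| Chains        | Bitcoin, Ethereum        | Cosmos, Harmony, Celo | Eth2.0, Near, Polkadot         |

| Chains   | Bitcoin, Ethereum | Cosmos | Harmony | Celo | Eth2.0   | Near | Polkadot  |
|----------|-------------------|--------|---------|------|----------|------|-----------|
| Finality | minutes+          | 8s     | 2s      | 5s   | minutes+ | 3s+  | up to 60s |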

How 2s Finality was Achieved

Harmony’s consensus, FBFT (Fast Byzantine Fault Tolerance), belongs to the BFT-based category. FBFT combines BLS signature aggregation with synchronous view change to achieve a partially synchronous protocol with O(n) message load and a 2s block time. In FBFT, every block requires signatures from validators holding more than 2/3 of the voting power in the shard to guarantee security.
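
The voting-power variant of the quorum rule can be sketched as follows. The bitmap layout (validator i maps to bit i%8 of byte i/8) and the names `SignedPower` and `PowerQuorum` are assumptions made for this example, not a description of Harmony's actual data structures.

```go
package main

// SignedPower sums the voting power of validators whose bit is set in bitmap.
// Assumed layout: validator i maps to bit i%8 of byte i/8.
func SignedPower(power []uint64, bitmap []byte) uint64 {
	var sum uint64
	for i, p := range power {
		byteIdx := i / 8
		if byteIdx < len(bitmap) && bitmap[byteIdx]&(1<<uint(i%8)) != 0 {
			sum += p
		}
	}
	return sum
}

// PowerQuorum reports whether the signers hold strictly more than 2/3 of the
// total voting power. It assumes the total is small enough that 3*total does
// not overflow uint64.
func PowerQuorum(power []uint64, bitmap []byte) bool {
	var total uint64
	for _, p := range power {
		total += p
	}
	return 3*SignedPower(power, bitmap) > 2*total
}
```

For example, with powers 40, 30, 20, 10 and only the first two validators signing, the signers hold 70 of 100, which clears the threshold.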

When Harmony mainnet launched in June 2019, block finality was 8 seconds. After various optimizations and protocol upgrades, we brought it down to 2 seconds. The improvements can be summarized as:

Consensus Code Efficiency:

  • BLS private key, public key, signature data structure refactoring.
  • Avoid redundant serialization and deserialization of BLS key and signature data.
  • BLS multisig bitmap data structure refactoring to avoid redundant looping.
  • Remove redundant block data verification before block commit.
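
As one illustration of avoiding bit-by-bit looping over a multisig bitmap, signers can be counted a byte at a time with the standard library's population count. This is a sketch of the general idea only; `CountSigners` is a hypothetical helper and not the refactoring that was actually made.

```go
package main

import "math/bits"

// CountSigners returns the number of set bits in a signer bitmap, using one
// population count per byte instead of testing each bit in a loop.
func CountSigners(bitmap []byte) int {
	n := 0
	for _, b := range bitmap {
		n += bits.OnesCount8(b)
	}
	return n
}
```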

Consensus Design Upgrade:

  • Signature aggregation at the validator level to reduce the number of network messages and signature checks.
  • Consensus pipelining: the leader uses what was previously waiting time to start the next block proposal as soon as it has collected 2/3 of the signatures in the commit phase.

P2P Network Messaging:

  • P2P message processing refactoring, which makes sanity checks and filtering of invalid messages easier.
  • Remove redundant signature checks on P2P messages from validators and check only the block signatures instead.

System Parameter Tuning:

  • Goroutine parameter fine-tuning, with GOMAXPROCS set to the number of CPUs to avoid scheduler inefficiency.
  • Add LevelDB caching to improve disk access performance.
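
The GOMAXPROCS tuning above corresponds to a single call into the Go runtime. The wrapper name `PinProcsToCPUs` is made up for this sketch. Since Go 1.5 the default already equals the number of logical CPUs, so an explicit call mainly matters when the `GOMAXPROCS` environment variable or earlier code has changed the value.

```go
package main

import "runtime"

// PinProcsToCPUs sets GOMAXPROCS to the number of logical CPUs and returns
// the previous setting together with the new one.
func PinProcsToCPUs() (prev, now int) {
	n := runtime.NumCPU()
	prev = runtime.GOMAXPROCS(n) // returns the setting that was in effect before
	return prev, n
}
```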

The Road to 1s Finality

Although 2-second finality is already a great achievement and makes Harmony one of the fastest blockchains in the industry, there is still room for improvement, and 1-second finality is the ultimate goal. We can work in the following directions to bring block finality down further:

Code & System Level:

  • Replace the herumi BLS implementation with the faster blst implementation (currently blocked by a BLS base point compatibility issue).
    Our consensus is currently CPU-bound because BFT consensus requires many signatures to be signed, verified, and aggregated. This puts heavy CPU load on the validators, especially the leader. Replacing the BLS implementation with a faster one could potentially reduce finality by a few hundred milliseconds.
  • Use multiple CPU cores in the leader’s signature verification process.
    The leader currently uses a mutex to guard all consensus message processing logic at a coarse level to avoid potential issues. We could make the locking more fine-grained so that some logic can run in parallel.
  • More profiling of code bottlenecks and further optimization of code efficiency.
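
One possible shape for multi-core signature verification is a small worker pool, sketched below. `VerifyFunc` is a hypothetical stand-in for a real BLS check, and `SignedMsg` and `VerifyParallel` are illustrative names; the sketch assumes verification of one message is independent of the others, so no lock is needed around the verification itself.

```go
package main

import (
	"runtime"
	"sync"
)

// VerifyFunc is a hypothetical stand-in for a BLS signature check.
type VerifyFunc func(msg, sig []byte) bool

// SignedMsg pairs a message with the signature to check.
type SignedMsg struct {
	Msg, Sig []byte
}

// VerifyParallel checks every item using up to GOMAXPROCS workers and
// returns one result per input, in input order.
func VerifyParallel(items []SignedMsg, verify VerifyFunc) []bool {
	results := make([]bool, len(items))
	workers := runtime.GOMAXPROCS(0) // 0 queries the setting without changing it
	if workers > len(items) {
		workers = len(items)
	}
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so the
				// writes to results never overlap.
				results[i] = verify(items[i].Msg, items[i].Sig)
			}
		}()
	}
	for i := range items {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}
```

Only the shared consensus state would still need a mutex in such a design; the CPU-heavy checks run outside it.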

Protocol Level:

  • Have a hard deadline of 1s for consensus and ignore all late signatures.
  • Broadcast only the block header and txn-ids instead of the block content, and let validators reconstruct the block content themselves (to reduce network load).
  • Redesign the staking reward and crosslink logic so that it runs once per X blocks instead of every block.
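
The hard-deadline idea from the first bullet could look roughly like this. `Vote` and `CollectUntil` are hypothetical names, the channel stands in for incoming commit signatures, and the sketch simply stops reading once the timer fires, so late signatures are never counted.

```go
package main

import "time"

// Vote is a hypothetical commit signature carrying the signer's voting power.
type Vote struct {
	Signer string
	Power  uint64
}

// CollectUntil gathers votes until the deadline elapses or the channel is
// closed. reached reports whether the accepted votes hold strictly more than
// 2/3 of totalPower. Votes arriving after the deadline are never read.
func CollectUntil(votes <-chan Vote, totalPower uint64, deadline time.Duration) (accepted []Vote, reached bool) {
	timer := time.NewTimer(deadline)
	defer timer.Stop()
	var signed uint64
	for {
		select {
		case v, ok := <-votes:
			if !ok {
				return accepted, 3*signed > 2*totalPower
			}
			accepted = append(accepted, v)
			signed += v.Power
		case <-timer.C:
			return accepted, 3*signed > 2*totalPower
		}
	}
}
```

A caller would pass `time.Second` as the deadline for the 1s target; whether a round that misses quorum triggers a view change is outside the scope of this sketch.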

Network Level:

  • The whole network is currently meshed together without taking the sharding structure into account. Messages for a shard are broadcast to the whole network and filtered with shard-specific topics, which is not optimal for network routing and usage. We could design a better connectivity structure in which nodes within the same shard are more closely connected, making message broadcasting within a shard much faster.
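
One simple way to bias connectivity toward the local shard is to rank candidate peers so that same-shard peers are dialed first. The sketch below is an assumption about how that could be expressed; `Peer`, `ShardID`, and `PreferSameShard` are hypothetical and do not refer to the existing networking code.

```go
package main

import "sort"

// Peer is a hypothetical candidate connection tagged with its shard.
type Peer struct {
	ID      string
	ShardID uint32
}

// PreferSameShard returns up to limit peers with peers from the given shard
// first, preserving the original order within each group. The input slice is
// not modified.
func PreferSameShard(peers []Peer, shard uint32, limit int) []Peer {
	out := make([]Peer, len(peers))
	copy(out, peers)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].ShardID == shard && out[j].ShardID != shard
	})
	if limit < 0 {
		limit = 0
	}
	if limit < len(out) {
		out = out[:limit]
	}
	return out
}
```

Keeping a few cross-shard peers in the tail of the list preserves connectivity for cross-shard messages while intra-shard broadcasts travel fewer hops.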

Infrastructure Upgrade

Once we achieve 1s finality in consensus, the infrastructure will be under greater stress, which we should also prepare for. Specifically, we should prepare for the following so that 1s finality can launch smoothly without service interruption:

Faster Block and State Synchronization

With 1s finality, the rate of block production gets closer to the block synchronization speed. A new node will take longer to synchronize the whole blockchain because new blocks are constantly being produced at a fast pace. We should make sure our client can synchronize blocks much faster than one block per second. Another feature that addresses this problem is fast state synchronization, where a new node quickly synchronizes to the latest state without having to process every block in history.
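
The effect of block time on catch-up can be estimated with a back-of-the-envelope model: a node that is `backlog` blocks behind and syncs at `syncRate` blocks per second only gains `syncRate - produceRate` blocks per second on the tip. The function below is a simplified sketch under that assumption (constant rates, no network stalls); the name `CatchUpSeconds` is illustrative.

```go
package main

import "math"

// CatchUpSeconds estimates how long a node that is backlog blocks behind
// needs to reach the chain tip, given constant sync and production rates in
// blocks per second. It returns +Inf if the node cannot outpace production.
func CatchUpSeconds(backlog, syncRate, produceRate float64) float64 {
	if backlog <= 0 {
		return 0
	}
	if syncRate <= produceRate {
		return math.Inf(1)
	}
	return backlog / (syncRate - produceRate)
}
```

In this model, moving from 2s to 1s blocks doubles `produceRate`, so a client whose sync rate is close to one block per second would never catch up.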

Robust and Efficient Explorer Node Infrastructure

With a faster block time of 1 second, the stress on our explorer node will be greater, since it must both keep up with block synchronization and serve data to users without significant latency. We will need a better design for the explorer and RPC endpoint service stack.

@rlan35 rlan35 added the enhancement New feature or request label May 21, 2021
@rlan35 rlan35 changed the title [Feature] Design Change and Optimizations for 1-Second Finality May 21, 2021
@rlan35 rlan35 added the design Design and architectural plans/issues label May 21, 2021
@ONECasey ONECasey self-assigned this Jun 27, 2023