RFC-0140/SyncAndSeeding
Synchronizing the Blockchain: Archival and Pruned Modes
Maintainer(s): S W van Heerden
Licence
Copyright 2022 The Tari Development Community
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
- Redistributions of this document must retain the above copyright notice, this list of conditions and the following disclaimer.
- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
- Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
THIS DOCUMENT IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS", AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Language
The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY" and "OPTIONAL" in this document are to be interpreted as described in BCP 14 (covering RFC2119 and RFC8174) when, and only when, they appear in all capitals, as shown here.
Disclaimer
This document and its content are intended for information purposes only and may be subject to change or update without notice.
This document may include preliminary concepts that may or may not be in the process of being developed by the Tari community. The release of this document is intended solely for review and discussion by the community of the technological merits of the potential system outlined herein.
Goals
This Request for Comment (RFC) describes the syncing and pruning process.
Related Requests for Comment
Descriptions
Syncing
When a new node comes online, loses connection or encounters a chain reorganization that is longer than it can tolerate, it must enter syncing mode. This will allow it to recover its state to the newest up-to-date state. Syncing can be divided into two SynchronizationStrategys: complete sync and horizon sync. Complete sync means that the node communicates with an archive node to get the complete history of every single block from genesis block. Horizon Sync involves the node getting every block from its pruning horizon to current head, as well as every block header up to the genesis block.
To determine if the node needs to synchronise, the node will monitor the broadcasted chain_metadata
messages provided by its neighbours. The fields in the chain_metadata
messages MUST be:
Field | Description |
---|---|
height_of_longest_chain | 64-bit unsigned |
best_block | hash of the tip block (32-bit) |
pruning_horizon | 64-bit unsigned |
pruned_height | 64-bit unsigned |
accumulated_difficulty | 128-bit unsigned |
timestamp | 64-bit unsigned |
Complete Sync
Complete sync is only available from archival nodes, as these will be the only nodes that will be able to supply the complete history required to sync every block with every transaction from genesis block up onto current head.
Complete Sync Process
Once the base node has determined that it is lagging behind the network tip it will start to synchronise with the peer it determines to have all the data required to synchronise.
The syncing process MUST be done in the following steps:
- Set SynchronizationState to
header_sync
. - Sync all missing headers from the genesis block to the current chain tip. The initial header sync allows the node to confirm that the syncing peer does indeed have a fully intact chain from which to sync that adheres to this node's consensus rules and has a valid proof-of-work that is higher than any competing chains.
- Set SynchronizationState to
block_sync
. - Start downloading blocks from sync peer starting with the oldest block in our database. A fresh node will start from the genesis block.
- Download all block up to current head, validating and adding the blocks to the local chain storage as we go.
- Once all blocks have been downloaded up and including the current network tip set the SynchronizationState to
listening
.
After this process, the node will be in sync, and will be able to process blocks and transactions normally as they arrive.
Horizon Sync Process
The horizon sync process MUST be done in the following steps:
- Set SynchronizationState to
header_sync
. - Sync all missing headers from the genesis block to the current chain tip. The initial header sync allows the node to confirm that the syncing peer does indeed have a fully intact chain from which to sync that adheres to this nodes consensus rules and has a valid proof-of-work that is higher than any competing chains.
- Set SynchronizationState to
horizon_sync
. - Download all kernels from the current network tip back to this node's pruning horizon.
- Validate kernel MMR root against headers.
- Download all utxo's from the current network tip back to this node's pruning horizon.
- Validate outputs and utxo MMR.
- Validate the chain balances with the expected total emission that the final sync height.
- Once all kernels and utxos have been downloaded from the network tip back to this node's pruning horizon set
the SynchronizationState to
block_sync
. This hands over further syncing to the standard sync protocol which should return to thelistening
state if no further data has been received from peers.
After this process, the node will be in sync, and will be able to process blocks and transactions normally as they arrive.
Keeping in Sync
The node that is in the listening
state SHOULD periodically test a subset of its peers with ping messages to ensure
that they are alive. When a node sends a ping message, it MUST include the all the chain_metadata
fields. The
receiving node MUST reply with a pong message, which should also include its version of the chain_metadata
information.
When a node receives pong replies from the current ping round, or the timeout expires, the collected chain_metadata
replies will be examined to determine what the current best chain is, i.e. the chain with the most accumulated work.
If the best chain is longer than out chain data the node will set SynchronizationState to header_sync
and catch up
with the network.
Chain Forks
Chain forks occur in all decentralized proof-of-work blockchains. When the local node is in the listening
state it
will detect that it has fallen behind other nodes in the network. It will then perform a header sync and during the
header sync process will be able to detect that a chain fork has occurred. The header sync process will then determine
which chain is the correct chain with the highest accumulated work. If required this node will switch the best chain
and proceed to sync the new blocks required to catch up to the correct chain. This process is called a chain
reorganization or reorg.
Pruning
In Mimblewimble, the state can be completely verified using the current UTXO set (which contains the output commitments and range proofs), the set of excess signatures (contained in the transaction kernels) and the PoW. The full block and transaction history is not required. This allows base layer nodes to remove old spent inputs from the blockchain storage.
Pruning is only for the benefit of the local Base Node, as it reduces the local blockchain size. Pruning only happens after the block is older than the pruning horizon height. A Base Node will either run in archival mode or pruned mode. If the Base Node is running in archive mode, it MUST NOT prune.
When running in pruning mode, Base Nodes SHOULD remove all spent outputs that are older than the pruning horizon in their current stored UTXO set when a new block is received from another Base Node.
Change Log
Date | Change | Author |
---|---|---|
07 Feb 2019 | First draft | SWvheerden |
13 Jul 2021 | Update to better reflect the current implementation | philipr-za |
30 Sep 2022 | Rename title for clarity | stringhandler |
10 Nov 2022 | Minor update to reflect implementation | mrnaveira |
10 Oct 2023 | Minor update to reflect prune mode pruning interval | SWvheerden |