PDBFT2.0 — Pchain’s Revolutionary Consensus Algorithm For Solving The Trilemma
Maintaining high-speed consensus through multiple chains and dynamic shards
In our previous article, we talked a bit about what led up to Pchain and why Pchain made the framework decisions they did. That left us with a fairly high-level overview of how Pchain operates, but under the hood is a fascinating and complex evolution of PBFT that seemingly solves what is known as the Trilemma: every distributed network is a trade-off of time to finality, protocol overhead, and decentralization [Fig.1].
What this means is you can have optimal setups for two of the three variables but not all at once. Do you want a highly decentralized network with fast finality? Your nodes will have very high hardware and bandwidth requirements. Fast finality with low node requirements? You will have to limit how many nodes are operating as validators. Concerned only for a low barrier of entry for nodes to encourage high decentralization? Your time to finality will suffer.
This has always been accepted as truth. Even the most recent big hitter BFT implementations like EOS limit the node count to 21 to ensure a lightning-fast network of half-second blocks and less than 1-second finality at the cost of high node requirements that limit the throughput. Pchain, on the other hand, has been able to just about match the speed of EOS with 45 times more throughput and 200 nodes (currently). So how is this possible?
Sybil Control and Consensus
To set the field, let’s clear up a common misconception. Proof-of-Work and Proof-of-Stake are often used as the dividing line in explaining a network’s performance. While they certainly contribute to the end result, they are not the key factor, the consensus mechanism is. PoW and PoS alone don’t drive the network to agreement, they need to be paired with a consensus protocol. And PoW and PoS are not consensus mechanisms. They are simply mechanisms that prevent a single entity from operating a large number of nodes and building a disproportionately large influence in the network. This is called a Sybil attack and any protocol that prevents this type of attack is called a Sybil Control Mechanism.
What does determine how fast and how decentralized a network can be is the consensus mechanism. In PoW networks like Bitcoin, agreement on which chain is the main chain is by the heaviest/longest valid chain rule. As long as the blocks are valid, the chain with the most blocks wins. PoS mechanisms, on-the-other-hand, require a more complex decision-making process in order to reach the same security that PoW networks achieve with computational difficulty.
The earliest PoS networks actually used very similar consensus mechanisms to achieve agreement on valid chains. These heaviest/longest/most coin age/most cumulative difficulty consensus mechanisms are always limited to how fast information can travel through the network and how fast a single node can process transactions and create a block.
It turns out that achieving consensus in a network where everyone acts properly and can be trusted is pretty easy. It’s only when we try to protect against faulty or untrustworthy nodes that things get way more complex. This isn’t really a problem for PoW longest chain rule as bad nodes just don’t contribute but if a single bad actor had control of more than 50% of the networks hash power, they could grow a competing chain faster than the main chain. It is only in deviating from this mechanism in order to gain throughput that we had to address this problem.
This scenario is known as the Byzantine Generals problem came about that gave rise to a whole class of consensus mechanisms called Byzantine Fault Tolerant (BFT) consensus.
There are many different varieties of BFT mechanisms but the general process is usually the same. When some network’s state is to be updated, a node within the network proposes a block, this block is communicated to the other nodes in the network, the nodes agree or disagree according to their own independent block generation and send that decision to the proposer and the other nodes. If that proposed block receives approval from more than 2/3 of the total nodes the block is added to the chain; if not, a new proposer is selected and the round starts over [Fig. 2]. All of the details of this process determine how fast this can happen, how many nodes can participate in this process and therefore, how many transactions can be passed through the network in time.
You can see that this protocol is communication heavy, requiring many messages to and from every node to the other so all nodes end the round with the same block, each having received signatures from each node individually. In general, when the number of nodes increases, the total number of messages required to reach consensus grows exponentially in the number of nodes (O(2^n) where nis the number of nodes). These messages take time and bandwidth, therefore limiting how fast the network can operate. Because of this, projects using BFT sacrifice decentralization, limiting the number of consensus nodes in order to achieve a fast and scalable network. NEO limits the nodes to 7, INT to 13, EOS to 21, Ripple 31 with varying levels of trust.
So do we have any hope of achieving the same speed as EOS with high decentralization? Let’s look at the BFT communication protocol again. The one process that takes the most time in coming to an agreement is the collecting of signatures on the proposed block. Because there is no way of gathering signatures on a block without passing the block around for nodes to sign on, and no way to independently judge Byzantine behavior without all nodes validating all responses, every node must send their response to every other node and check every signature received. The key to unlocking more throughput and decentralization is in reducing the number of messages between nodes to achieve consensus.
Aggregating signatures has been a bit of a hot topic recently having applications in lightweight multisig transactions, combining transactions in a block for scalability and privacy (as in CoinJoin), and even in confidential transactions (Mimblewimble). The main player in this space has been Schnorr signatures.
Up to this point, almost all cryptocurrencies have been using some form of Elliptic Curve DSA for the generation of keys and signatures. They do the job well, they are fast verifying and they are lightweight. But that’s all. They cannot combine signatures or keys and each has to be verified independently. Schnorr signatures, on the other hand, are additive in nature. This means that you can quickly add keys or signatures together and the result is a valid key or signature. In Bitcoin, this could be applied by adding up all the transaction signatures in a block to batch validate the whole block. This cuts the validation calculation down considerably. This also makes multisig transactions much more straight forward and private. Instead of these transactions being signed by each party (which then validates by checking each signature individually), a Schnorr multisig transaction creates a single signature from both keys and therefore does not look any different than a single key transaction. But this all as one significant drawback, Schnorr signature protocol requires you to interact with those you wish to aggregate your signature with to come up with a shared secret. This really limits the uses to multisig wallets and even confidential transactions but it doesn’t help our problem of trying to cut down on communication in BFT consensus.
In normal ECDSA or Schnorr signatures, we hash the message and use the result as a number in the generation of the signature. By doing a bit more work generating the signature, we can hash the message to the elliptic curve which results in a point on the curve. This seemingly small change in algorithm is called the Boneh–Lynn–Shacham(BLS) Signature Scheme.
If you recall from our article on Elliptic Curve Cryptography, any point on the curve can be used to generate a public-private key pair. By taking a number, n, and multiplying it by a point on the curve G, you get another point P, on the same curve.
Using this algorithm, we can use a number as our private key and the resultant point P is our public key. So if we now use the hash of the message, H(m), as our generator point G, and multiply it by our private key pk, we get another point on the curve S.
This point S has just the same security as our public-private key pairs so it can be used as a signature that cannot be created unless you possess the private key. So that’s it! There’s the signature, no extra operations, just a hash times the private key. And being that it is just a point on a curve, they are extremely lightweight at 33 bytes!
But the real magic of these signatures is in their additive nature. Because the signatures are points on the elliptic curve, you can add points together and the result is another valid point. So if we have signature 1, S1 = pk1•H(m1), and signature 2, S2 = pk2•H(m2), we can combine them into signature 3, S3 = S1+ S2 = (pk1+ pk2) •(H(m1) + H(m2)). This result is a valid signature that spends two transactions and is the same size as a single signature! And all made without needing to combine keys or interact with the other sender.
What this means for Pchain
So if we can now combine signatures without having to interact, we no longer need to pass around the block proposal to gather signatures. Using BLS signatures, the nodes can sign the proposed block hash and just send that back to the proposer. The proposer then takes advantage of the multiplicative nature of the signatures and combines them all to form a single aggregate signature on the block from all the nodes [Fig. 3]. This reduces the total communication rounds from all-to-all to one-to-all.
You can see that with each additional node, there are only 5 additional messages to be sent. This is quite an improvement to traditional PBFT where each additional node drives 2n additional messages, where n is the total number of nodes in the validation pool. So the total overhead for the network can be calculated by the number of messages required by each node multiplied by the number of nodes in the validation pool. For PBFT, this is 2n messages across nnodes, or 2n*n or 2n². This is an exponentially increasing overhead. For BLS enabled PBFT2.0, this is 5 messages across n nodes, or 5n. This is a linearly increasing overhead.
So in a BFT network of 7 nodes (NEO), the total messages required to reach consensus would be ~2(7)² or ~98 messages. Whereas PBFT2.0 reduces this to ~5n or ~35 messages. The difference gets even larger though if we push this to what is considered the extreme end of BFT capability as is seen in the EOS network. In the EOS network, 21 nodes work to reach consensus once every 500 milliseconds. In this time, 21 nodes send ~2n² or ~880 messages. This works out to a staggering ~1,800 messages per second. If we take the message overhead of this network to be the peak ability of our P2P network capability and scale our PBFT2.0 network to match it (at 1-second block frequency), we could include a staggering ~360 nodes in the consensus process. That means the addition of BLS signatures allows us to have a network operating just as fast at more than 17x more decentralized with much lower hardware requirements.
What this means for sharding
This signature process allows us to operate the network with not only high speed but also decentralization. But even at this limit of performance, EOS and others like it are limited by network latency and bandwidth, capping throughput to around 4,000 TPS. This marks the true limit of single-chain frameworks. In order to further scale, Pchain creates parallel chains, all operating in line with each other in order to multiply the throughput of the network. Like adding lanes to a highway, each child chain has the same capacity and characteristics of the main chain, with its high speed and decentralization, while being able to operate as a stand-alone blockchain.
In non-BLS signature-based networks, splitting the network like this would entail another round of prepare and commit for that shard’s block proposal on the main chain after the nodes within the shard come to a consensus. This would drive overhead to 2n_m²+S*(2n_m²+n_s)(where n_m is the number of main chain nodes, S is the number of shards and n_s is the total number of shard nodes) across the main chain nodes. You only have to run through one example to see that this would never be feasible. Even in the least decentralized BFT network, NEO, with its 7 nodes, would have it’s network overhead go from ~100 messages to ~1,800 for 10 shards. This blows up to ~24,000 for a 10 shard, 31 node network. So again, you have to choose between decentralization and scalability.
Incorporating BLS signatures simplifies reaching cross-shard/main chain to child chain consensus greatly. Shards or child chains can include all of the main chain validator nodes in the consensus of the shard chain with only a linear overhead cost increase to the network. By having the shard leader send the proposal to all the main chain nodes, the nodes can then sign and send to the main leader to aggregate the signatures and send it back to the shard nodes. This drives the total overhead to be 5n_m+S*(2n_m), a total factor of 2n_m+n_s less than non-BLS enabled sharding. To illustrate the overhead savings, 7 node, BLS sig enabled, 10 shard network would have a total network overhead of ~175 versus the ~1,800 from above. The real divergence is when we start to decentralize this framework. Increasing the number of nodes by 4 times, to 31 gets us an overhead of ~775 versus the ~24,000 above.
Pchain only does this “checkpointing” between chains as explained above after a certain number of blocks, or “epoch”, to combine the security of the shard chains and the main chain. So the added overhead of this operation is not a constant cost but allows you to operate parallel blockchains within one network. This effectively multiplies the security of the whole network by the number of shards or child chains within it.
So what this leaves us with is a network that can scale to 300+ nodes on the main chain and another 300+ different nodes on each child chain. Pchain has tested its framework up to 256 child chains with 200 nodes in each chain, pushing a remarkable 180,000 TPS. That provided a network of 51,200 individually verifying nodes, ~2,000 times more decentralized than Ethereum and 45 times faster than EOS.
Taking scaling to the next level
As impressive as the numbers above are, when you break it down, 180k TPS divided up by 256 child chains only comes out to ~700 TPS in each chain. Cross-chain transactions are not difficult but take extra time to validate so they aren’t optimal for scaling transactions of the same purpose. There may be some smart contracts, Dapps or even basic transactions that will require more transactions per second than that. Pchain takes advantage of the BLS signatures which allow the network to easily incorporate child chains to further shard within chains.
This inter-chain sharding (called transaction-level sharding), takes advantage of the high number of nodes within the child chain to split the validating group into subgroups of nodes. These subgroups split up the work of validating transactions by dividing up the transaction set between all the node subgroups. Each subgroup then validates their subset of transactions and then a leader node aggreagates all the signatures into one block for the chain. This division of labor makes it so where one chain could previously do 700 TPS with 200 nodes, this transaction-level sharding enables this node group to be split into ~15 subgroups each validating ~700 transactions per second, pushing the single-chain limit to upwards of 10,000 TPS.
Drawbacks of BLS
BLS certainly sounds like the end-all algorithm for cryptographic signatures. They create signatures 2 times shorter than Schnorr or ECDSA, they don’t grow in size when aggregating, they don’t require interaction to combine signatures or keys, block signatures can be combined into one signature greatly reducing communication rounds in BFT consensus. But I left out one part of this set up on purpose.
Verifying these signatures requires a special function that takes two elliptic curve points and maps them to a number. Called Elliptic Curve Pairing, this function enables you to make use of the additive nature of elliptic curves while not revealing numbers used, like a form of encrypted multiplication. I won’t go into the details here as it is very mathematically intense (and if you are interested in learning more, read this), but this pairing function allows us to verify the signature S, with public key P, given the hash of the message H(m) and generator point G, by checking the equality of:
e(P, H(m)) = e(G, S)
where e(P,Q) is the pairing function of P and Q.
It turns out that generating these pairing functions (which you have to do in order to verify each signature) is quite computationally difficult. In fact, verifying a BLS signature is about 1,000 times more difficult than verifying a Schnorr or ECDSA signature. While not a protocol killer, it limits the realistic implementations of this signature scheme. In this application, it requires the validator nodes to have higher computational power than an ECDSA implementation.
PDBFT2.0 — Pchain’s Revolutionary Consensus Algorithm For Solving The Trilemma was originally published in Coinmonks on Medium, where people are continuing the conversation by highlighting and responding to this story.