Continuing the discussion from EOS Governance Draft Proposal Part 2:
@bytemaster has a couple of topics in the EOS category regarding a new governance mechanism (Eden) he proposes (to start with) for the EOS community. That discussion led to how the top-level representatives elected via that proposed on-chain governance process could then work together (as a greater than two-thirds supermajority multisig) to approve changes specific to their host blockchain (e.g. the EOS Public Blockchain) if the existing host blockchain governance decided to follow that path.
@faddat brought up some points that I thought were interesting to address but that veered away from the topic of the Eden governance proposal and into the mechanisms in blockchains (specifically regarding securely maintaining consensus) that potentially have applications to instantiations of EOSIO blockchains and could benefit from enhancements to the EOSIO protocol itself.
So I started this thread in the “EOSIO General” category to respond with some of my thoughts on the points raised and to potentially seed discussion along similar lines both as replies to this topic and as additional topics within this category. I considered adding it to the “EOSIO Developers” channel since some of this may get a bit technical, but ultimately decided against it because I got the sense that the “EOSIO Developers” category is more about code built on top of the EOSIO protocol as it currently exists and possibly even the layer of system contracts typically deployed on EOSIO instantiations (e.g. on the EOS Public Blockchain). If the mods/admins think this is in the wrong category, please feel free to move it.
Connection to other Governance Discussions
First, I would like to give some thoughts on how I see this discussion about blockchain consensus governance being separate from the other governance proposals being discussed such as Eden. But also how the two could complement each other.
The way I see it, Eden is a framework to allow people to come together to form a community to manage members of that community (protect their community from attackers) and to elect leaders (at multiple levels with different degrees of authority, privileges, and responsibilities) to represent their interests and make decisions on their behalf. While it benefits from its mechanisms being implemented on a blockchain (especially when it comes to the mechanisms of bonded tokens and penalties for misconduct), it doesn’t really have to do with blockchains generally. One can imagine building such a system in a very old-fashioned way with in-person meetings, paper and pen, items of value for the bonds and penalties (cash, precious metals, etc.), and some degree of trust (unfortunately more than would be required if the mechanisms were implemented on a blockchain).
For blockchain systems to function they need a proper mechanism of consensus. Here I am using “consensus mechanism” more broadly. I don’t refer solely to the actual protocol as viewed from the narrow lens of computer science (CS). There we could just talk about different BFT protocols (which granted is an interesting discussion to have as it goes into things like message bandwidth overhead, latency to finality, etc., and it is something I would like to eventually touch on partly in this thread but also hopefully others in the future). But after considering all the CS aspects, at the end one can basically reduce it all to the following three items:
- The protocol must determine who is allowed to propose specific additions to the ledger (typically in the form of blocks) which are to be evaluated by the network and potentially be considered finalized into the “canonical chain”.
- The protocol must determine the mechanism through which messages contributed by various (hopefully independent) parties can be aggregated together to act as a signal that a particular proposed addition to the ledger (i.e. a block) is considered “final” or “irreversible”. (Note that in some systems, like typical Proof-of-Work, there is no objectively agreed upon standard within the protocol for deciding that a block is final and will never be reverted. Each person is expected to make that judgement for themselves for the transactions they care about, usually following some common rule of thumb almost everyone follows. For example, for high value payments people are recommended to wait for 6 block confirmations on Bitcoin.)
- The protocol must have a way to change parameters related to the consensus process over time. (For example, in protocols with algorithmic finality, where the protocol objectively defines the process that allows a block to transition into a state where it is irreversible, there is some defined set of validators from which some threshold, e.g. more than two-thirds, must agree for the process to reach a point where a proposed block is considered final. But it is not desirable for this set of validators to remain static. So the protocol needs to provide a mechanism to change this set, but to do so in a way that is sound and does not compromise the integrity of the consensus mechanism during the transition.)
In the case of Bitcoin, item 1 is addressed by allowing anyone to propose a block, but a block will only be seriously considered if there is enough hashing work done on it (i.e. there is enough value permanently expended on the block in the form of running expensive computations which unfortunately consume large amounts of electric power). Item 2 is addressed by never actually considering any block final, while providing a practical solution to make the Bitcoin consensus useful in two ways: choosing as the canonical chain the one with the most work done, and suggesting a rule of thumb (regarding the number of block confirmations to wait for) for subjectively deciding when a block can be considered close enough to irreversible. Item 3 is addressed by having very few parameters related to the consensus process that can even change, and by allowing those parameters to change automatically according to hard-coded rules that depend on a very limited set of inputs (basically the timestamps of blocks and the work done in blocks automatically drive difficulty adjustments, and that is more or less it for Bitcoin).
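To make the Bitcoin answer to item 2 a bit more concrete, here is a minimal Python sketch of "most work wins" chain selection plus the confirmation rule of thumb. It is only an illustration: the `Block` type, the integer `work` units, and all function names are hypothetical and not taken from any Bitcoin implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    height: int
    work: int  # hypothetical units of hashing work spent on this block


def total_work(chain):
    """Sum of the work of every block in a candidate chain."""
    return sum(block.work for block in chain)


def pick_canonical(chains):
    """Choose the chain with the most accumulated work (not the most blocks)."""
    return max(chains, key=total_work)


def confirmations(chain, height):
    """How many blocks sit at or above `height` in this chain."""
    tip = chain[-1].height
    return tip - height + 1 if height <= tip else 0


def looks_settled(chain, height, required=6):
    """Subjective rule of thumb: wait for `required` confirmations."""
    return confirmations(chain, height) >= required


# Three heavy blocks beat four light ones.
heavy = [Block(h, 10) for h in (1, 2, 3)]
light = [Block(h, 5) for h in (1, 2, 3, 4)]
best = pick_canonical([light, heavy])
print(best is heavy, confirmations(best, 1), looks_settled(best, 1))
```

Note that nothing here ever marks a block as final. `looks_settled` is a judgement call made by whoever is watching the chain, which is exactly the point made above.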
In the case of the EOSIO protocol, item 1 is addressed by defining a limited set of entities (accounts registered with block signing keys) who are allowed to propose blocks. These entities (also called the active block producers) are arranged in a schedule which limits when (which time slot) their proposed block will be considered by the rest of the network. There is an expectation of coordination among these block proposers to ensure the network is operating smoothly, unlike the highly competitive nature of block production in systems like Bitcoin. Operating in a cooperative environment provides EOSIO blockchains the benefit of very fast block times (i.e. low latency to get the first confirmation of a transaction). (Note that despite popular belief, I do not believe this cooperative nature of block proposing in EOSIO gives it much of an advantage in average throughput. Though it would have to be put to the test with actual benchmarks, I believe a Proof-of-Work version of the EOSIO protocol could sustain close to the same average throughput that it currently can sustain, though its desirable latency properties would suffer.)
Furthermore, cooperation means that after a very low number of confirmations of a transaction (basically just two), there is a high likelihood that the transaction will remain in the canonical chain and will eventually be finalized. For those who want certainty however, they can wait until the block containing their transaction is considered finalized by the protocol which occurs after a well-defined process completes since the EOSIO protocol has algorithmic finality (this partially addresses item 2). I won’t go into detail regarding that consensus protocol here, but to better address item 2 I will mention that it involves getting two rounds of signed confirmations from a more than two-thirds subset of the set of chosen current validators on the network. The EOSIO protocol decides to make that set of validators exactly the same as the set of block producers; they are simply referred to as the active block producers either way but I will try to be careful in distinguishing them as block proposers versus validators depending on the context to avoid assuming the special case simplification that the EOSIO protocol chose. Because the block proposers and the validators are the same, the EOSIO protocol allows a block proposer’s signature (which is helpful in convincing full nodes to spend resources processing the block since they trust the block proposer to not waste their time with invalid and long-running computation) to also act as a validator signature which helps advance the consensus process that ultimately leads to finality. An additional field in the block header is used to provide the extra data needed to correctly handle the validation aspect of consensus. The data in the block header in addition to the signature act as the messages from validators that are aggregated over time (over the course of the chain’s advancement) to eventually signal that some prior block in the chain is now considered final. 
EOSIO constrains the rate at which these messages can be aggregated (only via the message + signature provided by a block proposer / validator when it is their turn to propose a block according to the schedule) in order to avoid too much overhead in communication and especially storage of many different signatures per block that scales with the number of validators. The downside of this constraint, however, is that the latency to finality scales with the number of validators. On the EOS Public Blockchain with 21 validators this approximately corresponds to a 3 minute latency to finality assuming nearly all block producers are actively participating in the live process. If the system contracts on EOS were updated to bump up the number of active block producers to 63 (which is allowed by the EOSIO protocol by the way), it would raise the latency to finality to approximately 9 minutes. Other consensus protocols, e.g. Tendermint, do not arbitrarily constrain the rate at which consensus messages can be generated and aggregated which enables the latency to finality to be low with weak sensitivity to the number of validators.
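As a back-of-the-envelope check on those numbers, here is a small Python sketch. It assumes half-second blocks and 12 consecutive blocks per producer turn (neither value is stated in this post, so treat both as assumptions), and it models finality as two rounds that each need a more-than-two-thirds set of producers to take their turn. The function names are made up for illustration, and the real protocol has details this ignores.

```python
def supermajority(n):
    """Smallest count that is strictly more than two-thirds of n."""
    return (2 * n) // 3 + 1


def approx_finality_seconds(n, block_interval=0.5, blocks_per_turn=12, rounds=2):
    """Rough latency to finality when each validator only 'speaks' on its turn.

    Assumed model: every round needs supermajority(n) producer turns, and each
    turn lasts blocks_per_turn * block_interval seconds.
    """
    turn_seconds = blocks_per_turn * block_interval
    return rounds * supermajority(n) * turn_seconds


for n in (21, 63):
    seconds = approx_finality_seconds(n)
    print(n, supermajority(n), seconds, round(seconds / 60, 1))
```

Under these assumptions 21 producers give 180 seconds (3 minutes) and 63 producers give 516 seconds (a bit under 9 minutes), which lines up with the rough figures above and shows the linear growth with the number of validators.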
Regarding item 3, that is where the EOSIO protocol really shines. It is important to remember that what people think of as a typical EOSIO deployment (e.g. the EOS Public Blockchain) involves a coordination between the base layer (the raw EOSIO protocol itself) and the system contract layer built on top of the base layer. On the EOS Public Blockchain, most of what people think of as its consensus protocol is actually implemented at the system contract layer. The limit of 21 active block producers, the payment / incentivization model for the block producers, the stake weighted approval voting that determines the block producer candidate rankings, and even the very concept of stake (or tokens) are all implemented at that system contract layer. People say that EOSIO uses Delegated Proof of Stake (DPoS). This is not accurate. The EOSIO protocol does not even define the concept of tokens, much less stake (i.e. token) weighted voting. What the EOSIO protocol exposes is a privileged API that can be used by special system contracts deployed on the chain to choose the new block producer schedule (and in doing so that also defines the validator set). The rest is implemented in system contract code. But this also means that it can be changed at any time without a hard fork of the protocol. In fact, the mechanism of how to decide when to change these critical system smart contracts can also be coded and governed by the very same smart contracts. Right now the EOS Public Blockchain decides to allow the active block producers (i.e. the top 21 block producer candidates ranked by approval voting with staked EOS tokens) to form a multisig in which with greater than two-thirds agreement they can change any of the contracts.
Alternative systems could be devised including picking some other trusted group to act as the multisig that are independent of the block producers, or just forcing all the contracts to be immutable so nobody could change them without a hard fork (though I do not recommend that). These mechanisms are a critical part of what I describe as blockchain consensus governance.
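To illustrate how little of this needs to live in the base protocol, here is a toy Python sketch of the two system-contract-level rules mentioned above: ranking candidates by stake-weighted approval voting, and checking a greater-than-two-thirds multisig. This is not the actual system contract logic (which is more involved); the data shapes and function names are hypothetical.

```python
def rank_candidates(votes, top_n=21):
    """Stake-weighted approval voting.

    `votes` is a list of (stake, approved_candidates) pairs. Each voter adds
    their full stake to every candidate they approve. Ties are broken by name
    so the result is deterministic.
    """
    tally = {}
    for stake, approved in votes:
        for candidate in approved:
            tally[candidate] = tally.get(candidate, 0) + stake
    ranked = sorted(tally, key=lambda c: (-tally[c], c))
    return ranked[:top_n]


def multisig_approves(signers, active):
    """True when strictly more than two-thirds of `active` have signed."""
    valid = set(signers) & set(active)
    return 3 * len(valid) > 2 * len(active)


votes = [
    (100, {"alice", "bob"}),
    (50, {"bob", "carol"}),
    (10, {"carol"}),
]
schedule = rank_candidates(votes, top_n=2)
print(schedule)

active = [f"bp{i}" for i in range(21)]
print(multisig_approves(active[:14], active), multisig_approves(active[:15], active))
```

Swapping in a different trusted group only means passing a different `active` list to the threshold check, which is the kind of change that can be made in contract code rather than through a hard fork.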
So this connects to the Eden governance discussions in particular because the EOS Public Blockchain could, due to the flexibility of the EOSIO platform it is implemented on, transition to giving the multisig powers over the blockchain consensus governance to the top-level elected representatives of the communities represented within the Eden system if desired (i.e. if the EOS community considered that preferable). And they could do this without even requiring changes to the EOSIO protocol. Now this thread is not about arguing for the advantages or disadvantages of doing that for the EOS Public Blockchain; that can be left for discussion threads within the “EOS” category. This thread is primarily about exposing what can be considered in scope of what I call “blockchain consensus governance”, technically how that could leverage something like Eden for blockchain networks that decide that would be beneficial to them, as well as other exploration of the aspects of blockchain consensus and its governance as it relates to EOSIO and how that could be improved over time, perhaps with the help of upgrades to the EOSIO protocol.
Exploration and Discussion
So all of the above was basically background introducing what this topic was really about and providing a short primer on the EOSIO protocol as it relates to consensus so that the rest of this discussion within this thread could hopefully be more meaningful to the reader.
In particular, I would like to respond with my personal opinions (not representative of anyone else) to some comments left in the post by @faddat I referenced at the top of this post.
I think Tendermint is alright and I referenced some of its advantages in my background discussion above. There are many different BFT consensus algorithms out there that allow finality to be reached without arbitrary constraints on the rate messages can be passed back and forth (just the constraints of speed of light and information processing networks). Personally, I found the HotStuff protocol (PDF warning) pretty interesting, particularly the Chained HotStuff variant of it. One of the co-authors has blogged about it comparing it to other BFT consensus algorithms including Tendermint.
One of the properties I think is really nice to have in a BFT consensus algorithm is linearity, which HotStuff has (and Tendermint as well to some degree), as opposed to some other algorithms (e.g. PBFT and SBFT which are also compared in that blog post). Linearity means that the communication overhead scales linearly (as opposed to super-linearly, e.g. quadratically) with the number of validators. This becomes really important when you want to have a large number of validators (e.g. much greater than 100). HotStuff achieves this partially due to its use of proper threshold signatures (e.g. BLS signatures would work) that can be aggregated as needed to end up with just one final signature that must be relayed back to the other validators from the current leader (and, by the way, could be stored in the block headers so that they are made available to other full nodes that need to see the final proof of finality for themselves).
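A very simplified counting model shows why this matters as the validator set grows. The sketch below compares one voting phase where every validator broadcasts to every other validator against one where validators send their vote to a leader, who sends back a single aggregated signature. This is my own toy accounting, not the exact message complexity analysis of PBFT, Tendermint, or HotStuff, and the function names are hypothetical.

```python
def all_to_all_messages(n):
    """One phase where each of n validators sends its vote to every other one."""
    return n * (n - 1)


def leader_aggregated_messages(n):
    """One phase where n - 1 validators send votes to a leader, and the leader
    sends one aggregated signature back to each of them."""
    return 2 * (n - 1)


for n in (4, 21, 100, 1000):
    print(n, all_to_all_messages(n), leader_aggregated_messages(n))
```

At 100 validators that is 9900 messages per phase versus 198, and the gap keeps widening quadratically, which is why aggregatable signatures become so attractive for large validator sets.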
It also has better responsiveness to view-changes (useful if you have unreliable leaders as you were indicating) than Tendermint at the cost of having an extra round of communication in the normal case. However, with Chained HotStuff, it takes advantage of pipelining the process so that this extra round of communication is amortized. So I think it provides considerable advantages and the main disadvantage it adds is that the latency to finality could be slightly longer in the normal healthy case compared to Tendermint, but I think that would already be done fast enough with both consensus algorithms that the difference would not be significant for most application use cases.
Personally, I think it would be really cool if the EOSIO protocol further generalized its consensus process by actually allowing general computation by privileged smart contracts for the block header validation and state transition. This change (which to be fair is complicated), in addition to support for BLS signatures (which for practical purposes requires exposing a few host functions for an appropriate pairing-friendly elliptic curve which also opens up other opportunities with zero-knowledge proofs) and a few other small additions would I believe enable EOSIO blockchains to host a wide variety of consensus protocols as its mechanism of choosing the best current blockchain and advancing the last irreversible block, and all without changing the underlying protocol each time a new consensus algorithm was selected. It would still require custom plugins (outside of WebAssembly code) specific to each consensus algorithm to be hosted in the nodeos instances, but this would only need to be run by the block proposers and validators. The remaining full nodes in the network would not need to bother and in theory the same nodeos executable could follow along a particular blockchain even as it swapped consensus protocols live.
One of the first protocols I would then want to try with such a modularized consensus system would be a variant of Chained HotStuff to enable the EOSIO blockchain to have BFT consensus with lots of validators while still maintaining a low latency to finality.
I agree that there shouldn’t be slashing for downtime. I think there would already be sufficient pressures through loss of income (perhaps permanently by having the block proposer or validator removed for bad performance) to keep them motivated to do a good job on the basic infrastructural matters.
This ties back to the social governance processes. There are ultimately humans evaluating their performance and deciding whether to continue rewarding them or to essentially fire them. I think that instead of having members / stakeholders (whoever it is at the bottom level) vote directly on block producer performance, it makes more sense for representatives they elect (e.g. through the hierarchical governance system within the Eden system) to have that responsibility.
I’m not sure about slashing for other types of bad behavior. There are certainly threats of punishment (loss of future income if nothing else) that should hopefully motivate them to not do bad things like signing blocks that are incompatible with one another from the perspective of the consensus protocol (which, if a sufficient number of validators do it, could lead to a violation of finality in which different nodes think that two different incompatible blocks are final). These finality violations are a much bigger concern than temporary denial of service due to some block proposers or validators being down. It could lead to someone believing they received payment for some good / service / trade that they cannot undo, only to later find out that the payment never really happened on the “real” canonical chain.
So I think it may make sense to force the validators to post bonds, which they can only withdraw by stepping down and waiting a sufficient amount of time for the system to determine there are no proofs of bad behavior submitted against them. If a proof of bad behavior is provided in time, then the bond can be seized and slashed (some percentage can be given as a reward to the person providing the proof but most of it should be burned to prevent manipulation by the validator). However, I think it only clearly makes sense for cryptographic proofs of a very specific bad behavior. I’m not sure if I like contract-driven slashing mechanisms for block proposers or validators on things that are subjective. For example, if people suspect a block proposer is selectively censoring transactions, should that be grounds for slashing bonds? Who gets to decide? Well I suppose it could likely involve representatives elected from the Eden governance system. But I think it probably is better for those representatives to just decide to fire the block proposer on suspected censorship and let them leave with their full bond rather than also seizing the bond on something so subjective and difficult to prove.
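Here is a rough Python sketch of the bond mechanics I have in mind, just to make the moving parts explicit. Everything in it is hypothetical: the `SignedVote` shape, the "same height, different block" definition of incompatible votes, the 10 percent reporter reward, and the class and method names. Signature verification is deliberately left out; a real contract would have to verify both votes cryptographically before accepting a proof.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignedVote:
    validator: str
    height: int
    block_id: str
    # Assumption: a real vote would carry a signature that gets verified.


def is_equivocation(a, b):
    """Two votes by the same validator for different blocks at one height."""
    return (a.validator == b.validator
            and a.height == b.height
            and a.block_id != b.block_id)


class BondRegistry:
    def __init__(self, unbonding_delay, reporter_percent=10):
        self.unbonding_delay = unbonding_delay
        self.reporter_percent = reporter_percent
        self.bonds = {}
        self.exit_requested_at = {}
        self.burned = 0

    def post_bond(self, validator, amount):
        self.bonds[validator] = self.bonds.get(validator, 0) + amount

    def request_exit(self, validator, now):
        """Stepping down starts the waiting period before withdrawal."""
        self.exit_requested_at[validator] = now

    def withdraw(self, validator, now):
        start = self.exit_requested_at.get(validator)
        if start is None or now - start < self.unbonding_delay:
            raise ValueError("bond is still locked")
        del self.exit_requested_at[validator]
        return self.bonds.pop(validator, 0)

    def slash(self, vote_a, vote_b):
        """Seize the bond on a valid proof; reward the reporter, burn the rest."""
        if not is_equivocation(vote_a, vote_b):
            raise ValueError("not a valid proof of bad behavior")
        amount = self.bonds.pop(vote_a.validator, 0)
        self.exit_requested_at.pop(vote_a.validator, None)
        reward = amount * self.reporter_percent // 100
        self.burned += amount - reward
        return reward


registry = BondRegistry(unbonding_delay=100)
registry.post_bond("v1", 1000)
registry.request_exit("v1", now=0)
proof = (SignedVote("v1", 7, "aaa"), SignedVote("v1", 7, "bbb"))
print(registry.slash(*proof), registry.burned)
```

Note that only the objective, checkable case triggers `slash`. Something subjective like suspected censorship has no proof object to pass in, so in this sketch the only option would be removing the validator and letting `withdraw` return the full bond after the delay.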
Note that all of this can be built as smart contracts on top of the EOSIO protocol. However, I think it is outside the scope of the Eden governance discussion (other than possibly using their elected representatives) because it involves governance matters specific to blockchain consensus itself: slashing bonds, evaluating block producer performance, protecting against censorship of transactions, and overall maintaining the integrity of the host blockchain that enables other smart contracts, including potentially critical governance ones like Eden, to faithfully execute. Obviously both governance systems (Eden and the governance specific to blockchain consensus) can be intimately tied together and should ultimately be considered holistically in an analysis of the overall system, but I find value in trying to keep in mind a line of separation between them. Furthermore, once we start talking about validating cryptographic proofs of bad validator behavior that contributes to the risk of finality violation, it gets into discussions of how the consensus protocol is designed to enable such proofs to be generated in the first place and ideally be short and simple to validate. I think those are interesting things to explore on the technical side as well.