[IDEA] Mechanisms to address concerns regarding ease with which changes can be made on general programmable blockchains like EOSIO

arhag · April 10, 2021, 8:50pm

This post documents an idea for a few different mechanisms that could be adopted by EOSIO blockchains to address concerns some may have regarding potential abuse of the flexibility provided by the EOSIO platform to enable changes to core concepts of the blockchain that some may have believed to be immutable (or at least very difficult to change). I wrote this just to share the idea with the community and get feedback, and it in no way indicates that it is something actively being worked on or even on the roadmap for any development group working on EOSIO code (that I know of as of this time anyway).

Summary

EOSIO blockchains move a lot of the behavior people expect to be part of the “core” protocol of a particular EOSIO blockchain to the WebAssembly layer that permits easy upgradability. This provides a tremendous advantage by enabling EOSIO blockchains to be adaptable and to support a wide variety of experimentation across multiple EOSIO blockchains while each still builds on top of the same robust underlying base protocol. However, adaptability can come into conflict with predictability, which also provides benefits. In particular, some users of the blockchain would be happier to know that it is difficult to change some of the tokenomics of the blockchain they are using.

Fundamentally, there is no way to guarantee that the value of the assets one holds on a particular blockchain will continue to be preserved on a particular branch of the blockchain that follows the rules they prefer. However, it may be useful if tools were provided to guarantee to a node operator that their node will not automatically follow a branch of the blockchain that violates the rules they prefer. I propose two concrete mechanisms that can be added to EOSIO blockchains to facilitate this: one (which I call “protocol change events”) is an objective mechanism that all nodes in the network must follow and therefore requires changes to the EOSIO protocol to support; the other (which I call “subjective protocol restrictions”) is a subjective mechanism enabled through a nodeos plugin which enables any given nodeos operator to configure their own rules that they don’t wish to see violated on the blockchain that their nodes automatically follow. The idea is to use these mechanisms to provide a node operator assurance that if particular changes occur on the blockchain, their node will not automatically follow it unless they pre-agreed to the change by configuring their node appropriately (e.g. by adding some hash to a whitelist). If they did not agree beforehand to a change which does in fact occur on the blockchain, effectively it will appear as if their node is stuck. This alerts the node operator that something is wrong and they can then decide whether to agree to the change (which allows their node to continue syncing with the blockchain) or to take some other action like configuring their node to follow some other hard fork of the blockchain that follows rules they agree with.

While it is theoretically possible to construct a hard fork of an EOSIO blockchain currently, it is not easy. So I also propose new mechanisms (specifically two data-driven protocol features) which can ease the process for some part of the community to organize a hard fork of an existing EOSIO blockchain. By reducing the burden of creating a hard fork, this would make it more likely for acceptable options to exist for the node operator who is stuck because their node will not follow a branch of the blockchain that implements changes they do not agree with. In addition, with the technical barrier to create a hard fork lowered, the threat of creating a hard fork of an existing EOSIO blockchain becomes more credible to the existing powers that are given governance control of that blockchain who may then think twice before carrying out a contentious change in the first place. (The threat comes from the fact that the users easily migrating to the forked chain may cause a corresponding drop in value of the assets on the original chain.)

With the above mechanisms and tools properly in place, a node operator could theoretically configure their node to, as an example, not automatically follow the EOS Public Blockchain if the BPs on that chain increase the inflation rate to above 5% per year or issue new EOS tokens. Their node would stop and then the node operator could investigate what went wrong. If the node operator decides the trigger that stopped their node was actually acceptable (e.g. it was a system contract upgrade that was benign and did not attempt to bypass the safety mechanisms in place to ensure the inflation rate did not exceed 5%), then they could approve that specific change and allow their node to continue synchronizing with the rest of the network and continue following the blockchain (at least until the next trigger that stops the node). If they determine the trigger was actually due to behavior they consider unacceptable, e.g. hyper-inflation of the EOS token, then: first, they would know that at least they wouldn’t be tricked into accepting hyper-inflated EOS tokens without their knowledge in exchange for some other asset of value; second, they could easily configure their node to instead follow some hard fork of the blockchain where the unacceptable behavior does not occur (assuming some part of the community already went through the effort to organize such a hard fork); and third, even if such a hard fork didn’t exist, they would have the tools accessible to attempt to organize a hard fork that they find acceptable themselves and they hope to convince enough people to use it so that the tokens on that fork end up actually having some value.

Background

EOSIO blockchains are designed to be quite flexible. This includes putting features that many consider core to blockchains (e.g. the existence of the main token, its tokenomics, the governance mechanism that determines how blocks are produced) at a layer in the stack (the WebAssembly layer) that is above the base protocol layer and is easier to modify (does not require a hard fork). The fact that changing the tokenomics of the main token of the blockchain (e.g. increasing the inflation rate of EOS on the EOS Public Blockchain) does not need to be done through a hard fork may scare some people who want to have assurances on some aspects of those tokenomics (e.g. a hard upper limit on the rate of inflation).

The reality is that it is not possible to guarantee any hard limits. The rules can always be changed (despite any technical hurdles) simply by hard forking the blockchain to change the rules and then achieving social consensus to redefine which blockchain the token with the vast majority of the value refers to. Not even Bitcoin’s 21 million hard BTC cap is safe from such a change. However, because the 21 million BTC cap is part of the base Bitcoin protocol, a change to that number necessitates a cumbersome process called a “hard fork” which requires all nodes to take explicit action (e.g. upgrading their node software) for their local nodes to continue following the version (or branch) of the blockchain that changed the rules.

An EOSIO blockchain similarly has rules that are part of the base protocol that can only be changed through a hard fork (sometimes also referred to as a consensus protocol upgrade). However, for EOSIO, due to its more general architecture, there are fewer things that fall into this category. For example, increasing the length of EOSIO account names would require a hard fork. But things like changing the voting mechanism behind the DPoS (Delegated Proof of Stake) governance or changing the rate of EOS inflation would not require a hard fork.

Benefits of generality

I personally believe putting more functionality into the WebAssembly layer is a good architecture for the blockchain. There may be some costs in performance, but I think much of that can be mitigated through reasonable design of the interfaces between the WebAssembly layer and the base layer as well as simply putting engineering effort into making sure the WebAssembly engine is fast.

By keeping the base layer simple and general, the same platform that implements the technically hard parts of blockchains and consensus can support a wide variety of specific blockchain ecosystems that can experiment and explore radically different ideas. It also enables a particular blockchain to experiment over time by allowing it to change seemingly fundamental aspects of the blockchain with relative ease, which makes it more adaptable and likelier to survive changing environments.

But that greater adaptability is not strictly only a positive thing. People also want to have confidence that some aspects of the blockchains they rely on are unlikely to change. A balance needs to be struck between greater adaptability on one side and, on the other side, giving users predictability and confidence that a particular blockchain ecosystem is a stable platform that they can invest in and build on.

Providing a general architecture that enables greater adaptability is not necessarily a disadvantage in finding an appropriate compromise in this struggle between the two sides. Just because you have some tool does not mean you have to use it. On the other hand, one may argue that if one side (proponents of more frequent change in the name of progress) is given more powerful tools to support its mission compared to the other side (those who want to take a more conservative approach to give users of the platform confidence in what they depend on), then the tools are biased too heavily towards one side. And as will be discussed shortly, the side that wants to take the more conservative approach could benefit from some additional tools and blockchain mechanisms that currently do not exist in the EOSIO ecosystem. Also, these tools can be more widely beneficial and provide value to parties that are independent of the two prior stated sides of the struggle; for example, one of the mechanisms that will be discussed could enable better testing of new sophisticated features that are being considered for adoption by the blockchain.

A particular concern to address

It may help to discuss a concrete example of the kinds of struggles that may exist in finding a balance between greater adaptability and better predictability.

One example is regarding the tokenomics on the EOS Public Blockchain. It appears there is a general consensus in the EOS community that the maximum tolerated inflation rate on EOS is 5% per year. This was actually codified in the system contracts upon launch of the EOS network, which enforced a fixed 5% inflation rate (albeit with 4% of that going to the eosio.saving account which was not being used for anything). But because these rules are enforced (at a technical level) in the system contract, a 15 out of 21 multisig of the current active block producers (BPs) of the EOS blockchain can change that inflation rate. And in fact this was done. It was dropped down from 5% to 1% at some point (and also the accumulated funds in the eosio.saving account were burned) and the inflation rate has remained at 1% per year since then (at least as of April 10, 2021). But the same mechanism (the eosio::setinflation action) that was used to reduce the inflation rate could be used to bring it back up to 5% again (or to even increase it past 5%).

Why does the eosio::setinflation action even allow raising the inflation rate above 5% per year? Wouldn’t it be really easy to add a check into the action to enforce an upper bound? Yes, it would. But it is also meaningless because if the BPs were in consensus they could just update the system contract code to remove that restriction. And it is valuable for them to be able to update the system contract code in order to fix bugs or add new features. The system contract on the EOS blockchain has been updated many times throughout the history of the chain (sometimes to just fix bugs and other times to introduce new features like PowerUp) without requiring a hard fork, which is a slow process as it requires waiting on nearly all ecosystem participants to update before the change can be activated. And when it comes to security bugs the community may not have the luxury of time to carry out a system contract upgrade through a hard fork.

Reactive protections

The reality is that if there actually is community consensus on a 5% upper bound on the EOS inflation rate, it is not any hard technical means that prevents it from being violated (other than requiring a 15 out of 21 multisig of the elected BPs of the network). The protections are enforced through social means. If the BPs understand it may be taboo to increase inflation beyond some limit, they are motivated to properly gauge community sentiment on the matter (which may involve a long tedious process) before approving the multisig proposal that executes the change. Otherwise, they could risk undesired consequences as a result of supporting the change. The simplest example of such an undesired consequence is getting voted out by the token holders and losing the future income stream from the network. In a more extreme example, the community could organize a hard fork to revert the changes and punish the BPs that voted for it; this was demonstrated in 2020 by the split of Hive from the Steem blockchain.

But notice that while the protections are fundamentally social, they involve using technical means to carry them out. For example, the DPoS governance that allows voters to vote out BPs they do not like requires a lot of smart contract code that tracks staked amounts and aggregated votes to rank BPs. And there is a lot of technical sophistication required to hard fork a blockchain successfully.

Preemptive protections

Furthermore, additional tooling can benefit users impacted by the undesired changes by focusing on preemptive solutions rather than only reactive ones.

Consider that if BPs colluded to hyper-inflate a token of some blockchain, they could then sell those hyper-inflated tokens to some victim in exchange for BTC. Even if the victim discovered the hyper-inflation change a few hours later and decided to stop supporting that blockchain by selling their tokens, the damage may have already been done. The price of the token may have crashed as others discovered the hyper-inflation as well, so the victim would not be able to get much value out of the tokens they received. And the BTC they traded for those tokens would be gone for good with no way to reverse it. The best they could do is organize a hard fork of the blockchain prior to the hyper-inflation event and hope that the version of the token on that forked blockchain ends up achieving the value it once had.

Also, notice that there is considerable difficulty in organizing the hard fork of the chain. This is not just a matter of the technical difficulties but also the social difficulties in getting enough people in consensus on which block to fork from, which set of block producer candidates replace the active set of BPs (which is needed otherwise irreversibility would not advance on the new forked chain because the active BPs as of the fork point would likely not support that fork of the blockchain), and determining what other modifications may be needed to ensure the undesired event does not just repeat again on the new fork (e.g. perhaps slashing the tokens of voters who supported the hyper-inflation change).

Any tools that make this process of organizing a hard fork easier help with the reactive protections, but they may also help to preempt the undesired change to begin with. If the barrier to a hard fork is lowered, then the hard fork becomes a more credible threat to the powers that are given governance control of the existing blockchain, who may then think twice before carrying out a contentious change. The threat comes from the fact that the users easily migrating to the forked chain may cause a corresponding drop in value of the assets on the original chain. And without sufficient value on the original chain, the powers that control it don’t have much to control.

New tools and mechanisms can also help the victim of the hyper-inflated tokens described in this section. If their node, which they could use as the source of truth of the activity of the blockchain, stopped syncing with the blockchain as soon as the hyper-inflation event occurred, then they never would have confirmed receiving the hyper-inflated tokens and would have never exchanged their BTC for them. So they would be less impacted by the hyper-inflation change if their node software was sophisticated enough to detect that change immediately. In addition, if all nodes that cared about avoiding such a hyper-inflation change stopped at the same block due to the same event, then the last good irreversible block they each have (which should be the same for all of them) can likely act as a Schelling point for the block from which they will build their new fork (assuming they decide to hard fork the blockchain to resume production with the original rules intact). Of course the users of these nodes will still require a lot of communication to confirm which block they wish to fork from and to organize all the other aspects of the fork (e.g. which set of block producers are chosen to resume the blockchain), but the social process does become a little easier when the group has a straightforward way to agree on at least the block they wish to fork from.

[Due to post length limitations, the remainder of this proposal will be left in a subsequent comment.]

arhag · April 10, 2021, 8:13pm

Mechanisms and tools to enable the EOSIO community to better address the concern

I will describe three specific mechanisms/tools that could be added to the EOSIO ecosystem to better address the concern described above regarding the governance representatives of an EOSIO blockchain (e.g. the BPs) raising the inflation rate beyond some threshold value (and more generally other concerns similar to that). Some of these mechanisms would involve changes to the EOSIO protocol. Others could be implemented with plugins in nodeos which remain compatible with the existing EOSIO protocol.

The three specific mechanisms/tools that I will discuss are the following:

Protocol Change Events: A mechanism for a privileged smart contract to emit a protocol change event, which is simply a cryptographic hash that all validator nodes must include within their local whitelist in order for their node to continue following that branch of the blockchain.
Subjective Protocol Restrictions: A plugin in nodeos which is able to track specific actions and avoid following a branch of the blockchain that includes an action matching its filtering rules (unless it also matches its exception rules).
Hard Fork Mechanisms and Tooling: A few new (data-driven) protocol features added to the EOSIO protocol which enable users to more easily create hard forks of an existing EOSIO blockchain without requiring any source code changes to nodeos for each specific hard fork.

Protocol Change Events

The EOSIO protocol could be updated to add a new host function callable only by privileged smart contracts which simply takes a cryptographic hash as input. By calling this host function, the smart contract (typically the system contract) is indicating that it carried out some change to the rules that is significant enough that it expects all nodes in the network to intentionally opt-in to the change if they wish to continue syncing with the blockchain. Node operators can opt-in by including that cryptographic hash into a local whitelist of accepted protocol change hashes they maintain.

The cryptographic hash would be expected to commit to the relevant details of the protocol change (this is of course the responsibility of the system contract that calls the host function). This enables a node operator to know exactly what they are agreeing to by including a particular hash into their whitelist. The expectation would be that the details that precisely determine the cryptographic hash would be known well ahead of the actual call of the host function with that hash by the system contract so that the community can discuss the proposed change beforehand and distribute the hash that node operators are expected to include in their whitelist if they agree to the change. If all goes well, none of the nodes should experience disruption of service because they would all add the expected hash into their whitelist well before the system contract emits a protocol change event using that hash. This is a similar dynamic to the consensus protocol upgrades EOSIO has experienced in the past (e.g. EOSIO 1.8 and EOSIO 2.0) where node operators were expected to upgrade their nodeos binaries before the BPs activate the protocol features introduced in that version of the software. However, this is a more dynamic and general version of that in which no upgrades to the nodeos binaries are necessary.

What happens if a node operator does not include the hash in time prior to the protocol change event occurring? Their node would stop syncing with the network and be stuck at the latest block prior to the one that emitted the protocol change event. This would likely be because the node operator didn’t realize a protocol change was coming. The disruption of their service would alert them that something is wrong and they would investigate to decide what they want to do. In most cases, they would likely decide to embrace the protocol change by adding the hash into their whitelist and allowing their node to resume syncing the blockchain.
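The opt-in behavior described above can be sketched roughly as follows. This is only an illustrative model in Python, not nodeos code: the `Block` and `Node` classes, the `protocol_change_hashes` field, and the text that is hashed are all hypothetical names introduced for this sketch.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Block:
    num: int
    # Hashes emitted by privileged contracts in this block (hypothetical field).
    protocol_change_hashes: list = field(default_factory=list)


class Node:
    def __init__(self, whitelist=None):
        self.whitelist = set(whitelist or [])
        self.head = 0  # number of the last block this node accepted

    def apply_block(self, block):
        """Accept the block only if every emitted hash is whitelisted."""
        missing = [h for h in block.protocol_change_hashes
                   if h not in self.whitelist]
        if missing:
            return False  # the node stays stuck at self.head
        self.head = block.num
        return True

    def sync(self, blocks):
        """Apply blocks in order, stopping at the first one that is rejected."""
        for block in blocks:
            if not self.apply_block(block):
                break
        return self.head


# A change the operator has not yet opted in to (the hashed text is made up).
change = hashlib.sha256(b"example protocol change details").hexdigest()
chain = [Block(1), Block(2), Block(3, [change]), Block(4)]

node = Node()
stuck_at = node.sync(chain)                # stops before block 3
node.whitelist.add(change)                 # operator opts in to the change
resumed_at = node.sync(chain[node.head:])  # syncing resumes from block 3
```

In this model the node's view of the chain simply stops advancing, which is what would alert the operator; nothing is rolled back and no alternative branch is chosen automatically.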

However, in some rare cases the node operator may disagree with the protocol change and wish to take a different path. For example, they may want to follow a fork of the blockchain that maintains the original intent that they agreed to and does not make the changes encoded in the emitted protocol change event. This will not happen automatically because the BPs of the original chain would have moved on to the branch of the blockchain that did emit the protocol change event. And they are not allowed to double confirm conflicting blocks, which means finality would stop advancing on the branch of the blockchain that does not include the protocol change event. So the node operator’s only remaining option would be to update their node to respect a hard fork of the blockchain that forces the active BP set to be replaced by a new one that is willing to continue the original blockchain without making the protocol change that the node operator disagreed with. This process can be made easier using the third part of this proposal (Hard Fork Mechanisms and Tooling) that will be discussed later.

So how can this new mechanism help with the specific concern brought up earlier of BPs on EOS potentially agreeing to use eosio::setinflation to set an inflation rate higher than 5%?

Well the system contract could be modified to not disallow setting the inflation above the threshold of 5% per year but to instead emit a protocol change event if the inflation was raised above that threshold. The protocol change hash would commit to the particular value of the new inflation rate (among other things) so that node operators that add the hash to their whitelist know they are agreeing to an increase to a particular value (e.g. 6%) rather than some other value (e.g. 30%). However, if the inflation rate was decreased, or the inflation was increased but to a value no greater than 5% per year, then the system contract would not bother emitting the protocol change event. This provides a nice balance in allowing for some degree of adaptability without a lot of friction (i.e. friction like the slow process of ensuring almost everyone is on board with the change and they have added the hash to their whitelist) but also providing meaningful protections to the community against excessive inflation.
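A minimal sketch of that system contract logic is given below, in Python rather than contract code. It assumes the rate is expressed in hundredths of a percent (so 500 means 5% per year); `protocol_change_hash`, the `SystemContract` class, and the `emit` callback (standing in for the proposed privileged host function) are hypothetical names for illustration only.

```python
import hashlib

# 5.00% per year, in hundredths of a percent (assumed unit).
MAX_RATE_WITHOUT_EVENT = 500


def protocol_change_hash(kind, **details):
    """Commit to the kind of change and its details in a canonical order."""
    payload = kind + "|" + "|".join(
        f"{key}={details[key]}" for key in sorted(details))
    return hashlib.sha256(payload.encode()).hexdigest()


class SystemContract:
    def __init__(self, annual_rate, emit):
        self.annual_rate = annual_rate
        self.emit = emit  # stands in for the proposed host function

    def setinflation(self, annual_rate):
        # Emit only when the rate is increased AND ends up above the threshold.
        if (annual_rate > self.annual_rate
                and annual_rate > MAX_RATE_WITHOUT_EVENT):
            self.emit(protocol_change_hash(
                "setinflation", annual_rate=annual_rate))
        self.annual_rate = annual_rate


emitted = []
contract = SystemContract(annual_rate=100, emit=emitted.append)
contract.setinflation(500)  # increase up to the threshold: no event
contract.setinflation(600)  # increase above the threshold: event emitted
contract.setinflation(550)  # decrease: no event
```

Because the new rate is part of the hashed payload, an operator who whitelisted the hash for 600 has not thereby agreed to any other value.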

Of course those protections would only be meaningful if the other loopholes to increase inflation without bound are closed. For example, the token contract would need to be able to instruct the system contract to emit a protocol change event any time the eosio.token::issue action was used to create new EOS tokens. Furthermore, any updates to the token contract and system contract (and less obviously updates to the contracts on the eosio.msig and eosio.wrap accounts as well due to the fact that those accounts are privileged) would need to be protected, otherwise the BPs could just replace those contracts with versions that do not enforce these protective measures. This does impact adaptability a bit. As was mentioned earlier, frequent updates (relative to consensus protocol upgrades anyway) to the system contract have been done throughout the history of the EOS Public Blockchain. These were either to fix bugs or to add new useful features.

Perhaps the community of a particular EOSIO blockchain may decide that slowing down the process of adding new features to the system contract (or token contract) or fixing non-critical bugs in those contracts is a worthwhile cost to pay for the benefit of meaningful protections against the risk of excessive inflation. (Security critical bugs are a tricky case where there is a need to act quickly, potentially even with new code that is not fully public, but unfortunately the system cannot distinguish between bug fixes that preserve the intent of the existing code versus more significant changes that completely change that intent. There may be clever mechanisms that can be put in place involving time delays to give people notice of some upcoming change to a contract without knowing exactly what the new code is as well as requiring additional approval from some set of trusted contract auditors in order to strike the appropriate balance. In the extreme case, BPs can favor denial of service of the blockchain rather than compromise of the integrity of the blockchain by just updating the code immediately, then revealing the new code to the public, and then just expect the community to add the protocol change hash to their whitelist as soon as possible to resume syncing with the blockchain.) Furthermore, breaking down the system contract into more granular pieces could allow some components that do not meaningfully impact the tokenomics of the blockchain to be updated more freely. For example, if the EOSIO protocol was updated to enable better composability between smart contracts on different accounts, then perhaps the REX and PowerUp components could exist in their own smart contracts (deployed on unprivileged accounts) and could be updated with just a 15 of 21 BP approval without emitting a protocol change event.

Regardless of the particular choices a specific blockchain community makes, this new proposed mechanism of emitting a protocol change event enables each EOSIO blockchain community to better find the balance in the trade-off that they find acceptable.

Subjective Protocol Restrictions

If the EOS community decided it would be too cumbersome to emit a protocol change event every time the system contracts were updated, does that mean that emitting a protocol change event when the inflation is increased beyond 5% a year is meaningless? Not necessarily. There would still remain the loophole to bypass that restriction by just changing the code in the system contract, and the BPs could do that without forcing all nodes to stop and explicitly opt-in to that change to the system contract. But some could individually choose to have a higher level of protection by requiring explicit approval of each system contract upgrade (and without that explicit approval their local node would refuse to follow the blockchain that includes the system contract change).

This leads to the second specific mechanism that I call “subjective protocol restrictions” which enables each EOSIO node to subjectively enforce certain restrictions on the activity happening on the EOSIO blockchain via a nodeos plugin that is appropriately configured. If those restrictions are violated, that nodeos would refuse to follow the branch of the blockchain that violates those subjective restrictions. In practice, this could be achieved by having a set of actions that the plugin is aware of, tracks based on filters, and then if it gets a match, it handles it in some appropriate way.

For example, the plugin could be aware of the eosio::setcode action and it would track any instance of that action (inline or not) and consider it if the account to which the code was being set matched a particular account name in some locally configured list (e.g. eosio, eosio.token, eosio.msig, and eosio.wrap). If it got a match on that action, it would construct a hash from the relevant arguments of the action (committing to the account name and hash of the new code at least) and then compare that to some other locally configured list of approved smart contracts. If the hash was not in that exception list, then it would reject following that branch of the blockchain.
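The setcode check described above might look roughly like this. It is a sketch under assumptions: the way the account name and code hash are combined into a single commitment, and the names `setcode_commitment` and `allows_setcode`, are made up for illustration and are not an existing nodeos plugin API.

```python
import hashlib

# Accounts whose code updates the operator wants to review (from the example).
WATCHED_ACCOUNTS = {"eosio", "eosio.token", "eosio.msig", "eosio.wrap"}


def setcode_commitment(account: str, code: bytes) -> str:
    """Hash committing to both the account name and the hash of the new code."""
    code_hash = hashlib.sha256(code).hexdigest()
    return hashlib.sha256(f"{account}:{code_hash}".encode()).hexdigest()


def allows_setcode(account: str, code: bytes, approved: set) -> bool:
    """Return False if this setcode action should stop the node."""
    if account not in WATCHED_ACCOUNTS:
        return True  # the filter does not match, nothing to enforce
    return setcode_commitment(account, code) in approved


# Placeholder bytes standing in for a contract the operator has audited.
audited_code = b"audited system contract wasm"
approved = {setcode_commitment("eosio", audited_code)}

ok_audited = allows_setcode("eosio", audited_code, approved)
ok_unreviewed = allows_setcode("eosio", b"unreviewed wasm", approved)
ok_other_account = allows_setcode("someuser", b"anything", approved)
ok_wrong_account = allows_setcode("eosio.token", audited_code, approved)
```

Committing to the account name as well as the code hash means that approving a contract for one account does not approve deploying the same code on another watched account.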

As another example, the plugin could also be aware of the eosio::setpriv action. If it ever encountered that action with the is_priv value set to a non-zero integer (which indicates that the referenced account is being elevated to a privileged account), then it would look up the account referenced within the action in some locally configured list of EOSIO accounts that are approved to be privileged (e.g. eosio, eosio.lost, eosio.msig, eosio.wrap). If the account was not in that list, then it would reject following that branch of the blockchain.
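A correspondingly small sketch of the setpriv check follows. It assumes, by analogy with the setcode example, that the node refuses the branch when the account is not in the approved list; the function name `allows_setpriv` is hypothetical.

```python
# Accounts the operator accepts as privileged (list taken from the example).
APPROVED_PRIVILEGED = {"eosio", "eosio.lost", "eosio.msig", "eosio.wrap"}


def allows_setpriv(account: str, is_priv: int) -> bool:
    """Return False if this setpriv action should stop the node."""
    if is_priv == 0:
        return True  # removing privilege is not restricted in this sketch
    return account in APPROVED_PRIVILEGED


elevating_known = allows_setpriv("eosio.msig", 1)
elevating_unknown = allows_setpriv("newaccount", 1)
demoting_unknown = allows_setpriv("newaccount", 0)
```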

With the above two subjective protocol restrictions in place with appropriate configuration, a node operator could, for example, rest assured that their node will not automatically follow a branch of the blockchain in which the BPs have somehow escalated the capabilities of the various system-level contracts on the EOS Public Blockchain beyond what the contracts previously allowed when the node operator last audited the state of the contracts of the EOS Public Blockchain. If a change triggering the subjective protocol restriction occurred (e.g. BPs deploying a new system contract with a bug fix), it would stop this node until the node operator had a chance to audit the change, determine that it was an acceptable one, compute the hash corresponding to the specific change, add it to their locally configured exception list, and then allow the node to automatically resume syncing with the network.

So with the above subjective restrictions in place and the objective restrictions that a protocol change event is emitted by the system contract when the inflation rate is increased beyond 5% per year and (indirectly) emitted by the token contract when any new EOS tokens are issued into existence, the node operator can rest assured that their node will not automatically follow a branch of the blockchain in which the community inflation rules were violated.

But the node operator could go further with the subjective protocol restrictions. Perhaps they think the threshold should be 3% and not 5%, but the deployed system contract only emits the protocol change event when the inflation is increased beyond 5%. The plugin might also be aware of the eosio::setinflation action and would consider it a match if the annual_rate is greater than a threshold value. Then the node operator could configure the plugin appropriately so that their node does not automatically follow the branch of the blockchain in which the inflation rate is increased beyond 3% per year.
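As a rough illustration of such a locally configured threshold, the sketch below scans a sequence of actions and reports the first one that violates the operator's rule. The `Action` class, the rule factory, and the sample actions are hypothetical, and the rate is again assumed to be in hundredths of a percent (300 means 3% per year).

```python
from dataclasses import dataclass


@dataclass
class Action:
    account: str
    name: str
    data: dict


def make_inflation_rule(max_annual_rate):
    """Build a filter that matches setinflation actions above the threshold."""
    def violates(action):
        return (action.account == "eosio"
                and action.name == "setinflation"
                and action.data["annual_rate"] > max_annual_rate)
    return violates


def first_violation(actions, rules):
    """Index of the first action matching any rule, or None if none match."""
    for index, action in enumerate(actions):
        if any(rule(action) for rule in rules):
            return index
    return None


rules = [make_inflation_rule(300)]  # this operator's own 3% limit
actions = [
    Action("eosio.token", "transfer", {"quantity": "1.0000 EOS"}),
    Action("eosio", "setinflation", {"annual_rate": 250}),
    Action("eosio", "setinflation", {"annual_rate": 400}),
]
stopped_at = first_violation(actions, rules)
```

Here the node would stop at the third action even though 400 is below the 5% threshold at which the deployed system contract would emit a protocol change event.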

Hard Fork Mechanisms and Tooling

As mentioned earlier, there may be situations in which the node operator evaluates why their node stopped due to either a protocol change event or a subjective protocol restriction and concludes that they do not agree with the change that happened on the blockchain. At that point their only remaining option would be to leave the community that decided to go in the direction they do not agree with. So perhaps they temporarily follow the branch of the blockchain they do not agree with long enough to sell their assets on that blockchain, extract whatever value they can from it, and move that value to some other community that is more aligned with their values. However, it may be the case that there is a large subset of the community that agrees with them that the change was not a good one and they wish to return to the original state of the blockchain prior to that contentious change occurring except with just the minimal changes to that state necessary in order to prevent that contentious change from just repeating immediately again when they resume operation of that fork of the blockchain. So this subset of the community may be willing to organize a hard fork of the blockchain which branches off from some particular block in the original blockchain (so that they share most of the same history initially) and the node operator may wish to configure their node to follow that particular hard fork of the blockchain.

Current state of creating hard forks of EOSIO blockchains

One way of specifying which fork of an EOSIO blockchain a particular node should respect is through the existing feature of checkpoints. Checkpoints allow specifying the particular block ID expected as of a particular block height. This prevents the node from going down a branch of the blockchain that does not satisfy that checkpoint. However, this feature is mainly useful for resolving violations of finality (which may occur if enough BPs do the wrong thing and double confirm incompatible blocks). It does not help with following a branch of the blockchain that intentionally violates the existing protocol rules (which is required to carry out a hard fork).
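
To make the checkpoint rule concrete, here is a minimal Python sketch of the check described above. It is not how nodeos implements checkpoints; the function name and the use of plain dictionaries with placeholder string IDs are assumptions for illustration only.

```python
def satisfies_checkpoints(branch: dict, checkpoints: dict) -> bool:
    """Check a candidate branch against the operator's checkpoints.

    branch maps block height -> block ID for the blocks on that branch.
    checkpoints maps block height -> the block ID required at that height.
    A checkpoint at a height the branch has not reached yet is not a violation.
    """
    for height, required_id in checkpoints.items():
        block_id = branch.get(height)
        if block_id is not None and block_id != required_id:
            return False
    return True
```

Note that the check only compares block IDs. It says nothing about whether the blocks on the branch follow the protocol rules, which is why checkpoints alone cannot make a node accept a branch that deliberately breaks those rules.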

The way to achieve a hard fork with the current EOSIO tooling is to start nodeos from a portable snapshot that is taken after the hard fork point. But this does not even address how to construct the hard fork state that one could generate a portable snapshot of in the first place. Right now that requires exporting the nodeos state in a usable format, building custom tooling to arbitrarily mutate that state, loading the modified state into nodeos, possibly producing some blocks to do further cleanup using EOSIO transactions directly (rather than the custom tooling, which is more error prone), and then finally generating a new portable snapshot of the finalized state for the new hard fork.

It would be nice if there was better tooling to make this hard forking process easier. It would especially be nice if there were mechanisms officially supported in the EOSIO protocol to enable common hard forking strategies while preserving the history of the blockchain. The problem with the snapshot approach is that it then becomes impossible to have a continuous replay of the blockchain that crosses over the hard fork point.

Background on EOSIO protocol feature foundations

EOSIO 1.8 introduced a general foundation for consensus protocol upgrades via a concept called “protocol features”. A protocol feature is represented by a cryptographic hash included in the block header which signals the activation of the protocol feature at the start of that block. Protocol features can be configured on the local nodeos to either require pre-activation or not. If it requires pre-activation, then a BP cannot simply add the protocol feature hash to the block header whenever desired. Instead, the block prior needs to have a call to a privileged host function that pre-activates that particular protocol feature (identified by its cryptographic hash) for the protocol feature hash in the next block’s header to be considered valid (in fact at that point it is considered necessary as well). This enables delegating the governance decisions of when to activate a particular protocol feature to smart contracts (in the case of the EOS Public Blockchain, it is simply done through 15 of 21 multisig approval of the BPs) rather than allowing any one of the active BPs to force the activation unilaterally. All of the EOSIO protocol features introduced so far except one require pre-activation. The one exception is the first protocol feature introduced (PREACTIVATE_FEATURE) which actually enables the pre-activation of protocol features in the first place (obviously that one couldn’t require pre-activation since there is a bootstrapping problem there). Also, all of the EOSIO protocol features introduced so far are of the builtin class of protocol features. This means that they activate some specific built-in protocol feature that the nodeos executable is expected to know how to execute and there is no degree of control permitted on what the activation does.
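
The pre-activation rule described above can be sketched roughly as follows in Python. This is only a model of the rule as stated in this post, not the nodeos implementation; the function name, the argument shapes, and the choice to reject unrecognized feature hashes are assumptions.

```python
def validate_header_activations(header_features, preactivated, requires_preactivation):
    """Validate the protocol feature hashes listed in a block header.

    header_features: set of feature hashes activated by this block's header.
    preactivated: set of feature hashes pre-activated in the prior block.
    requires_preactivation: dict of feature hash -> bool, from local config.
    Returns a tuple (ok, reason).
    """
    for digest in header_features:
        # Assumption: a feature the local node does not recognize is rejected.
        if digest not in requires_preactivation:
            return False, f"unrecognized feature {digest}"
        if requires_preactivation[digest] and digest not in preactivated:
            return False, f"feature {digest} was not pre-activated"
    # Pre-activation makes the hash necessary in the next header, not optional.
    missing = preactivated - header_features
    if missing:
        return False, f"pre-activated features missing from header: {sorted(missing)}"
    return True, "ok"
```

In this model a feature configured to not require pre-activation (as PREACTIVATE_FEATURE must be) can appear in a header on its own, while every other feature is accepted only if the prior block pre-activated it.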

The EOSIO protocol features mechanism was designed to enable a broader class of protocol features. These classes of protocol features other than the builtin would be data-driven. The protocol feature class would define the particular template of activation behavior that nodeos is expected to be able to execute upon activation of the protocol feature. But the specifics of how to execute that would be driven by data that is cryptographically committed to by the protocol feature hash. This is in fact the reason why protocol features were designed to use SHA256 cryptographic hashes rather than a simple integer; they create the foundation to enable data-driven protocol features in the future. (You can read more about this and other aspects of the protocol feature architecture in more detail here and also here and here).
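
To illustrate what it means for the hash to commit to the data, here is a small Python sketch. The canonical JSON encoding, the `feature_digest` name, and the `replace_schedule` class name in the usage note are all hypothetical; the actual digest computation in nodeos uses its own serialization and is not reproduced here.

```python
import hashlib
import json


def feature_digest(feature_class: str, data: dict) -> str:
    """Hypothetical digest: SHA-256 over a canonical JSON encoding.

    Sorting keys and fixing separators makes the encoding deterministic, so
    two operators with the same definition compute the same hash.
    """
    canonical = json.dumps(
        {"class": feature_class, "data": data},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

For example, `feature_digest("replace_schedule", {"producers": ["bpa", "bpb"]})` and the same call with a different producer list yield different digests. A node that whitelisted one definition would therefore not accept a block header that activates the other.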

New data-driven classes of protocol features

So one example of a data-driven protocol feature might be one that forces the immediate replacement of the currently chosen schedule of block producers with a new prescribed schedule (and again that new schedule would be prescribed as part of the data-driven protocol feature definition and committed to by the protocol feature hash). Such a protocol feature could enable the community to coordinate a process to resume from even indefinite denial of service by more than a third of the active BPs without requiring a recompilation of anyone’s nodeos executables (although everyone would need to add the appropriate JSON file to their config/protocol_features directory). Note that such a class of protocol features does not currently exist in the EOSIO protocol. While the foundations to enable that are in place with the protocol feature mechanism, the EOSIO protocol would still need to be updated to include that particular class of data-driven protocol features (the one to replace the active BP schedule) and that does require source code changes to be implemented in nodeos. But if that were implemented and everyone upgraded their nodeos executables, then each specific instance of replacing the active BP schedule on EOSIO blockchains could be supported without further code changes.

Another example of a data-driven protocol feature could be one that deploys some specified contract code to a particular EOSIO account, and optionally sets that account to privileged as well. Such a protocol feature could be useful if the BPs accidentally deployed a system contract change that bricked the blockchain; it is theoretically possible (though unlikely) to deploy a system contract change that effectively makes block production no longer possible on that blockchain (at least not without a hard fork). But this data-driven protocol feature could be the hard fork that allows the community to recover from such an event and to do so without requiring any changes to the nodeos executable (again, only after support for such a class of data-driven protocol features is added to nodeos, which does not currently exist).

The two classes of data-driven protocol features I described above were used as an example of recovering from unfortunate (and unlikely) events that could prevent the blockchain from continuing to advance finality (more than a third of the BPs permanently down or bricking the system contract). But the same features could potentially (with some augmentations) be used to hard fork a blockchain away from the control of the current BPs without their consent. It would be easier to do if at least one of the active BPs participated, but the protocol feature to replace the BP schedule could be designed to use the new BP schedule when evaluating the current block as well so that it would be possible for a BP in the new schedule (even if they are not in the old schedule) to create a block which activates the protocol feature and where that block could be accepted by existing nodes in the network assuming they added that protocol feature JSON to their config/protocol_features directory.

How hard forking an EOSIO blockchain could potentially work

Those new protocol features (assuming they were added to the EOSIO protocol) could then enable (with some additional tooling support) some part of the community to come to social consensus on a hard fork of an existing EOSIO blockchain and to execute that hard fork through the following process:

Choose a set of new BPs (who already have registered as BP candidates on the existing EOSIO blockchain) to act as the transitionary BPs to carry out the hard fork.
Agree to the changes that need to be carried out as part of the hard fork as well as the set of auditors to verify the changes were carried out as planned. The changes that need to be agreed on may include:
- which block of the original blockchain to base the new fork on (this could also be a decision that is delayed until later so that a more recent block is chosen);
- the schedule of the transitionary BPs;
- the temporary replacement of the system (if necessary) to enable the transitionary BPs (or perhaps other trusted actors) to carry out the hard fork changes;
- the actual sequence of actions (not necessarily precisely specified) that need to be carried out on the forked blockchain (this may include replacing the system contract, or other contracts, at the end of the sequence with the appropriate code to be used in normal operation);
- the set of BPs and their schedule to switch to from the transitionary BPs (or they may choose to stick with the same transitionary BPs);
- how to come to consensus on which block (after all of the above is done) to use as the checkpoint block which all the auditors base their audit on.
Actually carry out the execution of the hard fork:
- One of the transitionary BPs creates a block based on the agreed upon fork block which includes the data-driven protocol feature to replace the BP schedule with the transitionary BP schedule that was agreed upon and may also include a data-driven consensus protocol feature to replace the system contract with a particular (already agreed upon) WebAssembly contract.
- The transitionary BPs build upon the block activating the protocol features (this obviously means they added the data-driven protocol features to their config/protocol_features directory) and produce enough blocks so that the last irreversible block can advance past the activation block.
- The transitionary BPs (or some other trusted actors that were decided upon beforehand) carry out the sequence of actions that was already agreed upon and then switch out the BP schedule if necessary.
Once all the expected steps have been completed (which defines the checkpoint block to use), the selected auditors verify that all of the required steps were properly followed as of the checkpoint block. They then provide an attestation signature on the ID of the checkpoint block, attesting that they believe all of the agreed upon expectations that were communicated regarding this hard fork were properly met as of the checkpoint block (the attestation signature also signs off on the protocol feature hashes that were activated to execute this hard fork).
Finally, the node operators in the network who wish to follow the new fork verify that a sufficient number of the auditors have signed off on a particular checkpoint block ID (and associated protocol feature hashes). They then add the checkpoint block ID to their checkpoints list in config.ini, add the necessary (data-driven) protocol features to their config/protocol_features directory, and allow nodeos to automatically synchronize with the network and follow that particular fork of the blockchain. Once nodeos reports that its last irreversible block height is past that of the checkpoint block, the node operator knows that they are officially on the appropriate hard fork of the blockchain. Note that if their local nodeos has already advanced the last irreversible block past the fork point before they start this process, then they may have to restore it from a portable snapshot taken prior to the fork point. However, if they were following the process correctly, they should have already paused their node from following the original blockchain prior to the agreed upon fork point (perhaps this would have been done automatically for them due to protocol change events or subjective protocol restrictions).
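
The node operator's check in the last step (has a sufficient number of auditors signed off on this checkpoint and these protocol feature hashes?) could be sketched along these lines in Python. The `Attestation` record, the function name, and the threshold parameter are hypothetical, and cryptographic verification of each attestation signature is assumed to have already happened before this function is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """An auditor's sign-off, with its signature assumed already verified."""
    auditor: str
    checkpoint_block_id: str
    feature_hashes: frozenset


def enough_attestations(attestations, trusted_auditors, checkpoint_block_id,
                        feature_hashes, threshold):
    """Return True if at least `threshold` distinct trusted auditors attested
    to exactly this checkpoint block ID and this set of feature hashes."""
    expected = frozenset(feature_hashes)
    signers = {
        a.auditor
        for a in attestations
        if a.auditor in trusted_auditors
        and a.checkpoint_block_id == checkpoint_block_id
        and a.feature_hashes == expected
    }
    return len(signers) >= threshold
```

Collecting signers into a set means a duplicate attestation from the same auditor counts once. Attestations for a different checkpoint block ID or a different set of protocol feature hashes are ignored.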

There is still a lot involved in organizing a hard fork. But the mechanisms proposed here at least ease the technical burden in creating a hard fork, which could hopefully provide people greater options in choosing a version of the blockchain that is more aligned with their values.

Furthermore, these technical mechanisms provide value even for things like testing out proposed changes to the main blockchain. The testers could fork the blockchain into a test branch in which the assets are not recognized to have any value. They can then use the privileges available to them as the BPs of this test blockchain (possibly having also replaced the privileged system contract) to take control of any other EOSIO accounts to carry out their tests.

jdheeter · April 10, 2021, 8:55pm

I like the idea that you could specify rules you would like your node to always follow, even if technically those are “soft” rules encoded in system contracts. We could have a repository of these common rules on-chain, where nodes could signal “I support this rule and operate this node with these rules enabled”.

arhag · April 10, 2021, 9:00pm

Now I am thinking that eventually (not necessarily on the first iteration) the subjective protocol restrictions should be implemented as WebAssembly code running in a general subjective protocol restriction nodeos plugin. That way those arbitrary rules could be shared more feasibly.