By Max Shannon, Digital Asset Analyst at CoinShares, a major European digital asset investment firm.
Ethereum has an evolving but structured roadmap of protocol changes
Before its launch in 2015, Ethereum developers already had a stated ambition to replace its Proof-of-Work (PoW) consensus mechanism with an alternative one: Proof-of-Stake (PoS). While it was deemed too technically risky to start the network with anything other than PoW, the eventual migration to PoS has been a major development goal of Ethereum developers and a highly anticipated milestone on their roadmap.
The migration to PoS is considered to be an extremely important milestone because developers regard it as a key prerequisite for several subsequent development goals, and because the Ethereum community believes the reduction in energy usage is worth the trade-offs in protocol attributes it generates.
Though subject to some changes, the development roadmap itself predates the launch of Ethereum and has existed since the network was only in its testnet phase. Many of the changes described in the evolving roadmap have already been implemented, but the migration to PoS, one of the most challenging and involved modifications, has yet to be completed and is now delayed.
The reasons for the delays are many and complex, but they more or less reduce to PoS proving much more technically challenging to safely implement than developers first thought. Many prototypes have been proposed and evaluated, but problems have kept emerging, necessitating ongoing multi-year bouts of bug fixes and redesigns.
We can even observe the serial delays of the migration, often referred to as the Merge, in blockchain data. Because developers have repeatedly believed that the PoS implementation was near at hand, the Ethereum protocol contains a hard-coded exponential increase in mining difficulty. This mechanism is called the difficulty bomb and is designed to cause mining difficulty and revenues to disconnect, forcing miners to abandon the PoW chain and leaving the alternative PoS chain as the only viable one.
Three separate ‘detonations’ of the difficulty bomb can be observed in the above figure. They are visible both as exponential increases in block times and as rapid divergences between hashrate and difficulty. However, since PoS has never been ready to implement at the time of a detonation (or an upcoming detonation), Ethereum developers have rolled back the difficulty bomb on five separate occasions.
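To make the mechanism concrete, here is a minimal sketch of how an exponential bomb term grows and how a roll-back defers it. The 100,000-block period and the "minus 2" exponent offset follow the commonly cited form of Ethereum's pre-Merge difficulty formula as we understand it; they, along with the function name, block height and delay below, should be read as illustrative assumptions rather than the exact protocol code.

```python
# Illustrative sketch of the difficulty bomb's exponential term.
# PERIOD and the "- 2" offset are assumptions based on the commonly
# cited pre-Merge difficulty formula, not a copy of any client's code.
PERIOD = 100_000  # blocks per doubling period (assumed)


def bomb_term(block_number: int, delay_blocks: int = 0) -> int:
    """Extra difficulty added by the bomb at a given block height.

    delay_blocks models a roll-back: the bomb is evaluated as if the
    chain were delay_blocks shorter than it really is.
    """
    effective_height = max(0, block_number - delay_blocks)
    exponent = effective_height // PERIOD - 2
    # The term doubles every PERIOD blocks once the exponent is non-negative.
    return 2 ** exponent if exponent >= 0 else 0


height = 5_000_000  # hypothetical block height
for delay in (0, 3_000_000):  # hypothetical roll-back sizes
    print(f"delay={delay:>9,}  bomb term={bomb_term(height, delay):,}")
```

Because the term doubles every period, pushing the effective height back by a few million blocks shrinks it by many orders of magnitude, which is why each roll-back buys developers a long reprieve before block times start to stretch again.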
All this being said, and while no specific date exists for the Merge as of the time of writing (and the Merge has indeed just been delayed again from H1 2022, tentatively to H2 2022), there are emerging signs perhaps warranting cautious optimism that PoS implementation might actually be forthcoming this time around.
So let’s have a quick look at what the next phase of Ethereum would actually look like.
Future sharding also rests on the Merge
In addition to PoS (the Merge), the second major part of Ethereum’s next phase is the introduction of sharding. Sharding is a blockchain protocol scalability technique whereby the protocol increases its throughput by splitting the blockchain into many blockchains (shards), allowing individual computers to choose which of the many blockchains to work on.
Sharding allows the total throughput of the protocol to increase without increasing the computational demand of the individual computers working on it. In other words, Ethereum will be able to process a lot more information while still hoping to rely on relatively casual users providing distributed processing power through regular consumer computers.
In the world of computer science, this is referred to as a horizontal scaling technique. Horizontal scaling is characterized by increasing throughput/capability by adding more individual computers to a network. Its alternative is vertical scaling whereby increased scale is only achieved through increasing the throughput/capability of the individual network computers.
In the world of blockchain protocols, increasing block sizes or increasing block frequencies (reducing the targeted time between blocks) are examples of vertical scaling as they require that all computers participating must be very powerful (which is expensive). Conversely, sharding allows additional throughput/capability—which means the ability to process a lot more transactions and smart contracts per second, at much lower costs—by adding more network participants, assuming that they will all care about separate shards.
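The difference between the two approaches can be illustrated with a few lines of arithmetic. The sketch below assumes transactions are spread evenly across shards and that a node follows exactly one shard; the throughput figure is hypothetical, and the function names are ours.

```python
def per_node_load_single_chain(total_tps: float) -> float:
    """Vertical scaling: every full node processes every transaction."""
    return total_tps


def per_node_load_sharded(total_tps: float, shard_count: int) -> float:
    """Horizontal scaling: a node following one shard processes only that
    shard's share, assuming an even spread of transactions across shards."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    return total_tps / shard_count


total_tps = 960.0  # hypothetical network-wide throughput
print(per_node_load_single_chain(total_tps))  # load on each node, one big chain
print(per_node_load_sharded(total_tps, 64))   # load on a node following 1 of 64 shards
```

Under these assumptions, the per-node load on a sharded network falls in proportion to the number of shards, whereas on a single larger chain it rises one-for-one with total throughput.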
Adding PoS and sharding incurs important trade-offs in network attributes
When designing decentralized peer-to-peer networks, there are no perfect solutions to problems. There are only costs, rewards, and trade-offs.
Consequently, when analyzing proposed changes to blockchain protocols it is very important to not get carried away by the optimism which often pervades the developer and user communities of the various networks (nor to fall prey to the pervasive pessimism common among competitors). So let’s have a brief look at the trade-offs the Ethereum community is willing to make in order to achieve increased throughput/capability on-chain and to reduce the energy use of the protocol down to effectively nothing.
The trade-off for both horizontal and vertical scaling of a blockchain network, versus keeping throughput low, is that the network as a whole becomes more like a client-server network than a peer-to-peer network, losing out on important decentralization benefits.
Why is that? Briefly explained, in order to be a full peer in a blockchain network (that is, someone who participates in the network without the need to trust any other network participant), a user must be able to fully verify every single event that happens on the network. With a multitude of blockchains (or a single huge one) to verify, the computational and bandwidth resources required to be a full peer increase dramatically, making fewer and fewer users able to afford the privilege of being full peers.
This results in the reintroduction of trust as all users who are now unable to verify all shards (or a huge single blockchain) must trust other users to tell them the truth about what happened on other shards (or on the huge blockchain they can no longer afford to self-verify).
A high level of decentralization is a sought-after yet traditionally hard-to-define quality of peer-to-peer networks. The reason for its desirability is that a network with as many peers as possible becomes impossible to shut down due to the huge number of participants, all of whom must be disabled for the network to be fully extinguished.
Whereas network anonymity techniques make it impossible to reliably measure the number of participants on a blockchain network, there are useful proxies we can use to evaluate its level of decentralization. In our opinion, the best one is the cost of being a full peer. The more costly it is to be a full peer, the less decentralized a network will be.
Any protocol change which increases the cost of being a full peer, therefore, reduces the decentralisation of the network. This is the tradeoff cost incurred for the benefits of increased throughput/capability on-chain.
To get a sense of the magnitude of the difference, consider that in order to be a full peer on Ethereum after the Merge, a user will need ETH 32 (~USD 92,000 at the time of writing, 25 April 2022) plus dedicated computer hardware, possibly costing no more than an additional thousand dollars. In contrast, being a full peer on the Bitcoin Network costs less than USD 300 and requires no amount of bitcoin. However, the trade-off incurred on the Bitcoin Network is that the number of transactions its base blockchain can process is limited to about 7 transactions per second.
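The comparison can be worked through explicitly. The sketch below uses only the approximate figures quoted above (as of 25 April 2022), treating the hardware and Bitcoin numbers as the upper bounds they are stated to be; the variable and function names are ours.

```python
# All inputs are the article's approximations (25 April 2022),
# not live market data.
ETH_STAKE = 32                 # ETH required, per the text above
ETH_STAKE_VALUE_USD = 92_000   # approximate value of that stake
ETH_HARDWARE_USD = 1_000       # upper estimate for dedicated hardware
BTC_FULL_PEER_USD = 300        # upper bound for a Bitcoin full peer


def implied_eth_price(stake_value_usd: float, stake_eth: float) -> float:
    """ETH price implied by the quoted stake value."""
    return stake_value_usd / stake_eth


def cost_ratio(eth_cost_usd: float, btc_cost_usd: float) -> float:
    """How many times more expensive one full peer is than the other."""
    return eth_cost_usd / btc_cost_usd


eth_total = ETH_STAKE_VALUE_USD + ETH_HARDWARE_USD
print(implied_eth_price(ETH_STAKE_VALUE_USD, ETH_STAKE))  # USD per ETH
print(eth_total)                                          # USD, stake plus hardware
print(cost_ratio(eth_total, BTC_FULL_PEER_USD))           # Ethereum vs Bitcoin
```

On these rough numbers the difference is around two orders of magnitude, and it moves with the price of ETH since the stake dominates the total.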
- Trust minimization
Discontinuing PoW mining also incurs important tradeoffs in return for a drastic reduction in energy consumption, which under our current global electricity production stack, also means massively reduced carbon emissions.
Broadly summarised, Ethereum will suffer reductions or elimination of censorship resistance, trust minimisation and decentralisation as a result of implementing PoS. It will also suffer a large increase in its attack surface due to the increased complexity of its code. Attackers will have more potential exploits to look for.
PoS reintroduces the requirement to trust other network participants when joining or re-joining the network. This is because staking is a quantity internal to the blockchain network. That is, you cannot know who has what stake unless you know which blockchain is the correct one. This means that before a user can validate whether the blockchain before them has been correctly executed, they must first trust someone else to tell them what the blockchain is in the first place.
This is a problem if a new user or a returning user is faced with a choice between multiple conflicting blockchains presented to them by a malicious actor. Since a PoS blockchain costs nothing to create, fake histories that are otherwise valid can be created and presented to outsiders en masse by dishonest participants.
Work, on the other hand, is external to the system. This means that if two conflicting blockchains are presented to a new or returning user, they can trivially check for themselves which blockchain is the correct one simply by looking at the amount of accumulated work (the one with the most accumulated work is by definition the correct one). In a PoS system, the only way to get around this is to introduce checkpoints, which again require trusting other participants to tell you what the correct blockchain was at various times in the past. PoS, therefore, creates a need to trust other network participants through multiple new avenues, which it must trade off against its benefits.
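The check described above for a PoW system can be sketched in a few lines. Here each chain is represented simply as a list of per-block difficulty values, used as a stand-in for the work expended on each block. This is a simplification of ours: a real node would also validate every block and transaction before comparing chains.

```python
from typing import Sequence


def accumulated_work(chain: Sequence[int]) -> int:
    """Total work in a chain, approximated as the sum of block difficulties."""
    return sum(chain)


def select_chain(candidates: Sequence[Sequence[int]]) -> Sequence[int]:
    """Pick the candidate with the most accumulated work.

    No outside information is needed: the comparison depends only on
    data contained in the candidate chains themselves.
    """
    return max(candidates, key=accumulated_work)


honest_chain = [100, 100, 100]  # hypothetical: few blocks, high difficulty
fake_chain = [10] * 20          # hypothetical: many blocks, cheap to produce
best = select_chain([fake_chain, honest_chain])
print(best is honest_chain)
```

Note that the longer chain loses here because it embodies less work. A PoS node has no equivalent external quantity to sum, which is why it falls back on checkpoints supplied by others.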
- Censorship resistance and centralization
PoS also trades off its ability to resist censorship. Censorship resistance, in this context and as defined in Cryptoeconomics by Eric Voskuil, means the ability of the network to resist the actions of a network participant trying to prevent some or all transactions from being entered into the transaction record. The only effective way to censor in this manner is to control a majority (more than 50%) of block production: miners in a PoW system, stakers in a PoS system.
An entity controlling a majority of block production can simply refuse to enter some or all transactions into the blockchain, effectively censoring any or all parties.
In a PoW system, miners need to consume a resource external to the system and also require external capital (hardware). Competing miners can procure both without the majority miner knowing anything about it, meaning that there exists a mechanism by which a censor can lose its place as the majority miner.
In a PoS system, no such recourse exists within the protocol rules. As soon as an entity achieves a majority stake in the system, they will perpetually increase their proportion of the total stake, and nothing can force them to sell any of their stake, meaning that their position is impossible to dislodge.
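A toy model illustrates the compounding effect. It rests on strong simplifying assumptions of ours: the censoring majority collects all newly issued rewards, nobody else adds or withdraws stake, and the reward per period is a fixed fraction of the initial total stake. None of these parameters come from the Ethereum specification.

```python
def majority_share(initial_share: float, reward_rate: float, periods: int) -> float:
    """Share of total stake held by a censoring majority after some periods.

    Total stake is normalised to 1.0 at the start. Each period, new stake
    equal to reward_rate is issued and (by assumption) goes entirely to
    the majority, since it decides which blocks are accepted.
    """
    stake = initial_share
    total = 1.0
    for _ in range(periods):
        stake += reward_rate
        total += reward_rate
    return stake / total


for n in (0, 10, 50):  # hypothetical numbers of reward periods
    print(n, round(majority_share(0.51, 0.05, n), 4))
```

In this model the majority's share only ever rises towards 100%, and nothing inside the system pushes it back below half.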
The only way to recover from a situation like this is by recourse to a social consensus hard fork, which is just another way of saying centralised management by a select committee—which is by definition the opposite of decentralised.
PoS systems are also extremely complex compared to PoW systems, which vastly increases the attack surface of their networks. While the increased risks incurred by complexity are impossible to enumerate, any amount of increased complexity will cause an increase in risk, making it an important consideration to take into account.
In short, by implementing PoS and sharding, Ethereum trades off decentralization, trust minimization, and censorship resistance. On top of that, it will suffer from a much larger attack surface. In return, it gets vastly increased on-chain data throughput (scalability), and drastically reduced energy consumption.
Recent events suggest work towards the Merge is progressing
A major development step on the path towards the Merge was the launch of the Beacon Chain in December 2020. The Beacon Chain is an isolated PoS blockchain which, in the future, will act as the coordinator chain between all the Ethereum shards. It will act as Ethereum’s ‘consensus layer’, which is designed first to allow access to transaction data and, in a later version, to execute the transactions taking place on 64 planned separate shards.
There have been tests and practice merges, most recently on 23 April 2022, conducted to ensure the mainnet transition proceeds safely. The Kiln testnet merged on 15 March 2022 and incorporated the last major specification changes.
Although many of these tests have been successful, others have raised concerns and probably contributed to the latest delay of the Merge from H1 2022 to H2 2022.
A list of concrete tasks to be completed before the Merge can take place is publicly available. While the list still includes a large number of unfinished tasks, many have also been completed.
Ethereum’s future post-Merge is determined by users and investors
Proof-of-Stake has been estimated to decrease energy consumption by over 99.5%. The resource required for blocks to be created, validated, ordered, and added to the chain will shift from one external to the network (energy) to one internal to it (capital). Sharding, for its part, increases throughput at the cost of decreased decentralization.
Whether Ethereum’s trade-offs are worth it is another question entirely. It is a question that rests heavily on subjective perceptions of the real-world risks likely to be faced by Ethereum in the future, and at the end of the day, is a question that can only be answered individually by Ethereum users and investors.
Whenever it happens (probably this year), the Merge will represent a major shift in Ethereum’s design, capabilities, and attributes. The Ethereum 2.0 moniker is therefore quite fitting: after the Merge, Ethereum will be a completely different system.
This article was first published by CoinShares on April 26, 2022.