Why is scaling up needed?
The demand for blockchain space has been increasing substantially over the past years, in particular with the rise of DeFi apps and the sudden explosion of the NFT market in 2021.
As a result, gas fees, the cost paid by users for Ethereum transactions, have increased dramatically as well. A basic transfer of Ether cost about $5 in November 2021, while a DeFi transaction (on Uniswap, for example) could cost more than $100. These high costs can be very discouraging for regular users, and they explain why many blockchains with higher throughput and cheaper fees have been in the spotlight in recent months, for example Solana, Cardano, or Binance Smart Chain.
To accommodate increasing demand, scaling is necessary. Scaling simply means offering more transactions per second (TPS). There are two ways for blockchains to do that.
The base layer solutions
The first one is to increase the block size, decrease the time between blocks, or both. That’s the solution adopted by high-TPS blockchains such as Binance Smart Chain and Solana. Think of transaction data as cars on the road and nodes as checkpoints. There are only so many cars per hour a checkpoint can process. If you want to handle more traffic, you need to build bigger checkpoints.
In blockchain terms, that means you need more powerful hardware for nodes to process transactions and more disk space to store data. When you crank up requirements for nodes, at some point you push away regular individuals and end up with only a few dozen nodes run by corporate entities. In other words, you centralize, which means you lose censorship resistance and security.
Ethereum’s security model is to rely on tens or hundreds of thousands of nodes operated by individuals on their personal computers. Bitcoin’s model is the same. Low requirements for running a node mean more participants: everyone can do it at little cost. This model is very resistant to censorship: it’s much more difficult to coerce ten thousand individuals scattered across the globe than to coerce a dozen private companies.
The decentralized model is also more resistant to technical failures. Recently, the Solana chain experienced an outage of about 17 hours. In contrast, the bug that caused Ethereum’s chain to split temporarily last summer did not take the network offline; it only stalled some transactions.
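To put rough numbers on this trade-off, here is a back-of-the-envelope sketch. It does not model any real chain: the function names and every figure (transactions per block, block time, transaction size) are illustrative assumptions.

```python
def transactions_per_second(txs_per_block: int, block_time_s: float) -> float:
    """Throughput of a chain that produces one block every block_time_s seconds."""
    return txs_per_block / block_time_s


def storage_gb_per_year(txs_per_block: int, block_time_s: float, tx_size_bytes: int) -> float:
    """Disk space a node needs per year just to keep the raw transaction data."""
    seconds_per_year = 365 * 24 * 3600
    blocks_per_year = seconds_per_year / block_time_s
    return blocks_per_year * txs_per_block * tx_size_bytes / 1e9


# Hypothetical baseline: 200 transactions per block, one block every 13 seconds.
baseline_tps = transactions_per_second(200, 13)
baseline_storage = storage_gb_per_year(200, 13, tx_size_bytes=150)

# Hypothetical "bigger checkpoint": blocks 10x larger, produced 10x more often.
scaled_tps = transactions_per_second(2000, 1.3)
scaled_storage = storage_gb_per_year(2000, 1.3, tx_size_bytes=150)

print(f"throughput gain: x{scaled_tps / baseline_tps:.0f}")
print(f"storage growth:  x{scaled_storage / baseline_storage:.0f}")
```

Whatever numbers you plug in, throughput and storage grow by the same factor, which is why the cost of scaling the base layer lands directly on node operators.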
The layer 2 solutions
To avoid compromising either security or decentralization, another scaling path can be followed: 2nd layer solutions. A second layer is simply a smaller network that runs on top of the base network of the blockchain and alleviates some of its work.
Ethereum’s scaling path is to develop a type of 2nd layer (L2) solution called rollups. Rollups take the computation part of transactions away from the base layer (or L1) but rely on L1 for security. As a result, rollups can execute more transactions per second while still retaining the consensus and security model of Ethereum.
Rollups = the execution layer
So how do rollups work? Rollups are small networks of powerful nodes that execute transactions very fast and only use the base layer to post a compressed version of the transaction data. Going back to our road analogy, imagine rollups as high-speed subways that take passengers from one point of the city to another. Faster transit times, fewer individual cars, and hence less crowded roads. Where the analogy breaks is that rollups are much better than subways: they’re like subways that you could board anywhere and that go directly where you want to go. Magic subways!
Rollups can work with very few nodes on the L2 network, sometimes even only one. Why? Because they rely on the L1 for consensus and security. In other words, the base layer, the one with hundreds of thousands of nodes, is still the one deciding whether the rollup was honest or not.
This scaling path specializes each layer. The base layer becomes the consensus and security layer. The second layer becomes the execution layer. Thus the base layer can stay decentralized and the second layer can provide faster transaction execution, hence more transactions per second. And because Ethereum is a smart contracts chain, execution is the main bottleneck.
To summarize, a rollup takes a batch of transactions, executes them on its own machines, and sends back some data to the base layer to update its state. It then moves on to the next batch, and so on.
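The loop described above can be sketched in a few lines. This is a toy model under heavy assumptions, not how any production rollup is built: `ToyBaseLayer`, `process_batch` and the other names are hypothetical, a SHA-256 hash of the balances stands in for a real state root, and zlib stands in for a rollup's actual compression scheme.

```python
import hashlib
import json
import zlib


def state_root(balances: dict) -> str:
    """Stand-in for a state root: a hash of the sorted account balances."""
    encoded = json.dumps(balances, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def execute_batch(balances: dict, batch: list) -> dict:
    """Apply simple transfers off-chain; skip transfers the sender cannot afford."""
    new_balances = dict(balances)
    for tx in batch:
        sender, recipient, amount = tx["from"], tx["to"], tx["amount"]
        if new_balances.get(sender, 0) >= amount:
            new_balances[sender] -= amount
            new_balances[recipient] = new_balances.get(recipient, 0) + amount
    return new_balances


class ToyBaseLayer:
    """Mock L1: it only stores what the rollup posts and executes nothing."""

    def __init__(self):
        self.posts = []

    def post(self, compressed_batch: bytes, new_root: str) -> None:
        self.posts.append({"data": compressed_batch, "root": new_root})


def process_batch(l1: ToyBaseLayer, balances: dict, batch: list) -> dict:
    """One rollup step: execute, compress, post to L1, return the new state."""
    new_balances = execute_batch(balances, batch)
    compressed = zlib.compress(json.dumps(batch).encode())
    l1.post(compressed, state_root(new_balances))
    return new_balances


l1 = ToyBaseLayer()
state = {"alice": 100, "bob": 20}
state = process_batch(l1, state, [{"from": "alice", "to": "bob", "amount": 30}])
state = process_batch(l1, state, [{"from": "bob", "to": "carol", "amount": 50}])
print(state, len(l1.posts))
```

Note that the mock base layer never runs a transaction itself. It only records the compressed batch and the resulting state root, which is the division of labor the text describes.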
Optimistic vs ZK rollups
There are two types of rollups, depending on what data they publish on the base layer.
Optimistic rollups compress each transaction’s data (by a factor of 10 or so) and publish this data to the base layer along with the result of executing the transaction, for example the change in account balances resulting from the execution of a smart contract.
Optimistic rollups are the “trust me!” type. They don’t try to prove that they’re honest. However, if anyone disagrees with their published result, they can challenge the rollup by staking some ether and publishing a fraud proof. A set of smart contracts running on the base layer will decide who is right, and the party proven wrong will have their stake slashed.
Because of this mechanism, optimistic rollups allow for a predetermined “challenge” period of time after each publication, usually about a week. Practically, this means that users who have moved their funds from the base layer to an optimistic rollup may have to wait for this period to elapse when they want to withdraw their funds.
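Here is a minimal sketch of the challenge mechanism and the withdrawal delay it implies. It is a simplification: the arbiter below re-executes the whole batch, whereas real fraud-proof systems are more elaborate, and all names (`Claim`, `slashed_party`, `can_withdraw`) are hypothetical. The 7-day constant is an assumption taken from the "about a week" figure above.

```python
from dataclasses import dataclass

CHALLENGE_PERIOD_DAYS = 7  # assumption: "about a week", as in the text


def execute(balances: dict, batch: list) -> dict:
    """Reference execution that the arbiter re-runs when a claim is disputed."""
    result = dict(balances)
    for sender, recipient, amount in batch:
        if result.get(sender, 0) >= amount:
            result[sender] -= amount
            result[recipient] = result.get(recipient, 0) + amount
    return result


@dataclass
class Claim:
    """What the rollup operator published: a batch and the result it asserts."""
    prior_state: dict
    batch: list
    claimed_state: dict
    published_at_day: int


def slashed_party(claim: Claim) -> str:
    """Settle a challenge: re-execute the batch and name the party that was wrong."""
    if execute(claim.prior_state, claim.batch) != claim.claimed_state:
        return "operator"
    return "challenger"


def can_withdraw(claim: Claim, today: int) -> bool:
    """Withdrawals that depend on this claim wait until the challenge window closes."""
    return today - claim.published_at_day >= CHALLENGE_PERIOD_DAYS


prior = {"alice": 100, "bob": 0}
batch = [("alice", "bob", 40)]
honest = Claim(prior, batch, {"alice": 60, "bob": 40}, published_at_day=0)
fraudulent = Claim(prior, batch, {"alice": 100, "bob": 40}, published_at_day=0)

print(slashed_party(honest), slashed_party(fraudulent))
print(can_withdraw(honest, today=3), can_withdraw(honest, today=7))
```

Challenging an honest claim costs the challenger their stake, while a fraudulent claim costs the operator theirs. That incentive is what lets everyone else skip re-checking every batch.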
This is obviously a pain point, but it is being addressed by third parties that can guarantee the result of an optimistic rollup immediately for some fee, and hence allow immediate withdrawals. Things are still evolving quickly so I expect this issue will be addressed in the near future.
The main attraction of optimistic rollups is their simplicity for developers. They can be made fully compatible with the EVM, Ethereum’s Virtual Machine, and hence run existing smart contracts with little to no modification. That has allowed a project like Arbitrum, for example, to launch in 2021 with a whole ecosystem of third-party applications, from Aave to Uniswap and many more, right from the start.
Optimism, the other big optimistic rollup available this year, also supports a bunch of decentralized apps and wallets.
The second type of rollups is ZK rollups. ZK stands for Zero Knowledge. It’s an advanced branch of cryptography that aims to prove that some things are correct without revealing their content (hence the name). ZK rollups don’t publish the data of each individual transaction to the base layer. Instead, they publish a cryptographic proof, called a validity proof, that the new state of the rollup is indeed the result of the batch of transactions processed.
Producing a validity proof requires complex computation, but the result is very small and easy enough to be “checked” by regular nodes of the base layer. This means that, unlike with optimistic rollups, there is no “wait time” to contest a batch of transactions. If the validity proof is correct, then it is accepted on the base layer and the rollup moves forward. In particular, users can withdraw their funds from the rollup to the base layer without any delay.
The flip side is that, because of their mathematical complexity, ZK rollups tend to be more application-specific than optimistic rollups. Deploying a specific DeFi app on a ZK rollup involves more work than deploying on an optimistic one. ZK rollups do not offer, at least at this point in time, a general solution for all smart contracts.
What does it mean for users?
So how does this all impact end-users of Ethereum? Rollups provide a faster, cheaper experience for users. Cheaper because the fees from layer 1 are spread across many transactions, and faster because a rollup can guarantee the result of a transaction even before publishing the current batch to the base layer.
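The "cheaper" part is simple arithmetic, sketched below. The function name and the dollar figures are assumptions picked for illustration, not measured fees: one fixed layer 1 cost per batch, plus a small data cost for each transaction in it.

```python
def fee_per_transaction(batch_overhead_usd: float, data_cost_per_tx_usd: float, batch_size: int) -> float:
    """Each user pays an equal share of the batch's L1 overhead plus their own data cost."""
    return batch_overhead_usd / batch_size + data_cost_per_tx_usd


# Hypothetical figures: $100 of fixed L1 cost per batch, $0.10 of data per transaction.
for batch_size in (1, 10, 100, 1000):
    fee = fee_per_transaction(100.0, 0.10, batch_size)
    print(f"{batch_size:>5} transactions per batch -> ${fee:.2f} each")
```

The fixed overhead shrinks with every transaction added to the batch, so the per-user fee approaches the cost of that user's own data on layer 1.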
This is why applications using either kind of rollup have seen their activity increase dramatically in the past months. The Total Value Locked (TVL) in Arbitrum, for example, has gone from $60m at the beginning of September 2021 to $2.6B two months later! Similarly, dYdX, a decentralized crypto exchange powered by the ZK solution StarkWare, has seen its TVL go from $300m to $1B in the same period.
The future: compounding shards with rollups
Looking to the future, another component of Ethereum’s scalability path is sharding. Sharding is the process of splitting the verification task across several subsets of nodes, each subset verifying only a part of the transactions. This process would allow many more transactions to be verified in the same amount of time without increasing the requirements for individual nodes, but it is complex to implement correctly without compromising security. Basically, it’s parallel processing for blockchains.
Ethereum is supposed to implement a 64-shard solution some time around 2023, after the move to proof-of-stake (early 2022?). This would multiply the data available on the base layer by a similar factor, hence providing even more space for rollups to use and thus scaling Ethereum even further!
In other words, the effects of sharding and rollups are designed to compound, propelling Ethereum into very high TPS territory: in theory somewhere between 10,000 and 100,000 transactions per second (up from the current 15 TPS without rollups). Of course, this relies on both the merge (the move to proof-of-stake) and sharding happening without any major issues. It’s only a path at this point, but a very hopeful one.
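To see how the two effects compound, here is a rough calculation. The 15 TPS baseline and the 64 shards come from the text above; the rollup gain factors of 10 and 100 are my own illustrative assumptions, and the function name is hypothetical.

```python
def combined_tps(base_tps: float, rollup_gain: float, shard_count: int) -> float:
    """Rollups multiply throughput per unit of L1 data; shards multiply the data available."""
    return base_tps * rollup_gain * shard_count


BASE_TPS = 15  # from the text: current throughput without rollups
SHARDS = 64    # from the text: planned number of shards

low = combined_tps(BASE_TPS, rollup_gain=10, shard_count=SHARDS)    # assumed 10x gain from rollups
high = combined_tps(BASE_TPS, rollup_gain=100, shard_count=SHARDS)  # assumed 100x gain from rollups

print(f"between {low:,.0f} and {high:,.0f} TPS")
```

With those assumed gains, the result lands at roughly the order of magnitude quoted above. The point is that the two factors multiply rather than add.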