Path to (proto-)danksharding - Episode I: Ethereum scalability limitations
You may have heard some technical terms such as sharding, danksharding, proto-danksharding, EIP-4844, blobs, merged-market fees, PBS (Proposer Builder Separation), or KZG commitments.
It can be difficult to understand how all these concepts relate to each other.
This post is the first episode in a series of four.
This series aims to explain from scratch the concepts necessary to implement (proto-)danksharding in the Ethereum protocol. We will also discuss why Ethereum needs it and how the Ethereum community aims to develop this solution.
- Episode I: Ethereum scalability limitations (you are here!)
- Episode II: Optimistic rollups
- Episode III: ZK rollups
- Episode IV: (Proto-)Danksharding
This first post explains the basics of Ethereum: how transactions are added to the blockchain, what gas is, how gas limits work, how gas prices are calculated, and why we need all these concepts.
As you may know, the Ethereum Network is composed of thousands of nodes. Each node is basically a computer (with a given amount of CPU and memory) and a hard drive (with a given amount of storage), containing a database: the blockchain.
An Ethereum block is a set of data that contains a group of transactions. Transactions can be either standard Ether transfers or smart contract executions.
A standard Ether transfer is a simple transfer of Ether from one Ethereum account to another, for example, when Myriam wants to send 1 ETH to Camille.
A smart contract is a program containing arbitrarily complex operations (like computing the square root of an input and returning the result) that can be executed by any Ethereum node on the Ethereum Virtual Machine (EVM). Unlike a traditional program, which runs on a single machine, a smart contract call is executed by every node of the Ethereum network, so all nodes must run the EVM.
In addition to the transactions, the block includes extra data, such as a commitment to the "world state" (the state root), which is useful for the operation of Ethereum as a whole. The figure below represents an Ethereum block. The transactions are shown as colored rectangles on the left, and the extra data for the operation of Ethereum is colored gray on the right.
Blocks are chained together to form a blockchain (a chain of blocks).
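To illustrate what "chained together" means, here is a minimal sketch in which each block stores the hash of its parent. The block layout and the helper names (`block_hash`, `make_block`, `chain_is_valid`) are hypothetical, and SHA-256 is used only as a stand-in available in the Python standard library (Ethereum itself uses Keccak-256 and a much richer block format).

```python
import hashlib
import json
from typing import Optional


def block_hash(block: dict) -> str:
    """Hash the block contents (SHA-256 is a stand-in for Keccak-256)."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def make_block(parent: Optional[dict], transactions: list) -> dict:
    """Create a block that points to its parent through the parent's hash."""
    return {
        "number": 0 if parent is None else parent["number"] + 1,
        "parent_hash": None if parent is None else block_hash(parent),
        "transactions": transactions,
    }


def chain_is_valid(chain: list) -> bool:
    """Check that every block references the hash of the block before it."""
    return all(
        child["parent_hash"] == block_hash(parent)
        for parent, child in zip(chain, chain[1:])
    )


genesis = make_block(None, [])
block_1 = make_block(genesis, ["Myriam sends 1 ETH to Camille"])
block_2 = make_block(block_1, [])
print(chain_is_valid([genesis, block_1, block_2]))
```

Because each block commits to the hash of its parent, modifying a transaction in an old block changes that block's hash and breaks the link with every block after it.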
Finally, the Ethereum Network we saw above can be represented as follows:
Each Ethereum node keeps a set of transactions that have been broadcast by users but have not yet been included in any block. This set of pending transactions is known as the node's mempool.
The journey of an Ethereum transaction
When a user wants to add a new transaction on the Ethereum blockchain, they first need to create and sign the transaction.
Then the user submits their transaction to the node of the Ethereum network they are connected to. The node adds this transaction to its own mempool.
The node broadcasts the pending transaction to other nodes of the network, ensuring that all Ethereum nodes eventually become aware of it.
Eventually, one node in the network is chosen to propose a new block. This node selects some pending transactions from its mempool and includes them in the newly created block (the bottom node in the following figure).
The block proposer then “gossips” (sends) the newly created block to the other nodes of the network.
This view of the Ethereum network is a bit oversimplified. In reality, the Ethereum network contains two P2P networks: one for the execution layer and one for the consensus layer (the beacon chain). However, we will stick with this view of the Ethereum network in this blog post for simplicity’s sake.
Do you want some gas?
Every 12 seconds, a new block can be produced, and each node in the Ethereum network must execute all the transactions it contains, including smart contract executions.
Let's assume that a block could contain an unlimited number of transactions and that there are 5 computation-heavy smart contract executions pending. The block proposer will include these executions in the next block, and every node in the Ethereum network will have to process them.
In this case, 12 seconds might not be enough to execute all the smart contracts. Moreover, the 12-second block time is not solely dedicated to executing transactions: it also covers networking tasks such as gossiping the block to the other nodes in the network.
⇒ We need a way to limit the amount of computational power required by all transactions in one block. This will ensure that every node on the Ethereum network has enough time to execute them within the allotted time.
Unit of gas
To solve this, the Ethereum community came up with the concept of the unit of gas.
Each operation (also called an opcode) in a smart contract costs a given number of gas units.
For instance, adding two numbers, a + b, requires 3 units of gas; multiplying two numbers, a * b, requires 5 units of gas; and comparing two numbers, a < b, requires 3 units of gas.
The complete table can be found here, and an extract of it is below.
The more CPU-intensive an operation is, the more gas units it requires. Similarly, storing data on the blockchain is expensive in terms of gas usage.
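As a minimal sketch of this accounting, the snippet below sums the gas of a short sequence of operations using only the three costs quoted above. The `GAS_COST` table and the `gas_used` helper are hypothetical names for illustration, and the sketch ignores everything else a real EVM program needs (for instance, the opcodes that push operands onto the stack).

```python
# Gas costs quoted in the text (ADD, MUL and LT are the EVM opcode names
# for addition, multiplication and "less than").
GAS_COST = {"ADD": 3, "MUL": 5, "LT": 3}


def gas_used(opcodes):
    """Total gas consumed by a sequence of opcodes."""
    return sum(GAS_COST[op] for op in opcodes)


# Evaluating (a * b) + c < d takes one MUL, one ADD and one LT.
program = ["MUL", "ADD", "LT"]
print(gas_used(program))  # 5 + 3 + 3 = 11
```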
Since EIP-1559, the maximum amount of gas that can be included in one block is 30,000,000 units. This limit has been set to ensure that every node on the Ethereum network can handle the workload within the allotted time, without requiring the node to be an expensive, latest-generation server.
We’ll dedicate a full blog post to EIP-1559 later.
Thanks to this 30,000,000 gas unit limit, we can now limit the amount of required computation in each block.
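The effect of this limit on block building can be sketched as follows. This is a simplified illustration, not how real proposers work: `select_transactions` is a hypothetical helper that takes pending transactions in arrival order, whereas real block builders typically also take fees into account, and the transaction names and gas amounts are made up.

```python
BLOCK_GAS_LIMIT = 30_000_000


def select_transactions(mempool, gas_limit=BLOCK_GAS_LIMIT):
    """Greedily include pending transactions while they fit under the limit."""
    selected, gas_total = [], 0
    for tx_id, tx_gas in mempool:
        if gas_total + tx_gas <= gas_limit:
            selected.append(tx_id)
            gas_total += tx_gas
    return selected, gas_total


# Hypothetical mempool: (transaction id, gas required).
mempool = [
    ("transfer-1", 21_000),
    ("heavy-call", 29_900_000),
    ("contract-call", 120_000),  # does not fit after heavy-call
    ("transfer-2", 21_000),
]

selected, gas_total = select_transactions(mempool)
print(selected)   # ['transfer-1', 'heavy-call', 'transfer-2']
print(gas_total)  # 29942000
```

Whatever selection strategy is used, the total gas of the block never exceeds the limit, which bounds the work every node has to do.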
However, the big issue is that there is no incentive for an Ethereum user not to spam the network with a huge number of transactions. As a result, the user needs to pay (even a small amount) for each transaction they want to send to the network to prevent spamming.
Each gas unit has a price in Ether, usually expressed in Gwei (1 ETH = 1,000,000,000 Gwei). Gas prices vary according to the rules defined in EIP-1559. As a rule of thumb, the gas price increases as the Ethereum network’s usage increases.
You can find the evolution of gas prices on Ultrasound Money, with an extract below.
Let's produce a quick numerical example:
A user wants to execute a smart contract function that involves many operations, including additions, multiplications, storage, and exponentiations. The total gas cost of all these operations amounts to 120,000 gas units.
Currently, the gas unit cost is 39 Gwei. Therefore, the cost of executing this particular smart contract is 120,000 * 39 = 4,680,000 Gwei = 0.00468 ETH.
⇒ The user will need to spend 0.00468 ETH to execute this transaction.
Assuming an ETH price of $1500, the gas cost to the user will be approximately $7.
This demonstrates the importance of writing gas-efficient smart contracts!
For reference, filling a block of 30,000,000 units of gas at a gas price of 39 Gwei would cost 1.17 ETH ($1,755 at an ETH price of $1500).
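The arithmetic of this example can be reproduced with a few lines of code. The function names below are hypothetical, and the 39 Gwei gas price and $1500 ETH price are simply the values assumed in the text.

```python
GWEI_PER_ETH = 1_000_000_000


def fee_in_eth(gas_units, gas_price_gwei):
    """Fee in ETH for a given amount of gas at a given gas price."""
    return gas_units * gas_price_gwei / GWEI_PER_ETH


def fee_in_usd(gas_units, gas_price_gwei, eth_price_usd):
    """Same fee converted to US dollars."""
    return fee_in_eth(gas_units, gas_price_gwei) * eth_price_usd


# The 120,000-gas smart contract call from the example.
print(f"{fee_in_eth(120_000, 39):.5f} ETH")        # 0.00468 ETH
print(f"${fee_in_usd(120_000, 39, 1500):.2f}")     # $7.02

# A full block of 30,000,000 gas units.
print(f"{fee_in_eth(30_000_000, 39):.2f} ETH")     # 1.17 ETH
print(f"${fee_in_usd(30_000_000, 39, 1500):.2f}")  # $1755.00
```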
We now have a system that limits the computation and storage allowed in each block, and in which running a transaction has a cost for the user. However, a block can contain at most 30,000,000 units of gas, which may not be sufficient under certain circumstances.
The scalability issue
Ethereum has a scalability issue. Currently, the Ethereum network is able to handle a maximum of 119 TPS (Transactions Per Second), which is clearly insufficient for global adoption.
Where does this 119 TPS figure come from?
- A standard ETH transfer requires a gas limit of 21,000 units of gas.
- Since EIP-1559, each block can contain a maximum of 30,000,000 units of gas.
- Since The Merge, a new block can be produced every 12 seconds.
We can deduce the following formula, where we compute the theoretical maximum number of TPS.
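Combining the three figures above, the theoretical maximum is (block gas limit / gas per transfer) / block time = (30,000,000 / 21,000) / 12 ≈ 119 TPS. The same computation as a short sketch, where `max_tps` is a hypothetical helper name:

```python
def max_tps(block_gas_limit, gas_per_tx, block_time_s):
    """Theoretical maximum transactions per second."""
    txs_per_block = block_gas_limit // gas_per_tx  # whole transactions only
    return txs_per_block / block_time_s


# 30,000,000 gas per block, 21,000 gas per ETH transfer, 12-second blocks.
print(max_tps(30_000_000, 21_000, 12))  # 119.0
```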
However, in reality, the number of TPS the Ethereum network is able to handle is far lower, for at least two reasons.
- First, this computation assumes that all blocks contain only standard ETH transfers, and no more complex (and thus more gas-intensive) operations such as smart contract executions.
- Second, there is the way gas prices work. Since EIP-1559, each block has a target of 15,000,000 gas units. If a block contains more than this amount of gas, the gas price increases for the next block. In the extreme case where every block contains 30,000,000 units of gas, the gas price of each block is 12.5% higher than that of the previous block. This implies that the price of transactions increases exponentially over time, which is not sustainable.
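The 12.5% rule can be sketched with the update formula from the EIP-1559 specification, which calls this part of the gas price the "base fee". The function name `next_base_fee` is hypothetical, fees are expressed in wei (1 Gwei = 1,000,000,000 wei), and the 39 Gwei starting point is just the value used earlier in this post.

```python
WEI_PER_GWEI = 1_000_000_000
GAS_TARGET = 15_000_000
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8  # 1/8 = 12.5% maximum change per block


def next_base_fee(parent_base_fee, parent_gas_used, gas_target=GAS_TARGET):
    """Base fee of the next block, given the parent block's fee and gas usage."""
    if parent_gas_used == gas_target:
        return parent_base_fee
    gas_delta = abs(parent_gas_used - gas_target)
    change = (
        parent_base_fee * gas_delta // gas_target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    )
    if parent_gas_used > gas_target:
        return parent_base_fee + max(change, 1)  # fee rises when above target
    return parent_base_fee - change  # fee falls when below target


fee = 39 * WEI_PER_GWEI
for _ in range(10):  # ten consecutive full blocks, i.e. two minutes
    fee = next_base_fee(fee, 30_000_000)
print(fee / WEI_PER_GWEI)  # roughly 126.6 Gwei, about 3.2x the starting fee
```

Under these assumptions, two minutes of completely full blocks more than triples the fee, which is why sustained usage at the 30,000,000 limit is not realistic.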
This reasoning leads us to the following question: Why would we not simply increase the maximum gas limit per block, or why would we not simply decrease the block time?
After all, doing so would be a very simple path to increase the number of TPS. Unfortunately, the reality is not so simple.
The blockchain trilemma refers to the three main challenges that a blockchain network faces: scalability, security, and decentralization. These three goals are difficult to achieve simultaneously: any blockchain must make trade-offs between them to function effectively.
- Decentralization: A blockchain must be decentralized, meaning that it is not controlled by a single entity or group. Decentralization is important because it ensures that the network is not subject to the whims of a single entity and that it remains transparent and accountable.
- Security: A blockchain must be secure, meaning that it must be resistant to attacks from hackers and other malicious actors. To achieve this, a blockchain must use advanced encryption techniques, consensus algorithms, and other security measures.
- Scalability: A blockchain must be able to handle a large number of transactions in a fast and efficient manner. However, increasing the number of transactions is costly: more transactions mean more data to process and store.
The Ethereum community has chosen the first two points, decentralization and security, as its primary values, at the expense of scalability.
If we want to increase the number of TPS the Ethereum network can run, a simple solution would be to either:
- Increase the gas limit per block from 30,000,000 units to a higher value, and/or
- Decrease the block time from 12 seconds to a lower value.
This proposed solution would work, but it would force the Ethereum nodes to execute transactions faster in order to not exceed the allotted block time. This, in turn, would require nodes to run on faster and more powerful hardware, which is more expensive and less affordable to the general public. Fewer people could then run a node, which would increase centralization and go against the first two goals of the trilemma: decentralization and security. Therefore, it is necessary to find another solution to improve scalability without compromising these values.
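To make the trade-off concrete, the rough sketch below computes how much gas a node must be able to execute per second under each option. `required_gas_per_second` is a hypothetical helper, the 60,000,000 limit and 6-second block time are arbitrary illustrative values, and the sketch ignores the part of the block time spent on networking.

```python
def required_gas_per_second(block_gas_limit, block_time_s):
    """Average execution throughput each node must sustain."""
    return block_gas_limit / block_time_s


current = required_gas_per_second(30_000_000, 12)
bigger_blocks = required_gas_per_second(60_000_000, 12)  # doubled gas limit
faster_blocks = required_gas_per_second(30_000_000, 6)   # halved block time

print(current)        # 2500000.0
print(bigger_blocks)  # 5000000.0
print(faster_blocks)  # 5000000.0
```

Either change doubles the throughput, but it also doubles the work every single node has to keep up with.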
Several solutions have been explored by the Ethereum community in the past, including channels, side-chains, validiums, and rollups. Recently, rollups have received the most industry attention.
The next blog post will focus on optimistic rollups.
Thanks to Emmanuel NALEPA (@manunalepa) for writing this article.