How to Implement a High-Performance Decentralized System

AuroraFS
10 min read · Oct 15, 2021

Current Situation

The future world of interconnectivity, from Web 3.0 to the Metaverse, will revolve around one goal: system decentralization. System decentralization derives from blockchain but is not limited to it. Blockchain only realizes decentralized transactions, while complete system decentralization includes decentralized transactions, decentralized storage, decentralized content distribution, and decentralized computing. On top of such a completely decentralized system, by developing standard access and extension interfaces, we can build an open, unlicensed, decentralized network that requires no trusted third party, namely Web 3.0.

“Performance” is currently the most fundamental obstacle to moving system decentralization from concept to practice. Due to the decentralized nature of blockchain, any blockchain project that hopes to succeed must be global rather than local, meaning that project developers or operators cannot constrain the project to a local scope to reduce performance requirements. Therefore, compared with centralized systems, the performance requirements for decentralized systems are higher. In practice, however, the performance of a decentralized system is far lower than that of a centralized one: not slightly lower, but lower by several orders of magnitude.

The problem of “performance” in decentralized systems comes from the constructive concept of “decentralization.” A decentralized system in its fullest sense needs to solve two problems, and these two issues form the basic premise of system decentralization:

  • Trust issue: nodes are untrustworthy, and any node may cheat, but the result needs to be trustworthy. The trustworthiness of the result means that the result is unforgeable, undeniable, and verifiable.
  • Unlicensed issue: “unlicensed” means that nodes may enter or exit at any time, so the network is not fixed and changes dynamically. The absence of licenses also means the system cannot enforce any requirements on the performance of the nodes in the network, nor place any constraints on their online time.

The two premises above are the root causes of all the problems of decentralization, as well as the appeal of system decentralization. Achieving high performance based on the above two premises is the core of the decentralized system that subverts the existing Internet structure.

Imagine using a decentralized system, connecting global computing, storage, and bandwidth devices, forming a huge resource pool that is jointly built and shared by people all over the world and is not controlled by a single organization. This resource pool has unparalleled fault tolerance and censorship-resistance functions. It does not require a trusted third party, so any person or organization can carry out value transfer on this network to achieve a commercial closed loop. This not only shortens the transaction chain but also eliminates the possibility of third-party organizations misbehaving. Such a decentralized network will severely impact and damage existing Internet applications and business logic, while at the same time protecting the rights of users.

Current Solutions and Existing Problems

In a decentralized system, the most basic part is the decentralized transaction, that is, the basic blockchain. The most discussed performance problem in the blockchain industry is blockchain transaction throughput. Due to the existence of smart contracts, blockchain transactions have a broader scope than traditional bank transfers. For example, on a content publishing/purchasing website, content publishing, purchasing, payment, and downloading are all transactions, so the system's transaction performance requirements are higher. Take YouTube as an example: viewers watch more than 1 billion hours of video on the platform every day (YouTube, 2021). At an average of about 12 minutes per video, that is roughly 5 billion views per day, and each view is a transaction record. The resulting demand is a combined 57.87K TPS, while Bitcoin currently handles about 7 TPS and Ethereum about 15 TPS, more than 3,000 times short of the throughput needed to support a YouTube-scale application.
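The throughput demand can be checked with a few lines of arithmetic. The daily view count and the per-chain TPS figures below are the ones quoted in the text above, not independent measurements:

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86,400 seconds in a day
DAILY_VIEWS = 5_000_000_000             # ~5 billion views/day (figure from the text)
ETHEREUM_TPS = 15                       # approximate figure from the text

def required_tps(tx_per_day: int) -> float:
    """Average transactions per second implied by a daily transaction count."""
    return tx_per_day / SECONDS_PER_DAY

youtube_tps = required_tps(DAILY_VIEWS)
print(round(youtube_tps))               # 57870 -> the ~57.87K TPS in the text
print(round(youtube_tps / ETHEREUM_TPS))  # ~3858x Ethereum's throughput
```

Note that this is an average; peak-hour demand would be higher still.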

The industry currently offers a variety of solutions for scaling blockchain TPS. They generally fall into three categories, known in the industry as Layer X extensions:

Layer 0 Extension:

Layer 0 is a block-level extension. Let N be the number of transactions in a block and D the block interval; then TPS = N/D, so TPS can be increased by increasing N or decreasing D. However, the shortcomings of this method are obvious: increasing N means larger blocks, and shortening D means less time to transmit them. Together, the two place high demands on node hardware and on the network:

1. Improved CPU performance requirements: more signatures need to be verified in a shorter time;
2. Improved network performance requirements: larger blocks need to be transmitted in a shorter time.

Once these two requirements rise past a certain level, nodes need server-grade CPUs, gigabit bandwidth, and millisecond-level latency between nodes. The design targets can then only be met by placing the nodes in a single data center, so the system loses its decentralization. Such a blockchain is jokingly called a “computer room chain,” implying that it is a centralized system dressed up as a blockchain.
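The Layer 0 tradeoff can be made concrete with the TPS = N/D relationship. The block parameters below are illustrative (roughly Bitcoin-like), not taken from any specific chain's specification:

```python
def layer0_tps(txs_per_block: int, block_interval_s: float) -> float:
    """Block-level throughput: TPS = N / D."""
    return txs_per_block / block_interval_s

# Roughly Bitcoin-like parameters: ~2,000 transactions per block, 600 s interval.
print(round(layer0_tps(2_000, 600), 1))   # 3.3 TPS

# To approach YouTube-scale demand you would need, e.g., 58,000 transactions
# in a 1-second block -- which is exactly what pushes nodes toward server
# CPUs, gigabit links, and millisecond latencies.
print(layer0_tps(58_000, 1))              # 58000.0 TPS
```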

Layer 1 Extension:

Layer 1 extension is an architecture-level extension. Its core concept is to group nodes in the system so that each group of nodes can process different transaction information. In this way, the transaction performance can be greatly expanded without affecting the degree of decentralization. Furthermore, the transaction performance of the system can expand as the number of nodes in the system increases.

Layer 1 is a very promising blockchain extension solution in the industry, but there are a few urgent problems that need to be addressed:

1. Sharding security issue: after sharding, the system's hashrate is also divided. In a system with N shards, each shard holds 1/N of the total network hashrate. Previously, an attack required 51% of the entire network's hashrate; after sharding, a single shard can be attacked with only 0.51/N of the network. Such an attack is easy to mount and very cheap, so a suitable algorithm is required to solve this security problem.

2. The cross-shard transaction bandwidth issue: when the system has N shards, each shard has N-1 possible external paths, so the entire network has N×(N-1) possible paths. If each cross-shard path requires 1M of bandwidth, 100 shards require 100×99×1M = 9.9G. That is to say, the bandwidth required for cross-shard transactions is proportional to the square of the number of shards, which limits the number of shards to a small range.

The above two problems make it extremely difficult to implement a sharded system. Completing this complex system process requires technological breakthroughs and integration.
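The two constraints above can be quantified with a quick sketch. The 1M-per-path figure follows the example in the text; it is an illustrative assumption, not a protocol constant:

```python
def shard_attack_threshold(n_shards: int) -> float:
    """Fraction of total network hashrate needed to 51%-attack one shard."""
    return 0.51 / n_shards

def cross_shard_paths(n_shards: int) -> int:
    """Each shard has N-1 external paths, so the network has N*(N-1)."""
    return n_shards * (n_shards - 1)

print(shard_attack_threshold(100))           # 0.0051 -> 0.51% of the network
print(cross_shard_paths(100))                # 9900 paths
print(cross_shard_paths(100) * 1e6 / 1e9)    # 9.9 (Gbit) at 1M per path
```

The quadratic growth of `cross_shard_paths` is what caps the practical shard count.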

Layer 2 Extension:

Layer 2 extension is an out-of-system extension: it moves a large number of transactions off the chain (to another chain or another system) and then feeds the execution results back to the original main chain.

Layer 2 extension faces one of two problems: if there is another chain outside of the chain, the other chain also faces TPS performance issues; if there is not a chain outside of the chain but an organization or server, then there is the problem of system centralization.

In addition, Layer 2 faces the problem of trust between the main chain and the extension: how can transaction results produced off-chain be automatically confirmed by the main chain? Proposed solutions fall into two camps: pessimists and optimists. Pessimists believe every node may cheat, so a proof that the data is correct must be attached when result state data is fed back to the main chain; zero-knowledge proofs can be used to keep this proof small. Optimists believe most nodes will not cheat, so the result state data can be used directly, with each off-chain transaction also written to the main chain as data so that cheating can be detected.

Layer 2 extension has little practical significance. First, the plan diverts and circumvents the problem rather than actually solving it. Second, in implementation, the pessimistic approach has not produced a workable proof scheme, and the optimistic approach essentially moves on-chain transaction data into a smart contract, which still does not solve the underlying problem.

Design Plan to Improve Decentralized Performance

The question remains: is the performance of decentralization impossible to improve? Are decentralization and scalability dilemmas that cannot be solved? Actually, not at all. We can broaden the concept of decentralization through the following design plans to achieve a highly efficient decentralized system:

Sharded Calculations Increase TPS

The sharding principle in the narrow sense is the sharding system described above. In short, nodes and tasks in the system are grouped, and different groups of tasks are executed by different node groups, so the system can process them in parallel. As long as the issues of hashrate security and cross-shard transaction bandwidth are solved, tens of thousands of shards can raise the system's TPS to thousands of times that of a single group.

Broadly speaking, some calculations are indispensable but low in value; if they are placed in a shard and executed by every node, the cost far exceeds the benefit. In this case, a division of labor in the wider sense must be considered: select certain nodes to perform the calculation and return the results. If the system can ensure through some mechanism that the results are correct, this kind of plan is acceptable. It is rather similar to the Layer 2 extension, the difference being that it only asks off-chain nodes to perform a specific calculation, not to improve TPS but to reduce calculation costs. This leads us to the second part of the plan:

Off-Chain Computing Reduces Costs

As mentioned above, costs are reduced through off-chain calculation. If a shard has 2,000 nodes and 5 nodes are selected for off-chain calculation, the cost drops to 1/400 of the original. The difficulty of off-chain calculation lies in ensuring on-chain that the results are correct. We can use a two-period challenge plan to achieve this:

Suppose the shard has N nodes, and each calculation matches M of them. After computing, the nodes upload their results to the chain, and the result is given a calculation period and a challenge period. The system will recognize the data when:

1. More than M/2 nodes upload the same result during the calculation period;
2. There are no other results from the same calculation;
3. And no other nodes challenge this result during the challenge period.

The system will re-match the nodes for calculation or challenge when:
1. The same calculation yields two different results;
2. There are fewer than M/2 nodes submitting results during the calculation period.
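The acceptance and re-match rules above can be sketched as a single decision function. The `Verdict` type and `judge` signature are illustrative, not from the AuroraFS codebase; a challenge is modeled here as simply triggering a re-match:

```python
from collections import Counter
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"     # result recognized by the system
    REMATCH = "rematch"   # re-match nodes and recompute

def judge(results: list[str], m: int, challenged: bool) -> Verdict:
    """results: one uploaded result per responding node (at most m entries);
    m: number of matched nodes; challenged: any challenge during the period."""
    counts = Counter(results)
    if len(counts) > 1:                   # same calculation, different results
        return Verdict.REMATCH
    if not counts or max(counts.values()) <= m / 2:  # fewer than M/2 submissions
        return Verdict.REMATCH
    if challenged:                        # a node challenged the result
        return Verdict.REMATCH
    return Verdict.ACCEPT

print(judge(["0xabc"] * 4, m=5, challenged=False))       # Verdict.ACCEPT
print(judge(["0xabc", "0xdef"], m=5, challenged=False))  # Verdict.REMATCH
```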

Nodes that do not submit results will have a certain fee deducted, and nodes that submit incorrect results will be severely punished. In the implementation process of this plan, there is a very important problem that needs to be solved: the calculated data is stored off-chain. Challenge nodes need to obtain off-chain data to be able to perform calculations and challenges, so we need the third part of the design:

On-Chain and Off-Chain Data Sharing through Storage Pools

In current blockchain systems, only on-chain data can be shared between nodes; there is no mechanism for nodes to share off-chain data. This is also why the Layer 2 plan cannot be realized, even setting aside the question of its practical significance.

IPFS and BZZ provide the concept of a peer-to-peer global file system. That is, forming a global file pool through decentralized nodes. After data is written from one node, it can be read from any node. We can use this global file system as a storage pool for sharing data between nodes. The nodes store the original data in the storage pool, while the hash of the original data and the result of the calculation are written to the chain. When other nodes need to verify the data, they can obtain the data from the storage pool according to the hash value to verify whether the result stored on the chain is correct.
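The hash-addressed sharing described above can be sketched in a few lines. The in-memory `pool` dict stands in for the decentralized storage pool, and SHA-256 stands in for whatever content hash the real system uses:

```python
import hashlib

pool: dict[str, bytes] = {}   # stand-in for the decentralized storage pool

def store(data: bytes) -> str:
    """Write raw data to the pool; the returned hash is what goes on chain."""
    digest = hashlib.sha256(data).hexdigest()
    pool[digest] = data
    return digest

def verify(on_chain_hash: str) -> bool:
    """Any node can fetch by hash and re-derive it to check integrity."""
    data = pool.get(on_chain_hash)
    return data is not None and hashlib.sha256(data).hexdigest() == on_chain_hash

h = store(b"original computation input")
print(verify(h))            # True: the pooled data matches the on-chain hash
print(verify("deadbeef"))   # False: nothing in the pool under that hash
```

Because the address is the hash of the content, a challenger needs nothing beyond the on-chain record to locate and validate the off-chain data.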

From Joint Calculations to Matching Calculations

In the implementation above, the existing concept of decentralization has been broadened. Decentralization in the narrow sense means trusting no node's results: every calculation must be verified by oneself. In the broadened concept, if each computing node is selected by a known algorithm and its results can be verified and challenged, then a result that no node challenges within the challenge period can be trusted. Narrow decentralization obtains results through joint calculation, while broadened decentralization obtains trusted results through matched calculation. Obviously, when the nodes matched for a calculation are 1/N of all nodes in the network, the cost of that calculation falls to 1/N of the original, and efficiency rises to N times the original. More importantly, if the number of nodes needed per matched calculation is fixed, the efficiency of the entire system grows as the total number of nodes grows. In other words, it has linear scalability!
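The linear-scalability claim can be illustrated with a toy model. The match size and per-group throughput below are assumptions for illustration, not measured figures:

```python
def system_throughput(total_nodes: int, match_size: int, per_group_tps: float) -> float:
    """With a fixed match size M, roughly N // M groups can compute
    concurrently, so total throughput grows linearly in the node count N."""
    return (total_nodes // match_size) * per_group_tps

# per_group_tps = 100 is an illustrative figure.
print(system_throughput(1_000, 5, 100))   # 20000.0
print(system_throughput(2_000, 5, 100))   # 40000.0 -- doubles with N
```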

In Conclusion

The above sections discuss the methods and strategies for improving the efficiency of decentralized systems and achieving linear scalability of performance. Based on these methods and strategies, we have implemented a high-performance blockchain system (Weber Chain) and a high-performance traffic distribution system (AuroraFS) based on Weber Chain. In subsequent sections, we will first introduce the step-by-step details of implementing AuroraFS and show the effects of implementing it. We will then explain the technical details of implementing Weber Chain and show the effects of implementing it.

Gauss Aurora Lab
Gold Coast, Australia, 2021


AuroraFS

AuroraFS is a blockchain-based, high-performance global peer-to-peer file system with authorised access control.