
What is blockchain technology?

What is database sharding?

Database sharding is a technique that separates a single blockchain into multiple smaller blockchains (or shards). Could sharding address the network latency and bandwidth problems associated with scalability?


Database sharding is seen as one of the most prominent ways to scale blockchains and is currently being developed by projects such as Ethereum, Zilliqa, Polkadot and Nero.

A key argument from computer science is that you can’t scale a distributed system without giving up some security, some decentralisation, or both. This trilemma shows how complicated it is to properly distribute a system among independent peers who can join and leave the network as they please. Communication between nodes can become very slow, because nodes can disconnect at any point in time without consequences, which drives up network latency. Network bandwidth also plays a key role in the overall system’s performance. This can be a significant problem for many cryptocurrencies, as most people in the world cannot afford connections that transfer large amounts of data.

Is there a way to address both the network latency and bandwidth problem?

Enter sharding

Developers have proposed many solutions to address the issue of throughput on the protocol level. These solutions can mostly be separated into those that delegate all the computation to a small set of powerful nodes, and those that have each node in the network do only a subset of the total work. Database sharding, nowadays most commonly known simply as sharding, belongs to the second group: it allows each node to process only a small part of the blockchain’s transactions while still making sure the state of the whole chain is correctly ordered and validated. Sharding creates multiple smaller databases that only store local copies of transactions. Each database validates and stores a small part, making the system lighter as a whole.
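As a minimal sketch of this idea, assume transactions are routed to a shard by hashing the sender’s address. This routing rule, the shard count, and the names `shard_for`, `partition` and `NUM_SHARDS` are illustrative assumptions, not the scheme of any particular project.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # assumed shard count, chosen only for illustration


def shard_for(address: str, num_shards: int = NUM_SHARDS) -> int:
    """Map an account address to a shard by hashing it (illustrative rule)."""
    digest = hashlib.sha256(address.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(transactions, num_shards: int = NUM_SHARDS):
    """Group transactions by the shard that owns the sender's address."""
    shards = defaultdict(list)
    for tx in transactions:
        shards[shard_for(tx["sender"], num_shards)].append(tx)
    return shards


# Hypothetical sample transactions.
transactions = [
    {"sender": "alice", "receiver": "bob", "amount": 5},
    {"sender": "bob", "receiver": "carol", "amount": 2},
    {"sender": "carol", "receiver": "alice", "amount": 1},
    {"sender": "dave", "receiver": "alice", "amount": 7},
]

by_shard = partition(transactions)
for shard_id, txs in sorted(by_shard.items()):
    # A node assigned to shard_id would only process these transactions.
    print(shard_id, [tx["sender"] for tx in txs])
```

Each node then validates and stores only the list for its own shard instead of the full set of transactions.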

How does database sharding work?

We need to start by looking at what the actual role of nodes is within a blockchain. Nodes perform three core tasks:

A. Process transactions

B. Relay validated transactions and completed blocks to other nodes

C. Store the state and the history of the entire network ledger

Each of these three tasks imposes a growing requirement on the nodes operating the network.

I. The necessity to process transactions requires more compute power with the increased number of transactions being processed.

II. The necessity to relay transactions and blocks requires more network bandwidth with the increased number of transactions being relayed.

III. The necessity to store data requires more storage as the state grows. Importantly, unlike the processing power and network, the storage requirement grows even if the transaction rate (number of transactions processed per second) remains constant.
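The storage point can be made concrete with a rough calculation. The sketch below models only the transaction history (not the state), and the transaction rate and transaction size are assumed figures picked for illustration, not measurements from any real network.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Assumed figures, for illustration only.
TX_PER_SECOND = 10   # constant transaction rate
BYTES_PER_TX = 250   # average size of one stored transaction


def history_size_bytes(tx_per_second, bytes_per_tx, seconds):
    """Bytes of transaction history accumulated after `seconds` at a constant rate."""
    return tx_per_second * bytes_per_tx * seconds


for years in (1, 2, 5):
    size = history_size_bytes(TX_PER_SECOND, BYTES_PER_TX, years * SECONDS_PER_YEAR)
    # Compute and bandwidth needs stay flat at a constant rate; storage keeps growing.
    print(f"after {years} year(s): {size / 1e9:.2f} GB")
```

Under these assumptions the history grows by the same amount every year, even though the number of transactions processed and relayed per second never changes.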

Under state sharding, the nodes in each shard build their own blockchain, which contains only the transactions that affect the part of the state assigned to that shard. The validators only need to relay transactions that affect their part of the state. This partition reduces the requirements on compute power, storage, and network bandwidth linearly with the number of shards. However, it introduces new problems such as data availability and cross-shard transactions.
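Both effects can be sketched in a few lines. The example assumes that accounts are assigned to shards by hashing and that load is spread evenly across shards; the function names `shard_of`, `is_cross_shard` and `per_shard_load` are hypothetical.

```python
import hashlib


def shard_of(account: str, num_shards: int) -> int:
    """Assumed rule: an account belongs to the shard given by the hash of its name."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def is_cross_shard(sender: str, receiver: str, num_shards: int) -> bool:
    """A transaction is cross-shard when sender and receiver live on different shards."""
    return shard_of(sender, num_shards) != shard_of(receiver, num_shards)


def per_shard_load(total_tx_per_second: float, num_shards: int) -> float:
    """Load each shard handles, assuming transactions are spread evenly."""
    return total_tx_per_second / num_shards


# With 10 shards, each shard sees a tenth of an assumed 1,000 tx/s network load.
print(per_shard_load(1000, 10))

# Whether this particular transfer needs cross-shard handling depends on the hashes.
print(is_cross_shard("alice", "bob", 10))
```

The linear saving only holds for transactions that stay inside one shard; every cross-shard transaction needs extra coordination, which is where the new problems come from.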

An overview of different sharding versions

There are two major versions of sharding in use.

  1. Partitioned sharding, where shards run independently and do not communicate with each other through a central relay
  2. State sharding, where shards communicate with each other through a central (state) relay
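The difference can be illustrated with a toy model. In the sketch below, the `Shard` and `Relay` classes are hypothetical and heavily simplified: there is no consensus, no signatures and no failure handling, only the routing of a transfer from one shard to another through a central relay. In a partitioned design the `Relay` would simply not exist, and each `Shard` could only move funds between its own accounts.

```python
class Shard:
    """Toy shard holding the balances of the accounts assigned to it."""

    def __init__(self, shard_id, balances):
        self.shard_id = shard_id
        self.balances = dict(balances)

    def debit(self, account, amount):
        # Check before changing anything so a failed debit leaves state untouched.
        if self.balances.get(account, 0) < amount:
            raise ValueError("insufficient funds")
        self.balances[account] -= amount

    def credit(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount


class Relay:
    """Toy central relay that knows every shard and forwards transfers between them."""

    def __init__(self, shards):
        self.shards = {shard.shard_id: shard for shard in shards}

    def transfer(self, src_shard, sender, dst_shard, receiver, amount):
        self.shards[src_shard].debit(sender, amount)      # step 1: source shard debits
        self.shards[dst_shard].credit(receiver, amount)   # step 2: relay delivers the credit


shard_a = Shard("A", {"alice": 10})
shard_b = Shard("B", {"bob": 0})
relay = Relay([shard_a, shard_b])

relay.transfer("A", "alice", "B", "bob", 4)
print(shard_a.balances, shard_b.balances)
```

The relay is what makes cross-shard transfers possible here, but it is also a component that every shard depends on.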

 

Each type of database sharding has its own benefits and drawbacks, described in the table below.

| | Partitioned | State |
| --- | --- | --- |
| Properties | Independent shards, own validators, no coordinator required | Quadratic sharding capabilities, cross-shard communication, coordinator required |
| Benefits | Smaller chains mean faster communication and sync times within nodes; blockchain size per shard decreases substantially | Smaller, linked chains mean faster communication and sync times between shards; chain size decreases; increased interoperability |
| Risks | Each shard has its own validators, so each shard is less secure; routing issues between nodes | Data availability is reduced, as each shard needs to be online to relay information; less security than a one-chain solution |
| Example | BeansTalk | Beacon Chain |
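One common way to read the "quadratic" capability attributed to state sharding is the following back-of-the-envelope argument. It assumes a single node can handle roughly c units of work, that the coordinator chain can therefore keep track of roughly c shards, and that each shard is itself run by nodes of capacity c. The symbol c and these assumptions are introduced here for illustration only.

```latex
\[
\text{number of shards} \approx c, \qquad
\text{capacity per shard} \approx c, \qquad
\text{total capacity} \approx c \times c = c^{2}
\]
```

Under these assumptions, doubling the capacity of individual nodes would roughly quadruple the capacity of the network as a whole.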

Conclusion

Database sharding is a technique that separates a single blockchain into multiple smaller blockchains (or shards). Each shard runs independently and processes its own transactions. Improvements to sharding are being tested, such as cross-sharding, which allows shards to communicate with one another. Some of the key benefits of sharding are the reduction of the overall blockchain size and the possibility of a quadratic improvement in network performance.

What next?

If you want to know more about database sharding and the blockchain, access our definitive guide!

 

 
