Swarm is one of the latest projects to be built on Ethereum and is perhaps the central piece of the entire decentralised ecosystem.
According to its website, Swarm is a censorship-resistant, permissionless, decentralised storage and communication infrastructure.
The main purpose of Swarm is to be a decentralised store for dApp code, user data, blockchain data, and state data.
Swarm sets out to provide various base layer services for Web 3.0. Services include node-to-node messaging, media streaming, decentralised database services, and scalable state-channel infrastructure for decentralised service economies.
An intro to Swarm
Before I dive deeper into Swarm’s technical structure, I should define how data is stored in this decentralised file-storing system.
Swarm’s base-layer infrastructure provides the services mentioned above by having participating nodes contribute resources to one another.
These contributions are accurately accounted for on a peer-to-peer basis, allowing nodes to trade resource for resource while offering monetary compensation to nodes consuming less than they serve.
Swarm uses existing smart contract platforms such as Ethereum to implement its incentive model.
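The peer-to-peer accounting described above can be pictured with a small ledger. This is only a sketch: the class name `PeerLedger`, the abstract "units", and the `payment_threshold` parameter are hypothetical and do not come from Swarm's codebase. It tracks how much a node has served to and consumed from each peer, and reports when the imbalance is large enough that the peer would owe compensation.

```python
from collections import defaultdict


class PeerLedger:
    """Hypothetical per-peer accounting: units served minus units consumed."""

    def __init__(self, payment_threshold: int):
        # Imbalance (in abstract units) tolerated before payment is expected.
        self.payment_threshold = payment_threshold
        # peer id -> balance; positive means the peer has consumed more than it served us.
        self.balances = defaultdict(int)

    def served(self, peer: str, units: int) -> None:
        """Record that we served `units` of resources to `peer`."""
        self.balances[peer] += units

    def consumed(self, peer: str, units: int) -> None:
        """Record that we consumed `units` of resources from `peer`."""
        self.balances[peer] -= units

    def settlement_due(self, peer: str) -> int:
        """Return the units the peer should compensate us for, or 0 if within the threshold."""
        balance = self.balances[peer]
        return balance if balance > self.payment_threshold else 0
```

As long as two nodes trade roughly resource for resource, the balance stays inside the threshold and no payment is needed. In a real deployment the settlement step would be handled by a smart contract, which this sketch leaves out.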
There are three main components that make up the Swarm decentralised storage system:
- Chunks: These are pieces of data of limited size (at most 4 KB) that act as the basic unit of storage and retrieval in Swarm. Each chunk is identified by an address.
- Reference: This is a unique identifier of a file that allows clients to retrieve and access the content.
- Manifest: This is a data structure describing file collections. It specifies paths and corresponding content hashes allowing for URL-based content retrieval.
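To make the chunk concept concrete, here is a minimal sketch of splitting data into pieces of at most 4 KB and storing each piece under an address derived from its content. The function names are hypothetical, a plain dictionary stands in for the storage network, and SHA-256 is used as a stand-in hash; it is not the hash function Swarm itself uses.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4096  # the 4 KB maximum chunk size mentioned above


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE) -> List[bytes]:
    """Cut data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_address(chunk: bytes) -> str:
    """Derive an address from the chunk content (SHA-256 as a stand-in hash)."""
    return hashlib.sha256(chunk).hexdigest()


def store_chunks(data: bytes, store: Dict[str, bytes]) -> List[str]:
    """Store every chunk under its address and return the addresses in order."""
    addresses = []
    for chunk in split_into_chunks(data):
        address = chunk_address(chunk)
        store[address] = chunk
        addresses.append(address)
    return addresses
```

Because the address depends only on the content, two identical chunks share one address and are stored once.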
The image above shows how a request is resolved through Swarm. Files such as page.html or page.css are stored as chunks and addressed by their hashes.
The reference for each file is listed in the Manifest, which tells the requester how to retrieve and render the information.
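A manifest lookup can be sketched as a mapping from paths to content hashes. The helper names below are hypothetical, a dictionary stands in for the network, SHA-256 is a stand-in hash, and each file is treated as a single blob rather than a tree of chunks to keep the example short.

```python
import hashlib
from typing import Dict


def put(store: Dict[str, bytes], content: bytes) -> str:
    """Store content under its hash and return that hash as the reference."""
    reference = hashlib.sha256(content).hexdigest()
    store[reference] = content
    return reference


def build_manifest(store: Dict[str, bytes], files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each path in a file collection to the reference of its content."""
    return {path: put(store, content) for path, content in files.items()}


def resolve(store: Dict[str, bytes], manifest: Dict[str, str], path: str) -> bytes:
    """Look up the path in the manifest, then fetch the content by reference."""
    return store[manifest[path]]
```

A request for `page.html` is therefore a two-step process: find the hash for that path in the manifest, then fetch whatever is stored under that hash.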
Next, let’s take a look at Swarm’s architecture and how data is uploaded and written to different nodes.
Swarm’s stack and architecture
The image below shows how the upload process takes place and how the information is stored in a decentralised manner.
After a Swarm node receives a blob, it splits the blob into smaller, equal-sized chunks and distributes them among different nodes, which automatically sync the data according to each chunk’s timestamp.
The DPA, or distributed pre-image archive, will choose which nodes get to store which chunks.
Finally, each bin (0, 1, …, 31) shows how nodes in the same region of the address space store related chunks.
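One way to picture the bins is as a measure of how many leading bits two addresses share. The sketch below assumes that reading: the function name, the cap of 31 (chosen only to match the bin numbers above), and the byte-string address format are illustrative assumptions, not details taken from the figure.

```python
def proximity_order(a: bytes, b: bytes, max_po: int = 31) -> int:
    """Count the leading bits shared by two addresses, capped at `max_po`."""
    for index, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y
        if diff:
            # Bits shared in earlier bytes plus leading zero bits of the first differing byte.
            shared = index * 8 + (8 - diff.bit_length())
            return min(shared, max_po)
    # No differing byte found: the addresses are as close as the cap allows.
    return max_po
```

Under this assumption, a chunk lands in a higher bin the closer its address is to the node's own address, so each node ends up responsible for chunks in its own neighbourhood of the address space.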
The actual storage layer of Swarm consists of two main components: the LocalStore and the NetStore. The LocalStore is composed of a fast in-memory cache (MemStore) and persistent disk storage (DBStore). The NetStore extends the LocalStore into Swarm’s distributed storage and implements the DPA.
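The two-tier LocalStore can be sketched as a small cache placed in front of a persistent store. Everything here is an illustrative assumption: the method names, the least-recently-used eviction policy, and the plain dictionary standing in for the on-disk database are not taken from Swarm's implementation.

```python
from collections import OrderedDict


class LocalStore:
    """Sketch of a fast in-memory cache backed by a persistent store."""

    def __init__(self, cache_capacity: int = 2, db=None):
        self.cache_capacity = cache_capacity
        self.mem = OrderedDict()              # in-memory cache, oldest entry first
        self.db = {} if db is None else db    # dictionary standing in for disk storage

    def _cache(self, key, value) -> None:
        """Insert into the cache and evict the least recently used entries."""
        self.mem[key] = value
        self.mem.move_to_end(key)
        while len(self.mem) > self.cache_capacity:
            self.mem.popitem(last=False)

    def put(self, key, value) -> None:
        """Write to the persistent store and keep a copy in the cache."""
        self.db[key] = value
        self._cache(key, value)

    def get(self, key):
        """Serve from the cache if possible, otherwise fall back to the persistent store."""
        if key in self.mem:
            self.mem.move_to_end(key)
            return self.mem[key]
        value = self.db[key]  # raises KeyError if the chunk is not stored locally
        self._cache(key, value)
        return value
```

In this picture, a NetStore would wrap the same interface and, on a local miss, ask other nodes for the chunk instead of raising an error.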
The FileStore is the local interface for storage and retrieval of files. When a file is handed to the FileStore for storage, it chunks the document into a Merkle hash tree and hands its root key back to the caller. This key can later be used to retrieve the document in question in part or whole.
On retrieval, the FileStore takes the Swarm hash and uses the NetStore to fetch the root chunk of the document for the user.
From the end user’s perspective, Swarm does not affect navigation or behaviour.
In the background, the difference is that content is hosted on a peer-to-peer storage network instead of individual servers. This peer-to-peer network is self-sustaining due to a built-in incentive system. Incentives are only possible due to the use of a public blockchain that allows trading resources for payment.
Swarm is designed to deeply integrate with the DevP2P multi-protocol network layer of Ethereum as well as with the Ethereum blockchain for domain name resolution (ENS), service payments, and content availability insurance.
Swarm vs IPFS vs Filecoin
To conclude this piece, I would like to underline the key differences between Swarm and other distributed filestores such as IPFS and Filecoin:
- Swarm’s core storage component is an immutable content-addressed chunk store rather than a generic DHT (distributed hash table), which is what IPFS uses.
- Swarm, Filecoin, and IPFS use different network communication layers and peer management protocols.
- Swarm integrates deeply with the Ethereum blockchain, and its incentive system benefits from both smart contracts and the semi-stable peer pool. Filecoin uses proof of retrievability as part of mining. IPFS has no built-in incentive mechanism.