Blockchain technology is conceptually rather complicated. It is, however, built on established technologies such as peer-to-peer networks and distributed ledgers.
At its core, blockchain is a distributed ledger technology for recording transactions between two or more parties. It’s been used primarily to support cryptocurrencies, but that’s changing as other uses, such as data storage, are emerging.
When a blockchain approach is combined with a peer-to-peer (P2P) network, a pool of distributed storage resources is created, providing the nodes for the blockchain storage. The beauty of blockchain is that it's decentralized and shared: no single entity owns or controls it.
What follows is blockchain terminology you’ll need to know to understand blockchain storage and how it works.
Peer-to-peer distributed network technology
Knowing how a P2P network works is key to understanding blockchain terminology. P2P is a decentralized communications model in which all parties have equal ability to initiate communication and function as both a client and a server on the network. Computers on the network serve as file sharing nodes, storing files and acting as a server for those files. All computers on a P2P network can access files stored on other computers on the network.
Paired with blockchain technology, a P2P network can be used to enable a group of organizations to share storage. Each organization functions as a node on the network, offering and consuming storage resources.
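As a rough illustration of the "both client and server" idea, here is a minimal in-memory sketch. The `Peer` class and its method names are hypothetical and stand in for real network code; no actual sockets or P2P protocol are involved.

```python
class Peer:
    """A node that serves its own files and requests files from other peers."""

    def __init__(self, name):
        self.name = name
        self.files = {}      # files this peer stores and serves
        self.neighbors = []  # other peers this peer knows about

    def connect(self, other):
        # Connections are symmetric: either side may initiate requests.
        if other not in self.neighbors:
            self.neighbors.append(other)
            other.neighbors.append(self)

    def serve(self, filename):
        # Server role: hand out a locally stored file, or None if absent.
        return self.files.get(filename)

    def fetch(self, filename):
        # Client role: ask each neighbor in turn until one has the file.
        for peer in self.neighbors:
            data = peer.serve(filename)
            if data is not None:
                return data
        return None


alice, bob = Peer("alice"), Peer("bob")
alice.connect(bob)
alice.files["report.txt"] = b"quarterly numbers"
bob.files["notes.txt"] = b"meeting notes"
```

Each peer here both offers and consumes storage, which is the property the shared-storage model relies on.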
Distributed ledger technology
Blockchain technology uses a distributed ledger to maintain the details of each transaction in what is essentially a decentralized database. Distributed ledger technology (DLT) records details about asset transactions in multiple places at the same time. A distributed ledger doesn’t have a central data store or administrator.
Each DLT node in a blockchain system processes and verifies every transaction, generating a record of each item and creating consensus on the veracity of each item. In a blockchain storage system, the ledger includes details about individual transactions, such as the location and hash of each shard of stored data and the leasing costs. A copy of the ledger is stored on every node in the blockchain network. The ledger is transparent, verifiable, traceable and tamper-resistant.
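To make the "copy on every node" idea concrete, the following sketch models a ledger entry and replicates it to each node. The field names (`shard_hash`, `shard_location`, `leasing_cost`) and the `broadcast` helper are assumptions chosen for illustration, and the verification step is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    """One storage transaction; frozen so a recorded entry cannot be mutated."""
    shard_hash: str
    shard_location: str
    leasing_cost: float


class LedgerNode:
    def __init__(self, name):
        self.name = name
        self.ledger = []  # this node's full copy of the ledger

    def record(self, entry):
        self.ledger.append(entry)


def broadcast(nodes, entry):
    # Every node records the entry, so there is no central data store.
    for node in nodes:
        node.record(entry)


nodes = [LedgerNode(name) for name in ("node-a", "node-b", "node-c")]
broadcast(nodes, LedgerEntry("ab12", "node-b", 0.05))
```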
A block has a specific meaning in blockchain terminology. Transactions are packaged into blocks, and the blocks are chained together and sent out to the network nodes.
More specifically, transactions get added to the distributed ledger chronologically and are stored as a series of blocks. Each block references the preceding block to form an interconnected chain. The first block has a header and data that pertain to the transactions themselves. The block's timestamp is one of the inputs used to create an alphanumeric string, or hash. Each subsequent block in the ledger includes the previous block's hash when creating its own hash.
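The chaining described above can be sketched in a few lines. This example assumes SHA-256 and a JSON-serialized block layout, neither of which is specified here; real systems use their own block formats and hash inputs.

```python
import hashlib
import json


def block_hash(index, timestamp, data, previous_hash):
    # Serialize the block fields deterministically, then hash them.
    payload = json.dumps(
        {"index": index, "timestamp": timestamp,
         "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(transactions, start_time=0):
    chain = []
    previous_hash = "0" * 64  # placeholder for the first block, which has no predecessor
    for index, data in enumerate(transactions):
        timestamp = start_time + index  # stand-in for a real clock reading
        block = {
            "index": index,
            "timestamp": timestamp,
            "data": data,
            "previous_hash": previous_hash,
            "hash": block_hash(index, timestamp, data, previous_hash),
        }
        chain.append(block)
        previous_hash = block["hash"]  # the next block builds on this hash
    return chain


chain = build_chain(["tx-1", "tx-2", "tx-3"])
```

Because each block's hash takes the previous hash as input, the blocks can only be produced in order.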
When a new block is added to the chain, a validation and consensus process is used among all the nodes on the network to verify its authenticity. Essentially, the consensus process takes a vote among the nodes; a majority of nodes in the network must verify that the new block’s hash has been calculated correctly.
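A simple majority vote of this kind might look like the sketch below. It is a toy model: production blockchains use more elaborate consensus mechanisms, and the `honest_node` and `faulty_node` functions are hypothetical stand-ins for real network participants.

```python
import hashlib


def compute_hash(previous_hash, data):
    return hashlib.sha256((previous_hash + data).encode("utf-8")).hexdigest()


def honest_node(block):
    # An honest node recomputes the hash and compares it to the claimed one.
    return compute_hash(block["previous_hash"], block["data"]) == block["hash"]


def faulty_node(block):
    # A faulty or offline node is modeled as always voting "no".
    return False


def accept_block(block, nodes):
    # The block is accepted only if a strict majority of nodes verify it.
    votes = [node(block) for node in nodes]
    return sum(votes) > len(nodes) / 2


good_block = {"previous_hash": "0" * 64, "data": "tx-1",
              "hash": compute_hash("0" * 64, "tx-1")}
bad_block = dict(good_block, hash="f" * 64)  # claimed hash does not match the contents
network = [honest_node, honest_node, honest_node, faulty_node, faulty_node]
```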
Once added to the chain, a block can be referenced by subsequent blocks, but it can't be changed. Any change to a block alters that block's hash, which no longer matches the hash recorded in the next block, so the link to every subsequent block is broken and the ledger's shared state is disrupted. Consensus among the nodes won't be possible if a block is modified, and no new blocks will be added until the issue is solved by discarding the problem block and redoing the consensus process.
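The sketch below shows why a modified block is detectable. It assumes SHA-256 and a simplified block with only `data`, `previous_hash` and `hash` fields; the function names are illustrative.

```python
import hashlib

GENESIS = "0" * 64  # placeholder predecessor hash for the first block


def link_hash(previous_hash, data):
    return hashlib.sha256(f"{previous_hash}|{data}".encode("utf-8")).hexdigest()


def build_chain(items):
    chain, previous_hash = [], GENESIS
    for data in items:
        current = link_hash(previous_hash, data)
        chain.append({"data": data, "previous_hash": previous_hash, "hash": current})
        previous_hash = current
    return chain


def first_invalid_block(chain):
    """Return the index of the first block that fails verification, or None."""
    expected_previous = GENESIS
    for index, block in enumerate(chain):
        links_back = block["previous_hash"] == expected_previous
        hash_matches = link_hash(block["previous_hash"], block["data"]) == block["hash"]
        if not (links_back and hash_matches):
            return index
        expected_previous = block["hash"]
    return None


chain = build_chain(["tx-1", "tx-2", "tx-3"])
chain[1]["data"] = "tx-2 (altered)"  # tamper with the middle block
```

After the tampering, `first_invalid_block(chain)` points at block 1. If an attacker also recomputes block 1's hash to hide the change, block 2 fails instead, because it still records the old hash.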
Data in a blockchain storage system is broken into redundant pieces that are stored across multiple nodes in the network. As a result, potential attackers must breach several machines, rather than just one, to access all of the data. Because the nodes are decentralized, blockchain technology can deliver more reliable, resilient and economical storage than a centralized cloud.
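One way to picture the redundancy is a placement map that puts each piece on more than one node. The round-robin scheme and the `copies` parameter below are assumptions for illustration, not the method of any particular system.

```python
def place_replicas(piece_ids, node_names, copies=2):
    """Assign each piece to `copies` distinct nodes, round-robin."""
    if copies > len(node_names):
        raise ValueError("cannot place more copies than there are nodes")
    placement = {}
    for i, piece in enumerate(piece_ids):
        placement[piece] = [
            node_names[(i + k) % len(node_names)] for k in range(copies)
        ]
    return placement


def surviving_pieces(placement, failed_nodes):
    # A piece survives if at least one node holding it is still up.
    return [
        piece for piece, nodes in placement.items()
        if any(node not in failed_nodes for node in nodes)
    ]


placement = place_replicas(["p0", "p1", "p2"], ["n0", "n1", "n2"], copies=2)
```

With two copies per piece, losing any single node leaves every piece recoverable, and no single node holds the complete data set.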
In blockchain terminology, a shard is one of the smaller segments into which a blockchain storage system breaks the data it stores. The goal of this sharding process is to create manageable chunks of data that can be distributed across multiple nodes. How the sharding is done depends on the data itself and the application running the process. Sharding a relational database isn't done the same way as sharding a NoSQL database or files in a file share, for example.
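For a plain file, the simplest form of sharding is splitting the bytes into fixed-size chunks. This is a minimal sketch under that assumption; as noted, databases and other structured data are sharded differently.

```python
def shard_bytes(data, shard_size):
    """Split `data` into consecutive chunks of at most `shard_size` bytes."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def reassemble(shards):
    # Shards must be joined in their original order to recover the data.
    return b"".join(shards)


shards = shard_bytes(b"abcdefghij", 4)
```

The last shard may be shorter than the others when the data length isn't a multiple of the shard size.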
Once it creates the shards, the blockchain storage system generates a cryptographic hash, an alphanumeric output string, for each shard. The output string has a fixed length and is based on the data or encryption keys connected to the shard. The system places the hash in the ledger and in the shard metadata so transactions can be linked to the shards. As with sharding, the exact way hashes are generated varies among systems.
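The fixed-length property is easy to see with a standard hash function. The sketch below assumes SHA-256 over the shard contents, and the metadata field names are hypothetical; an actual system may hash different inputs, such as encryption keys.

```python
import hashlib


def shard_hash(shard):
    # SHA-256 always yields 64 hexadecimal characters, whatever the input size.
    return hashlib.sha256(shard).hexdigest()


def index_shards(shards):
    """Build metadata records that link each shard to its hash."""
    return [
        {"shard_index": i, "size": len(shard), "hash": shard_hash(shard)}
        for i, shard in enumerate(shards)
    ]


metadata = index_shards([b"abcd", b"efgh", b"ij"])
```

Recomputing a shard's hash later and comparing it with the stored value is how a system can check that the shard has not been altered.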
Farmers, in blockchain terminology, are the organizations and individuals that own the storage nodes and rent out excess storage capacity. They can be DevOps professionals with extra capacity in their data centers or people with excess hard drive space on their computers. People or organizations in need of storage can get access to this excess capacity through the blockchain network in exchange for cryptocurrency payments.
The blockchain architecture ensures no single entity owns all the storage resources or has access to or controls the entire storage infrastructure. Content owners are the only ones who have full access to all their own data across the various nodes where the data resides.