Blockchain is not another database

Because, by architecture, it is not supposed to be.

What makes a blockchain different from a database? Is a blockchain just a shared database? These are perhaps the most common questions whenever the topic comes up. As if the confusion were not bad enough, many have recently started calling blockchain a “distributed ledger” to avoid any association with the cryptocurrency hype. This label has led to dangerous misconceptions about the technology and fails to explain its purpose.

This article explains why blockchain was not invented to be another storage technology, by pointing out where its architecture is a poor fit for that role.

Cost of replicating data on every device in the network

Storage inefficiency

As we know, blockchain technology is built on a peer-to-peer network in which data is replicated on every full node.

If a blockchain were to serve as a shared database, enormous volumes of data would have to be replicated and stored all over the world, on every machine participating in the network. Wouldn’t that be a huge waste of resources and effort? It looks even less reasonable given the immense amount of data we create and consume every day.

More than 2.5 quintillion (2.5 × 10^18) bytes of data are produced every day. If blockchain were a database, you would want to store as much data as possible on the chain. But even a small fraction of the data we produce would overwhelm any ordinary storage device. We have data centers because they are efficient, and not everyone is ready to carry petabytes of data around in their pocket.
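To make the scale concrete, here is a rough back-of-the-envelope sketch. Only the 2.5 × 10^18 bytes per day figure comes from the text above; the on-chain share (one millionth of daily data) and the network size (10,000 nodes) are hypothetical values chosen purely for illustration.

```python
# Back-of-the-envelope replication arithmetic.
# BYTES_PER_DAY is the figure quoted in the article; the on-chain share
# and the node count used below are hypothetical assumptions.
BYTES_PER_DAY = 25 * 10**17  # 2.5 x 10^18 bytes


def per_node_bytes_per_day(one_in: int) -> int:
    """Bytes each node must store per day if 1/one_in of daily data goes on chain."""
    return BYTES_PER_DAY // one_in


def network_bytes_per_day(one_in: int, node_count: int) -> int:
    """Total bytes written across the network per day when every node keeps a full copy."""
    return per_node_bytes_per_day(one_in) * node_count


if __name__ == "__main__":
    share = 1_000_000  # hypothetical: one millionth of daily data is put on chain
    nodes = 10_000     # hypothetical network size
    print(per_node_bytes_per_day(share) / 10**12, "TB per node per day")
    print(network_bytes_per_day(share, nodes) / 10**15, "PB across the network per day")
```

Under these assumptions every node would have to absorb 2.5 TB per day, and the network as a whole would write 25 PB per day to hold what is logically the same 2.5 TB.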

Processing inefficiency

Some would argue that storage is cheap thanks to advances in technology. That is true, but the processing power of blockchain execution engines such as the EVM (Ethereum Virtual Machine) is still far behind that of existing databases. Moreover, blockchain platforms themselves rely on existing database technologies to persist data under the hood. For example, Ethereum clients such as Geth have used LevelDB, and Hyperledger Fabric uses either LevelDB or CouchDB for its state database.

Blockchain platforms are simply not optimized for data storage. Even with future improvements and breakthroughs, it is unlikely that the developer community would want to make them compete with databases. Why reinvent the wheel when the two can be integrated in a modular architecture, where each technology complements the other? Hyperledger Fabric is an example of such a plug-and-play system: you can choose LevelDB or CouchDB to store the blockchain state.

Barrier to entry

For a device to take part in the chain, it first has to sync all the data. At the time of writing, it already takes about half a day on a decent internet connection to light sync Ethereum. If people started dumping anything and everything on the blockchain, as they would with a database, new participants would need days to download all the data before they could do anything useful with it. That is clearly a steep barrier to entry.

Unscalability

Blockchain is made for operating in a trustless environment. The Internet of Things (IoT) is an obvious area where blockchain could be of great use. There is no inherent trust between IoT devices, but their digital footprints can serve as proof of authenticity. With blockchain technology verifying the integrity and provenance of information, devices can exchange data securely. Moreover, devices can work together seamlessly because operations can be automated with smart contracts.

If blockchain is ever to scale to IoT, it must fit within the restricted resources of embedded systems. Clearly, blockchain-as-storage does not fit into that category.

What is blockchain good for

Ensuring data integrity

Blockchain is built on an architecture in which data is kept securely immutable. This has been one of its main concerns since its creation. The peer-to-peer network ensures data is replicated in many places, consensus algorithms guarantee a single version of the truth across the replicas, and the cryptographic, hash-linked data structure makes sure data cannot be altered without detection.
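The hash-linking idea can be illustrated with a toy sketch. This is not the block format of any real platform: the field names and helper functions are invented for illustration, and consensus, signatures and networking are left out entirely. It only shows why changing one record breaks every link after it.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, data: str, prev_hash: str) -> str:
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records: list) -> list:
    """Link records into a chain where each block commits to its predecessor."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, data in enumerate(records):
        current = block_hash(index, data, prev_hash)
        chain.append(
            {"index": index, "data": data, "prev_hash": prev_hash, "hash": current}
        )
        prev_hash = current
    return chain


def is_valid(chain: list) -> bool:
    """Recompute every hash and check every link; any edit makes this fail."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        if block_hash(block["index"], block["data"], block["prev_hash"]) != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


if __name__ == "__main__":
    chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dan 1"])
    print(is_valid(chain))   # the untouched chain verifies
    chain[1]["data"] = "bob pays carol 200"
    print(is_valid(chain))   # the tampered chain does not
```

Even if an attacker recomputes the hash of the block they changed, the next block still points at the old hash, so the whole tail of the chain would have to be rewritten, and the other replicas would still disagree.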

Clearly, safeguarding data is at the center of blockchain technology, so why is it unsuitable for mass data storage? The reason is that data integrity is only a means, not the purpose, of blockchain. It is a means to reliably execute transactions between parties and to carry out business or governance activities without the need for trusted intermediaries.

With data integrity, both the veracity and the authenticity of information are preserved.

Ensuring reliable automation

By using smart contracts on a blockchain, we can achieve true automation in which devices reliably work together.

Take governance, for example, where most paperwork is repetitive and carried out by hand. If administrative procedures and forms can be translated into smart contracts, execution can be automatic, fast and reliable. Smart contracts eliminate most, if not all, of the errors caused by obscure regulatory texts. Biometric smart cards can replace ID cards to prove citizenship and replace signatures on government forms. This is not science fiction but a practice already underway in the Estonian government.

Another example comes from the energy sector, where smart contracts can govern the flow of data and trigger operations in automated sensors so that energy redistribution can be planned more efficiently.

Conclusion

Data storage is undeniably necessary for a blockchain to function. There is a minimum amount of data you need to store on the chain for it to be useful. However, using it as a general storage technology is plainly inefficient. Blockchain is good at validating the integrity of data, but it is not built for storing or processing massive amounts of data.

Blockchain can be used in a modular system to take care of validating data entries and triggering subsequent executions.
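One common way to picture such a modular system is to keep the bulky data in an ordinary database and put only a fingerprint of it on the chain. The sketch below assumes that pattern; the two dictionaries are stand-ins for a real database and a real ledger, and the function names are hypothetical.

```python
import hashlib


def digest(payload: bytes) -> str:
    """SHA-256 fingerprint of a record; this is all that would go on chain."""
    return hashlib.sha256(payload).hexdigest()


def store_record(database: dict, ledger: dict, key: str, payload: bytes) -> None:
    """Keep the full payload off chain and anchor only its digest on the ledger."""
    database[key] = payload        # stand-in for a conventional database write
    ledger[key] = digest(payload)  # stand-in for an on-chain transaction


def verify_record(database: dict, ledger: dict, key: str) -> bool:
    """Check that the off-chain payload still matches the anchored digest."""
    if key not in database or key not in ledger:
        return False
    return digest(database[key]) == ledger[key]


if __name__ == "__main__":
    database, ledger = {}, {}
    store_record(database, ledger, "invoice-1", b"amount=100;payee=bob")
    print(verify_record(database, ledger, "invoice-1"))  # matches the anchor
    database["invoice-1"] = b"amount=999;payee=bob"      # off-chain tampering
    print(verify_record(database, ledger, "invoice-1"))  # no longer matches
```

However large the payload is, the ledger entry stays at 64 hexadecimal characters, which is what keeps the replicated part of the system small.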
