What is Blockchain and how does it work?
What is Blockchain?
Blockchain is a distributed and unalterable ledger (or database). It is secured cryptographically, and allows participants who may not know each other to trust a shared record of transactions or events.
The ledger is distributed, with all participants (or nodes) maintaining a copy of the ledger. A transaction or event can be checked against the ledger at any time.
The ledger is maintained without the need for a trusted party or central authority. As such, it also eliminates the risks associated with holding data in a single, centralised location.
The ledgers can be public or private. In the former, anyone can participate, while the latter is restricted to authorised participants only.
How Does Blockchain Work?
The intention here is to illustrate how the general process works, rather than specific instances. This simplified process is described using a cryptographic hash function.
A hash function takes an input of any size and produces an output of a fixed length, such as 256 bits. Every bit of the output block depends upon every bit of the input block, and the output block acts as a representative of the input block. If any bits of the input block are changed then this should cause the output block to change in an unpredictable way.
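These properties can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard hashlib module; the helper names sha256_hex and differing_bits and the sample messages are made up for illustration. It hashes a one-byte input and a million-byte input, then hashes two messages that differ in a single bit.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a_hex: str, b_hex: str) -> int:
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(a_hex, 16) ^ int(b_hex, 16)).count("1")

# Inputs of very different sizes give outputs of the same fixed length.
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)
print(len(short_digest) * 4, len(long_digest) * 4)  # output length in bits

# These two messages differ in exactly one bit ("0" is 0x30, "1" is 0x31).
h1 = sha256_hex(b"pay Alice 10")
h2 = sha256_hex(b"pay Alice 11")
print(h1)
print(h2)
print(differing_bits(h1, h2), "of 256 output bits differ")
```

For a well-behaved hash function, about half of the output bits would be expected to differ between the two messages, with no visible relationship to the size of the change in the input.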
Since there are more possible inputs than outputs, there will necessarily be different inputs that give the same hash output. However, the output block size is chosen so that the likelihood of this happening is negligible, and a property of cryptographic hash functions is that for an appropriate output block size it is computationally infeasible to find different inputs giving the same output value.
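The effect of the output size can be illustrated by deliberately making it too small. The sketch below is for illustration only: truncated_hash and find_collision are hypothetical helpers, and the 16-bit output size is chosen so that the search finishes almost instantly. It keeps only the first 16 bits of a SHA-256 output and tries inputs until two of them collide.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, bits: int) -> int:
    """Return the first `bits` bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - bits)

def find_collision(bits: int) -> tuple[bytes, bytes]:
    """Try the inputs "0", "1", "2", ... until two share a truncated hash."""
    seen: dict[int, bytes] = {}
    for i in count():
        data = str(i).encode()
        value = truncated_hash(data, bits)
        if value in seen:
            return seen[value], data
        seen[value] = data

a, b = find_collision(16)
print(a, b, truncated_hash(a, 16), truncated_hash(b, 16))
```

With a 16-bit output, a collision is expected after a few hundred attempts. The same kind of search against the full 256-bit output would be expected to need on the order of 2^128 attempts, which is why it is regarded as computationally infeasible.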
Hash functions are often used to verify the integrity (or correctness) of information. The information is first input to the hash function to obtain a hash value. Provided it is known and trusted, this original hash value can be used at any later point in time to verify the integrity of the information by re-computing the hash value and comparing the result with the known and trusted hash value. The correct input will always give the expected hash value, and so if the two hash values match, then with overwhelming probability the input is correct.
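The verification step described above can be sketched as follows. This assumes SHA-256 as the hash function; fingerprint, verify and the sample record are illustrative names and data, not part of any particular system.

```python
import hashlib
import hmac

def fingerprint(data: bytes) -> str:
    """Compute the hash value that will later serve as the trusted reference."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, trusted_hash: str) -> bool:
    """Re-compute the hash of data and compare it with the trusted value."""
    return hmac.compare_digest(fingerprint(data), trusted_hash)

record = b"2024-01-01: Alice pays Bob 10"
trusted = fingerprint(record)  # stored somewhere known and trusted

print(verify(record, trusted))                            # unchanged input
print(verify(b"2024-01-01: Alice pays Bob 99", trusted))  # altered input
```

The check is only as good as the trust placed in the stored hash value: if an attacker can replace both the information and the reference hash, the comparison will still succeed.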
The input to the hash function may come from a single source, or the input block may be constructed from the inputs of a number of participants, or nodes. In the latter case, verification of an input at a later time requires the original hash result and all of the inputs, although this can also be achieved using a hash structure that requires the original hash result plus a combination of actual inputs and values derived from the other inputs.
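A minimal sketch of the first case, where the input block is built from the inputs of several participants, is shown below. The function name combined_hash and the length-prefix encoding are assumptions made for this example; the prefix is there so that the boundaries between inputs are unambiguous.

```python
import hashlib

def combined_hash(inputs: list[bytes]) -> str:
    """Hash several participants' inputs into a single SHA-256 value.

    Each input is preceded by its length (8 bytes, big-endian) so that
    [b"ab", b"c"] and [b"a", b"bc"] do not produce the same input block.
    """
    h = hashlib.sha256()
    for item in inputs:
        h.update(len(item).to_bytes(8, "big"))
        h.update(item)
    return h.hexdigest()

inputs = [b"node A: reading 17", b"node B: reading 21", b"node C: reading 19"]
original = combined_hash(inputs)

# Verifying any one input later requires all of the inputs, in order.
print(combined_hash(inputs) == original)
print(combined_hash([inputs[0], b"node B: reading 99", inputs[2]]) == original)
```

The drawback is visible in the last two lines: a participant who wants to check only their own input still needs everyone else's inputs, which is what the alternative hash structure mentioned above avoids.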
A hash function input can include the output of the previous iteration of the hash process. This links the current set of inputs to the previous set, and in effect creates a chain of hash output blocks. This chain represents the integrity of every input to every iteration of the process right back to the first iteration, since the characteristics of the hash function ensure that if any bit of any one of the inputs were to change, then this would cause a change in the corresponding output block, and by the chaining effect this would alter every subsequent output block.
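The chaining effect can be sketched in a few lines. The example assumes SHA-256, a made-up all-zero starting value (GENESIS) for the first iteration, and a hypothetical helper chain_hashes; each iteration hashes the previous output followed by the new inputs.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed starting value for the first iteration

def chain_hashes(batches: list[bytes]) -> list[bytes]:
    """Return one hash output per batch, each depending on the one before."""
    outputs = []
    previous = GENESIS
    for batch in batches:
        # The previous output is always 32 bytes, so the split is unambiguous.
        previous = hashlib.sha256(previous + batch).digest()
        outputs.append(previous)
    return outputs

original = chain_hashes([b"batch 1", b"batch 2", b"batch 3", b"batch 4"])
tampered = chain_hashes([b"batch 1", b"batch X", b"batch 3", b"batch 4"])

for i, (a, b) in enumerate(zip(original, tampered), start=1):
    print(i, "unchanged" if a == b else "changed")
```

Only the second batch was altered, yet every output from the second iteration onwards changes, while the first output is unaffected.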
In an environment where new inputs are being steadily created, iterations of the hash function also act as a timestamp for the sequence of events - the existence of a hash demonstrates that a set of inputs must have existed at the point in time when the hash was created.
The process for creating a hash uses a hash function (such as SHA-256) and a hash process that defines how the hash function is to be used. An example of a hash process is a Merkle tree.
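A simplified Merkle tree can be sketched as follows. It also illustrates the earlier point that a single input can be verified from the final hash plus values derived from the other inputs, without needing the other inputs themselves. All function names here are hypothetical, and duplicating the last node on odd-sized levels is one common convention rather than the only option; production designs add further safeguards that are omitted here.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, from the leaf hashes up to the root."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pair the odd node out with itself
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def merkle_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling hash, sibling_is_on_right) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # this node was paired with itself
        proof.append((level[sibling], index % 2 == 0))
        index //= 2
    return proof

def verify_proof(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Rebuild the path from one leaf to the root using only sibling hashes."""
    node = h(leaf)
    for sibling, sibling_on_right in proof:
        node = h(node + sibling) if sibling_on_right else h(sibling + node)
    return node == root

transactions = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
levels = merkle_levels(transactions)
root = levels[-1][0]
proof = merkle_proof(levels, 2)
print(verify_proof(b"tx-c", proof, root), verify_proof(b"tx-x", proof, root))
```

For five inputs the proof contains three hash values, and in general its length grows with the logarithm of the number of inputs rather than with the number of inputs itself.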
The Blockchain Process
As there is no central authority to co-ordinate the inclusion of new events or transactions, all transactions have to be announced by being broadcast to all participants.
At a specified point in time a designated party (defined by a consensus mechanism) will generate a new block. In essence, this will include a current block hash value derived from all of the new events since the creation of the previous block, along with the hash value of the previous block (to link the blocks together).
A block will consist of both the current block hash value and the hash value of the previous block, together with all of the new events to be included in the block. Since the inputs to the current block hash value are present in the block data structure along with the current block hash value itself, the integrity of the block can be verified.
In practice, there are various ways in which this can be implemented. For example, as in the case of Bitcoin, one way is to first compute a hash of all of the data or transactions to be included in the new block. This hash value is then placed in a block header along with the hash value of the previous block (and, in Bitcoin, a few other fields such as a timestamp and a nonce), and the header is input to an additional hash process to create an overall block hash that can also serve as the block identifier. The block hash value is therefore derived from all of the events together with the previous block hash value, and so proves the integrity of both the block and the transactions it contains.
Distributing the blockchain amongst a large number of nodes ensures that it is not easily corrupted or falsified: even if a malicious entity is able to construct a chain of blocks that nodes would verify as legitimate, it would still have to substitute that chain for the authentic one at a majority of nodes.
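The header-based construction can be sketched as below. This is a simplified illustration rather than Bitcoin's actual block format: the Block class, the helper functions, the all-zero hash used for the first block and the use of JSON to encode the transaction list are all assumptions made for the example, and a real header carries more fields than the two shown.

```python
import hashlib
import json
from dataclasses import dataclass

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

@dataclass
class Block:
    previous_hash: str       # block hash of the preceding block
    transactions: list[str]  # the new events recorded in this block
    data_hash: str = ""      # hash of the transactions (header field)
    block_hash: str = ""     # hash of the header; also the block identifier

def hash_transactions(transactions: list[str]) -> str:
    # JSON is used only as an unambiguous encoding of the list.
    return sha256(json.dumps(transactions).encode())

def hash_header(previous_hash: str, data_hash: str) -> str:
    # Both fields are fixed-length hex strings, so concatenation is unambiguous.
    return sha256((previous_hash + data_hash).encode())

def make_block(previous_hash: str, transactions: list[str]) -> Block:
    data_hash = hash_transactions(transactions)
    return Block(previous_hash, list(transactions), data_hash,
                 hash_header(previous_hash, data_hash))

def is_valid_chain(chain: list[Block]) -> bool:
    """Re-compute every hash and check each link back to the first block."""
    expected_previous = "0" * 64  # assumed value for the first block
    for block in chain:
        if block.previous_hash != expected_previous:
            return False
        if block.data_hash != hash_transactions(block.transactions):
            return False
        if block.block_hash != hash_header(block.previous_hash, block.data_hash):
            return False
        expected_previous = block.block_hash
    return True

chain = [make_block("0" * 64, ["Alice pays Bob 10"])]
chain.append(make_block(chain[-1].block_hash, ["Bob pays Carol 4"]))
chain.append(make_block(chain[-1].block_hash, ["Carol pays Dave 1"]))
print(is_valid_chain(chain))
```

Changing a transaction in the middle block makes its stored data hash wrong, and re-computing that block's hashes to hide the change breaks the link from the following block, so the alteration is detected either way.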
Another line of attack would be to modify a block’s ledger entries such that they hash down to the legitimate block hash value. However, the properties of cryptographic hash functions make finding such a hash input computationally infeasible.
The process is summarised in Figure 1 above. In this image, the spheres represent the nodes, and the cubes represent the computation of a new Blockchain value. The yellow line represents the chain of blocks.
A characteristic of Blockchain is that there is no centralised management, and so there has to be a mechanism to fairly establish a consensus as to which party is the accepted creator of a block, or is tasked with creating a block.
One approach is to use a proof of work. This is based on a mathematical or computational problem of quantifiable difficulty, where the level of difficulty can be set accordingly, and where a solution can be easily verified. Finding a solution therefore requires effort (typically computational), and producing a solution demonstrates that the work has been done. Searching for a solution is known as mining, and since this requires effort there has to be an incentive to take part, such as earning a reward.
The block added to the chain is the one produced by the first party to demonstrate that the work has been done.
An example of a proof of work is a function with the property that for any given input the output is unpredictable (as is the case with a hash function), with the objective of finding an output with specific characteristics. This unpredictability of the function should ensure that it is not possible to go directly to the solution without searching, and so the best approach to finding a solution is to keep trying different inputs until one is found that gives the desired output. To allow this, some parts of the input must be variable so as to ensure the availability of many different inputs. This is often achieved in part by using a nonce, which is a value that can be varied freely (for example chosen at random or simply incremented) and is independent of the other input data.
The difficulty of such functions is generally based on the expected number of trials before a successful outcome. Since this is derived from the probability of finding particular outputs, it is always possible that a party will be lucky and find a valid output more quickly than expected, or unlucky and take longer, so the effort spent on any single search does not accurately reflect the expected level of work. However, given the probabilities involved, this is not usually too significant.
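A hash-based search of this kind can be sketched as follows. The example assumes that the "specific characteristic" is a SHA-256 output beginning with a given number of zero bits; meets_target, mine and the sample header are hypothetical, and the nonce is a simple counter so that the run is reproducible.

```python
import hashlib

def meets_target(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Check whether SHA-256(header + nonce) starts with difficulty_bits zero bits."""
    digest = hashlib.sha256(header + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))

def mine(header: bytes, difficulty_bits: int) -> int:
    """Try nonces 0, 1, 2, ... until one produces an acceptable output."""
    nonce = 0
    while not meets_target(header, nonce, difficulty_bits):
        nonce += 1
    return nonce

header = b"example block header"
nonce = mine(header, 16)  # tens of thousands of trials are expected
print("nonce found:", nonce)
print("verified with one hash:", meets_target(header, nonce, 16))
```

The asymmetry is the point of the design: finding the nonce takes many hash computations, while checking a claimed solution takes exactly one. Each additional zero bit required doubles the expected amount of searching.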
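The size of this luck factor can be estimated with a short calculation. The sketch assumes that each trial succeeds independently with probability 2^-d, where d is a hypothetical difficulty expressed as a number of required leading zero bits; the function names are illustrative.

```python
import math

def expected_trials(difficulty_bits: int) -> int:
    """Average number of trials when each succeeds with probability 2**-bits."""
    return 2 ** difficulty_bits

def probability_within(trials: int, difficulty_bits: int) -> float:
    """Probability of at least one success within the given number of trials.

    Computes 1 - (1 - p)**trials in a numerically stable way.
    Assumes difficulty_bits >= 1, so that p is strictly less than 1.
    """
    p = 2.0 ** -difficulty_bits
    return -math.expm1(trials * math.log1p(-p))

n = expected_trials(16)
for factor in (0.1, 1, 3):
    trials = int(n * factor)
    print(f"{factor} x expected effort: {probability_within(trials, 16):.3f}")
```

Under these assumptions a party has roughly a 10% chance of finishing with a tenth of the expected effort, roughly a 63% chance of finishing within the expected effort, and roughly a 5% chance of needing more than three times as much.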
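Another common approach is to use a proof of stake, in which the block creator is selected according to the stake (such as an amount of the system's currency) that it holds or commits, rather than by computational work.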
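In a proof of stake, the party that creates a block is chosen according to the stake it holds rather than the work it performs. The sketch below shows only the core idea of stake-weighted selection and is far simpler than any deployed protocol. The function select_validator, the participant names and the use of the previous block hash as a shared seed are all assumptions made for this example.

```python
import hashlib
import random

def select_validator(stakes: dict[str, int], seed: bytes) -> str:
    """Pick one participant with probability proportional to its stake.

    Every node that uses the same stakes and the same seed derives the
    same answer, so no central coordinator is needed to announce it.
    """
    rng = random.Random(int.from_bytes(hashlib.sha256(seed).digest(), "big"))
    names = sorted(stakes)  # fixed order so all nodes agree
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

stakes = {"alice": 50, "bob": 30, "carol": 20}
previous_block_hash = b"hypothetical previous block hash"
print(select_validator(stakes, previous_block_hash))
```

Over many blocks, a participant holding half of the total stake would be expected to be selected about half of the time, and a participant with no stake would never be selected.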
Blockchain operates without a need for a central authority. Updates to the ledger are agreed to and implemented using a consensus process, and anyone can check an event against the ledger.
For the ledger entries to have a value or usefulness it is necessary that they are trusted, authentic and secure.
Individual nodes are responsible for the security of the data they submit to the Blockchain, with security functions being applied at the nodes. These include:
- Ensuring accountability, ownership and control
- Allowing information to be secure and authenticated
- Authorising participation