In many blockchain application scenarios, files are stored in a decentralized storage system such as IPFS, while the blockchain serves as evidence of their existence. For our business requirements, our company, BSOS, needs to build a high-availability Quorum consortium blockchain and use IPFS for data storage. However, our own testing found that without node redundancy the system becomes unstable: whenever a Quorum node or IPFS node goes down, the system has downtime. To guarantee availability of at least 99.9% with redundant backups, we need to build high-availability Quorum nodes and an IPFS Cluster. Our products are designed cloud-natively and deployed on k8s, so naturally Quorum and the IPFS Cluster must be set up on k8s as well. The code introduced below will be published in the BSOS repo; if you have any questions, please open an issue to discuss them with us.
The following is divided into two parts: the Quorum deployment and the IPFS Cluster deployment.
Quorum HA Setup:
I started by following Quorum's official documentation for a high-availability deployment, but it doesn't provide concrete examples, so we had to get creative to fill the gaps in this architecture diagram.
When I first saw this picture, I believe most readers would have the same first idea I did: Q1–1 and Q1–2 in the figure should be the same node and use the same PV (Quorum needs a PV because it stores block data). Then use a Deployment to create two Quorum Pods bound to the same PV, so that the two Quorum nodes share the same coinbase and the same enode ID, achieving redundancy. But the problem surfaced quickly after implementation: under the hood, Quorum opens a LevelDB, and LevelDB does not support concurrent reads and writes from multiple processes, so multiple Pods cannot be bound to the same PV. We had to find another way.
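The first idea can be sketched as the following (broken) manifest. All names here (`quorum-node1`, the image tag, the PVC name) are made up for illustration. With `replicas: 2`, both Pods mount the same claim, and the second geth process fails to acquire LevelDB's file lock:

```yaml
# Anti-pattern: two replicas sharing one PVC.
# The second Pod crash-loops because LevelDB allows only one writer process.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: quorum-node1
spec:
  replicas: 2                      # both Pods share the volume below
  selector:
    matchLabels:
      app: quorum-node1
  template:
    metadata:
      labels:
        app: quorum-node1
    spec:
      containers:
        - name: quorum
          image: quorumengineering/quorum      # illustrative image name
          volumeMounts:
            - name: chaindata
              mountPath: /qdata
      volumes:
        - name: chaindata
          persistentVolumeClaim:
            claimName: quorum-node1-pv-claim   # same claim for both Pods
```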
The second idea improves on the first. I split the storage into two PVs and bound each Deployment to its own PV, while still sharing the same coinbase and enode ID, to get around LevelDB's multi-process restriction. After implementing it, another problem appeared: because the enode IDs are identical, the two Quorum nodes never synchronize with each other; in the p2p layer they regard each other as the same node and refuse to exchange data. So this approach did not work either.
The third idea is the one that finally worked in practice. Treat Q1–1 and Q1–2 as genuinely different nodes, each with its own coinbase, enode ID, and PV, and connect the two nodes over p2p. The constellation nodes behind these two Quorum nodes, however, must use the same public/private key pair, so that the nodes' private state stays in sync. For the details of the deployment process, please refer to our deployment files. (Since every infrastructure is different, the deployment files must be adjusted to your existing infra; one-click deployment is not guaranteed.)
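A minimal sketch of this layout, with placeholder enode public keys and key material (all names are illustrative, not taken from the BSOS repo): each node keeps its own enode ID and PV, the nodes find each other via `static-nodes.json`, and the redundant pair shares a single constellation key pair through one Secret:

```yaml
# Each Quorum node has a distinct enode ID; static-nodes.json wires them together.
apiVersion: v1
kind: ConfigMap
metadata:
  name: quorum-static-nodes
data:
  static-nodes.json: |
    [
      "enode://<node1-1-pubkey>@quorum-node1-1:30303",
      "enode://<node1-2-pubkey>@quorum-node1-2:30303"
    ]
---
# The redundant pair shares ONE constellation key pair,
# so both nodes can decrypt the same private state.
apiVersion: v1
kind: Secret
metadata:
  name: constellation-keys
type: Opaque
stringData:
  tm.pub: "<public-key>"
  tm.key: "<private-key>"
```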
Finally, once deployed, there are three StatefulSets of Quorum nodes, each paired with its corresponding constellation node. The constellation implementation we use here is crux. Originally, the crux project did not support a relational DB for storage; it also used LevelDB, so we added MySQL storage support to it. With that, the architecture in the official picture can be realized.
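The per-node shape is roughly one StatefulSet with `replicas: 1` and its own `volumeClaimTemplates`. The sketch below shows one possible arrangement (crux as a sidecar container) with illustrative names; the `bsos/crux-mysql` image and the `constellation-keys` Secret are hypothetical, and command-line arguments are omitted:

```yaml
# One StatefulSet per Quorum node: its own identity, its own volume.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: quorum-node1-1
spec:
  serviceName: quorum-node1-1
  replicas: 1
  selector:
    matchLabels:
      app: quorum-node1-1
  template:
    metadata:
      labels:
        app: quorum-node1-1
    spec:
      containers:
        - name: quorum
          image: quorumengineering/quorum   # illustrative
          volumeMounts:
            - name: chaindata
              mountPath: /qdata
        - name: crux                        # constellation sidecar
          image: bsos/crux-mysql            # hypothetical image with MySQL support
          volumeMounts:
            - name: tm-keys                 # shared key pair of the redundant pair
              mountPath: /keys
              readOnly: true
      volumes:
        - name: tm-keys
          secret:
            secretName: constellation-keys  # hypothetical Secret name
  volumeClaimTemplates:
    - metadata:
        name: chaindata
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```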
Next up is making IPFS highly available.
IPFS Cluster Setup:
IPFS Cluster's official documentation provides a k8s deployment tutorial, but I found that following it leads to problems. 🙃
Fortunately, the project also provides a docker-compose.yml, which gave us a good starting point for working out how to deploy IPFS Cluster on k8s. From this yaml file we can see that each IPFS node must be paired with an IPFS Cluster node. I put IPFS and IPFS Cluster into the same Pod (the sidecar pattern) and then translated this docker-compose architecture into k8s Helm files. However, there is still a pitfall: an IPFS Cluster node will not actively discover the other cluster nodes. To solve this, the peerstore file inside the IPFS Cluster container must be edited and the node restarted. Unfortunately, things were still not resolved so smoothly…
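The sidecar pairing can be sketched like this. The image names and the `CLUSTER_SECRET` handling follow the official docker-compose.yml; the Secret name and sizes are assumptions, not the exact BSOS Helm chart:

```yaml
# One IPFS daemon + one ipfs-cluster peer per Pod (sidecar pattern).
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ipfs-cluster
spec:
  serviceName: ipfs-cluster
  replicas: 3
  selector:
    matchLabels:
      app: ipfs-cluster
  template:
    metadata:
      labels:
        app: ipfs-cluster
    spec:
      containers:
        - name: ipfs
          image: ipfs/go-ipfs               # illustrative tag
          volumeMounts:
            - name: ipfs-data
              mountPath: /data/ipfs
        - name: cluster
          image: ipfs/ipfs-cluster          # illustrative tag
          env:
            - name: CLUSTER_SECRET          # must be identical on every peer
              valueFrom:
                secretKeyRef:
                  name: ipfs-cluster-secret # hypothetical Secret
                  key: cluster-secret
          volumeMounts:
            - name: cluster-data
              mountPath: /data/ipfs-cluster
  volumeClaimTemplates:
    - metadata:
        name: ipfs-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
    - metadata:
        name: cluster-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
```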
ipfs-cluster re-applies some initial settings during startup, so after manually adding the other cluster peers to the peerstore, a Pod restart will clear the peerstore again. The final solution is to mount the peerstore from a ConfigMap, taking advantage of the fact that ConfigMap mounts are read-only: the file can no longer be cleared on restart.
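A sketch of the ConfigMap trick, with placeholder peer IDs (the DNS names assume a headless Service named `ipfs-cluster` and the cluster's default swarm port 9096). Mounting via `subPath` pins only the peerstore file, leaving the rest of `/data/ipfs-cluster` writable:

```yaml
# Pin the cluster peer addresses in a ConfigMap so restarts cannot wipe them.
apiVersion: v1
kind: ConfigMap
metadata:
  name: ipfs-cluster-peerstore
data:
  peerstore: |
    /dns4/ipfs-cluster-0.ipfs-cluster/tcp/9096/p2p/<peer-id-0>
    /dns4/ipfs-cluster-1.ipfs-cluster/tcp/9096/p2p/<peer-id-1>
    /dns4/ipfs-cluster-2.ipfs-cluster/tcp/9096/p2p/<peer-id-2>
# Then, in the cluster container spec, mount only this one file read-only:
#   volumeMounts:
#     - name: peerstore
#       mountPath: /data/ipfs-cluster/peerstore
#       subPath: peerstore
#       readOnly: true
#   volumes:
#     - name: peerstore
#       configMap:
#         name: ipfs-cluster-peerstore
```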
When designing and deploying the HA architecture of a distributed system, you will always find plenty of pitfalls to step into. Resources on the Internet are relatively scarce, and in some cases even the open-source community cannot provide support. So we at BSOS are here to contribute this back to everyone; I hope it inspires you.
Special thanks to Denny & Andrew Yen