Proposal status: Implemented
Summary
Replace the heart of Armada, the github.com/lni/dragonboat library, with the github.com/hashicorp/raft library.
Motivation
As Armada evolved, we ran into problems with the dragonboat library:
- The library is maintained by just a single person.
- It is a closed design library (most of it is implemented in internal packages).
- It has an inadequate API that hides many details and takes control away from the programmer:
  - Snapshots
  - Logging
  - Configuration
  - Transport
  - Node discovery
  - ID assignment
- Even though dragonboat has a lot of stars, it does not seem to be used much (just tens of imports); in contrast, hashicorp/raft has over 1k imports.
- The library is not really modular: replacing Transport is almost impossible, and the same goes for LogDB and the Snapshot store.
- hashicorp/raft is much simpler in both design and implementation (dragonboat vs raft sloc).
Design
Challenges
- hashicorp/raft is not a multi-group Raft implementation
  - that could be mitigated by multiplexing over a custom transport (see raft-grpc-transport-mux)
  - the advantage is that we could pick a group label of our liking (like the table name)
- hashicorp/raft does not support an on-disk state machine implementation out of the box
  - every start of the FSM is accompanied by applying the most recent snapshot
  - in the case of our table FSM that would lead to large compute and space overhead
  - mitigation lies in implementing our own SnapshotManager that would serve a lightweight snapshot out of the persisted data
  - inspiration could be found in Vault
- hashicorp/raft does not automatically forward proposals to the leader node
  - that could be implemented using the same layer as the Raft multiplexing
  - inspiration could be drawn from Consul
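The first and third challenges share one layer: route a proposal by group label, apply it locally when this node leads the group, and forward it to the leader otherwise. A minimal sketch of that idea follows. It uses only the standard library, and `Group`, `ForwardFunc`, `Mux`, and `memGroup` are hypothetical names invented for illustration; none of them are part of hashicorp/raft or raft-grpc-transport-mux, and the real forwarding would go over the internal GRPC server.

```go
package main

import (
	"errors"
	"sync"
)

// Group is a hypothetical stand-in for one Raft group, for example one table.
// It is not part of hashicorp/raft.
type Group interface {
	IsLeader() bool
	LeaderAddr() string // empty when no leader is known
	Apply(cmd []byte) error
}

// ForwardFunc is a hypothetical hook that ships a proposal to the leader,
// for example over the internal GRPC server.
type ForwardFunc func(leaderAddr, label string, cmd []byte) error

var (
	ErrUnknownGroup = errors.New("unknown raft group")
	ErrNoLeader     = errors.New("no known leader")
)

// Mux routes proposals to Raft groups by label (such as a table name).
type Mux struct {
	mu      sync.RWMutex
	groups  map[string]Group
	forward ForwardFunc
}

func NewMux(forward ForwardFunc) *Mux {
	return &Mux{groups: make(map[string]Group), forward: forward}
}

// Register makes a group reachable under the given label.
func (m *Mux) Register(label string, g Group) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.groups[label] = g
}

// Propose applies cmd locally when this node leads the group
// and forwards it to the known leader otherwise.
func (m *Mux) Propose(label string, cmd []byte) error {
	m.mu.RLock()
	g, ok := m.groups[label]
	m.mu.RUnlock()
	if !ok {
		return ErrUnknownGroup
	}
	if g.IsLeader() {
		return g.Apply(cmd)
	}
	addr := g.LeaderAddr()
	if addr == "" {
		return ErrNoLeader
	}
	return m.forward(addr, label, cmd)
}

// memGroup is an in-memory Group used only to exercise the sketch.
type memGroup struct {
	leader     bool
	leaderAddr string
	applied    [][]byte
}

func (g *memGroup) IsLeader() bool     { return g.leader }
func (g *memGroup) LeaderAddr() string { return g.leaderAddr }
func (g *memGroup) Apply(cmd []byte) error {
	g.applied = append(g.applied, cmd)
	return nil
}
```

Keying the map by a free-form label is what allows the table name to double as the group identifier; callers only ever see `Propose(label, cmd)` and do not need to know which node currently leads the group.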
In a nutshell
- Basics + Meta
  - Adapt Metadata FSM to satisfy raft.FSM
  - Add and expose raft-grpc-transport-mux on internal GRPC server
  - Back the raft.Raft instance by tidwall/raft-wal or other impl
  - Use default File based snapshot storage
- Table FSM
  - Adapt Table FSM to satisfy raft.FSM and (optionally) raft.BatchingFSM
  - Implement Raft leader forward over internal GRPC server
  - Implement readIndex (serializable) read
  - Use default File based snapshot storage
- Optimization
  - Implement on-disk snapshotting optimization
  - Implement readIndex and forwarding pipelining
  - Use the aforementioned snapshot store to provide user requested snapshots
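The on-disk optimization and the readIndex read both hinge on the state machine tracking the index of the last entry it has applied. The sketch below illustrates that idea only, under several assumptions: `Entry`, `Marker`, and `TableFSM` are hypothetical types rather than the hashicorp/raft interfaces, a map stands in for the persisted table data, and the read index is passed in by the caller instead of being obtained from the leader.

```go
package main

import (
	"errors"
	"strings"
	"sync"
)

// Entry is a hypothetical stand-in for a committed Raft log entry.
type Entry struct {
	Index uint64
	Data  string // "key=value"
}

// Marker is a lightweight snapshot: it records only how far the
// persisted state has advanced instead of copying the data itself.
type Marker struct {
	AppliedIndex uint64
}

var (
	ErrNotCaughtUp = errors.New("state machine is behind the read index")
	ErrBadEntry    = errors.New("malformed entry")
)

// TableFSM keeps its data in a store that is assumed to be durable.
// A map stands in for the on-disk storage in this sketch.
type TableFSM struct {
	mu      sync.RWMutex
	data    map[string]string
	applied uint64
}

func NewTableFSM() *TableFSM {
	return &TableFSM{data: make(map[string]string)}
}

// Apply stores the entry together with its index. Entries at or below
// the applied index are skipped, so replaying the log after a restart
// does not change the already persisted state.
func (f *TableFSM) Apply(e Entry) (bool, error) {
	f.mu.Lock()
	defer f.mu.Unlock()
	if e.Index <= f.applied {
		return false, nil
	}
	kv := strings.SplitN(e.Data, "=", 2)
	if len(kv) != 2 {
		return false, ErrBadEntry
	}
	f.data[kv[0]] = kv[1]
	f.applied = e.Index
	return true, nil
}

// Snapshot returns the marker only; the data stays where it is.
func (f *TableFSM) Snapshot() Marker {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return Marker{AppliedIndex: f.applied}
}

// Read serves key only once everything up to readIndex has been applied.
func (f *TableFSM) Read(readIndex uint64, key string) (string, error) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	if f.applied < readIndex {
		return "", ErrNotCaughtUp
	}
	return f.data[key], nil
}
```

A production version would presumably block or retry in `Read` until the state machine catches up rather than return an error, and would write the data and the applied index atomically to the underlying storage.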
Alternatives
- Replace dragonboat with go.etcd.io/etcd/raft. etcd raft is the base of the Dragonboat library (dragonboat is to some degree a wrapper/fork of it) and as such does not make matters simpler; on the flip side, it is more powerful than hashicorp/raft.
- Fork dragonboat. An attempt was made in github.com/coufalja/tugboat, but it soon became clear that a major overhaul would be needed nevertheless.