Architecture
Armada's design is deliberately constrained. One leader cluster owns writes. Many follower clusters serve reads. Raft makes each cluster fault-tolerant. The result is a system that is easy to reason about and operate.
There is always exactly one leader cluster. Any number of follower clusters attach to it and replicate its data asynchronously. New followers can be added at any time without touching the leader.
Dashed lines = asynchronous pull replication
Accepts all writes. Uses Raft internally to replicate each write across its nodes before acknowledging the client. Acts as the single source of truth for the entire topology.
Pull changes from the leader asynchronously. Serve all reads locally with no cross-cluster round trip. Writes received by a follower are transparently forwarded to the leader.
Every cluster — leader or follower — runs multi-group Raft internally. A write is only confirmed once a majority of nodes have persisted it. This makes each cluster independently fault-tolerant.
A 3-node cluster survives one node failure. A 5-node cluster survives two. Raft guarantees no data loss as long as a majority of nodes are reachable.
Each table is its own Raft group. This allows write throughput to scale with the number of tables rather than being bottlenecked by a single log.
Inter-node replication uses a QUIC-based transport for low-latency, multiplexed streams between cluster members — no head-of-line blocking.
Data in Armada is organized into tables. Each table is an independent namespace with its own Raft group and its own replication stream. Tables can be created and deleted dynamically without affecting other tables.
| Property | Detail |
|---|---|
| Isolation | Each table has its own Raft group — a failure in one table does not affect others. |
| Consistency scope | All API guarantees (linearizable reads, transactions) are scoped to a single table. |
| Replication unit | The cross-cluster replication stream is per-table, enabling fine-grained follower lag monitoring. |
| Storage engine | Table data is persisted in Pebble (an LSM-tree storage engine), providing efficient range scans. |
| MVCC | Writes are stamped with a monotonically increasing version derived from the Raft log index, enabling snapshot-isolated reads. |
From a client write to a follower read — here's what happens at each step.
The write arrives at any leader node via gRPC. If the receiving node is not the Raft leader for that table, it forwards internally.
The entry is appended to the Raft log and replicated to a majority of leader nodes. Once committed, the FSM applies it to Pebble.
The gRPC response is sent only after the write is durably committed by a quorum. No data loss on leader failure.
Each follower maintains a replication stream per table, pulling committed entries from the leader's LogServer over gRPC.
The follower re-proposes each entry into its own per-table Raft group. Raft replicates within the follower cluster too.
Reads at the follower are served from its local Pebble store — zero cross-cluster latency.
The Architecture doc covers Raft internals, the QUIC transport fork, MVCC versioning, and more.
Deep-dive Architecture Docs →