Armada KV

Architecture

Simple model, strong guarantees

Armada's design is deliberately constrained. One leader cluster owns writes. Many follower clusters serve reads. Raft makes each cluster fault-tolerant. The result is a system that is easy to reason about and operate.

Hub-and-Spoke Topology

There is always exactly one leader cluster. Any number of follower clusters attach to it and replicate its data asynchronously. New followers can be added at any time without touching the leader.

[Diagram: one leader cluster with three follower clusters attached. Dashed lines = asynchronous pull replication.]

Leader cluster

Accepts all writes. Uses Raft internally to replicate each write across its nodes before acknowledging the client. Acts as the single source of truth for the entire topology.

Follower clusters

Pull changes from the leader asynchronously. Serve all reads locally with no cross-cluster round trip. Writes received by a follower are transparently forwarded to the leader.
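
The split between forwarded writes and local reads can be sketched in a few lines of Go. This is an illustration only, not Armada's code: the `writer` interface, the `leader` and `follower` types, and the in-memory maps are all hypothetical stand-ins for the real clusters.

```go
package main

import "fmt"

// writer is a hypothetical interface for anything that accepts writes.
type writer interface {
	Put(key, value string) error
}

// leader stands in for the leader cluster: the only place writes land.
type leader struct {
	data map[string]string
}

func (l *leader) Put(key, value string) error {
	l.data[key] = value
	return nil
}

// follower stands in for a follower cluster.
type follower struct {
	local    map[string]string // filled by replication, never by Put
	upstream writer            // the leader that writes are forwarded to
}

// Put does not touch local state; it forwards the write to the leader.
func (f *follower) Put(key, value string) error {
	return f.upstream.Put(key, value)
}

// Get reads only from local state, so it needs no cross-cluster round trip.
func (f *follower) Get(key string) (string, bool) {
	v, ok := f.local[key]
	return v, ok
}

func main() {
	l := &leader{data: map[string]string{}}
	f := &follower{local: map[string]string{}, upstream: l}

	if err := f.Put("color", "blue"); err != nil {
		panic(err)
	}
	_, onFollower := f.Get("color")
	fmt.Println("on leader:", l.data["color"])
	fmt.Println("visible on follower before replication:", onFollower)
}
```

In this sketch a write forwarded through a follower is not visible to that follower's reads until replication delivers it, which is the practical consequence of replicating asynchronously.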

Raft Within Each Cluster

Every cluster — leader or follower — runs multi-group Raft internally. A write is only confirmed once a majority of nodes have persisted it. This makes each cluster independently fault-tolerant.

Majority quorum

A 3-node cluster survives one node failure. A 5-node cluster survives two. Raft guarantees that no committed write is lost as long as a majority of nodes remain available.
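
The numbers above follow from simple majority arithmetic. The helper names `quorum` and `tolerated` below are made up for this sketch.

```go
package main

import "fmt"

// quorum returns the smallest number of nodes that forms a majority of n.
func quorum(n int) int {
	return n/2 + 1
}

// tolerated returns how many nodes can fail while a majority is still up.
func tolerated(n int) int {
	return n - quorum(n)
}

func main() {
	for _, n := range []int{3, 4, 5} {
		fmt.Printf("nodes=%d quorum=%d tolerated failures=%d\n", n, quorum(n), tolerated(n))
	}
}
```

A 4-node cluster tolerates only one failure, the same as a 3-node cluster, which is why odd cluster sizes are the usual choice.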

Multi-group

Each table is its own Raft group. This allows write throughput to scale with the number of tables rather than being bottlenecked by a single log.
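
One way to picture per-table groups is a registry that routes each table name to its own independent log. The sketch below is hypothetical: `group` and `registry` are invented names, and a plain slice stands in for a replicated Raft log.

```go
package main

import (
	"fmt"
	"sync"
)

// group stands in for one Raft group: it owns a single ordered log.
type group struct {
	mu  sync.Mutex
	log []string
}

// propose appends an entry and returns its 1-based log index.
func (g *group) propose(entry string) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.log = append(g.log, entry)
	return len(g.log)
}

// registry routes each table name to its own group.
type registry struct {
	mu     sync.Mutex
	groups map[string]*group
}

func newRegistry() *registry {
	return &registry{groups: map[string]*group{}}
}

// groupFor returns the group for a table, creating it on first use.
func (r *registry) groupFor(table string) *group {
	r.mu.Lock()
	defer r.mu.Unlock()
	g, ok := r.groups[table]
	if !ok {
		g = &group{}
		r.groups[table] = g
	}
	return g
}

func main() {
	r := newRegistry()
	fmt.Println("users index:", r.groupFor("users").propose("put a=1"))
	fmt.Println("users index:", r.groupFor("users").propose("put b=2"))
	// orders has its own log, so its index space starts from 1 again.
	fmt.Println("orders index:", r.groupFor("orders").propose("put x=9"))
}
```

Because each group has its own lock and its own log in this sketch, writes to different tables never wait on each other, which is the property the multi-group design relies on.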

QUIC transport

Inter-node replication uses a QUIC-based transport for low-latency, multiplexed streams between cluster members — no head-of-line blocking.

Tables — Isolated Keyspaces

Data in Armada is organized into tables. Each table is an independent namespace with its own Raft group and its own replication stream. Tables can be created and deleted dynamically without affecting other tables.

| Property | Detail |
| --- | --- |
| Isolation | Each table has its own Raft group, so a failure in one table does not affect others. |
| Consistency scope | All API guarantees (linearizable reads, transactions) are scoped to a single table. |
| Replication unit | The cross-cluster replication stream is per-table, enabling fine-grained follower lag monitoring. |
| Storage engine | Table data is persisted in Pebble (an LSM-tree storage engine), providing efficient range scans. |
| MVCC | Writes are stamped with a monotonically increasing version derived from the Raft log index, enabling snapshot-isolated reads. |
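
To illustrate how version stamps enable snapshot reads, here is a minimal in-memory sketch. It is not Armada's storage layout: `mvccStore`, `apply`, and `getAt` are hypothetical names, and a map of slices stands in for Pebble.

```go
package main

import (
	"fmt"
	"sort"
)

// versioned is one value of a key together with the version that wrote it.
type versioned struct {
	version uint64
	value   string
}

// mvccStore keeps every version of every key, in ascending version order.
type mvccStore struct {
	data map[string][]versioned
}

func newMVCCStore() *mvccStore {
	return &mvccStore{data: map[string][]versioned{}}
}

// apply records a write stamped with the log index that committed it.
// Indexes are assumed to arrive in increasing order, as a log delivers them.
func (s *mvccStore) apply(index uint64, key, value string) {
	s.data[key] = append(s.data[key], versioned{version: index, value: value})
}

// getAt returns the newest value of key whose version is <= snapshot.
func (s *mvccStore) getAt(key string, snapshot uint64) (string, bool) {
	vs := s.data[key]
	// Find the first version that is newer than the snapshot.
	i := sort.Search(len(vs), func(i int) bool { return vs[i].version > snapshot })
	if i == 0 {
		return "", false // the key did not exist at this snapshot
	}
	return vs[i-1].value, true
}

func main() {
	s := newMVCCStore()
	s.apply(1, "k", "a")
	s.apply(3, "k", "b")

	v, _ := s.getAt("k", 2)
	fmt.Println("snapshot 2 sees:", v)
	v, _ = s.getAt("k", 3)
	fmt.Println("snapshot 3 sees:", v)
}
```

A reader pinned to snapshot 2 keeps seeing the older value even after the write at version 3 lands, which is what makes the read snapshot-isolated.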

Write Path, End to End

From a client write to a follower read — here's what happens at each step.

  1. Client sends a Put to the leader

    The write arrives over gRPC at any node in the leader cluster. If the receiving node is not the Raft leader for that table, it forwards the write internally to the node that is.

  2. Raft replicates within the leader cluster

    The entry is appended to the Raft log and replicated to a majority of leader nodes. Once committed, the FSM applies it to Pebble.

  3. Leader acknowledges the client

    The gRPC response is sent only after the write is durably committed by a quorum, so an acknowledged write survives the loss of any minority of leader-cluster nodes.

  4. Followers poll for new log entries

    Each follower maintains a replication stream per table, pulling committed entries from the leader's LogServer over gRPC.

  5. Follower proposes entries into its own Raft

    The follower re-proposes each entry into its own per-table Raft group. Raft replicates within the follower cluster too.

  6. Local client reads from the follower

    Reads at the follower are served from its local Pebble store, with no cross-cluster round trip.
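
Steps 4 to 6 can be sketched as a cursor-based pull loop. Everything here is an assumption made for illustration: `leaderLog`, `followerTable`, `since`, and `pullOnce` are invented names, a slice stands in for the leader's committed log, and a direct map write stands in for re-proposing through the follower's own Raft group.

```go
package main

import "fmt"

// entry is one committed write in a table's log.
type entry struct {
	index uint64
	key   string
	value string
}

// leaderLog stands in for the leader's committed log for one table.
type leaderLog struct {
	entries []entry
}

// put appends a committed entry and returns its 1-based index.
func (l *leaderLog) put(key, value string) uint64 {
	idx := uint64(len(l.entries) + 1)
	l.entries = append(l.entries, entry{index: idx, key: key, value: value})
	return idx
}

// since returns up to max committed entries with index greater than after.
func (l *leaderLog) since(after uint64, max int) []entry {
	if after >= uint64(len(l.entries)) {
		return nil
	}
	out := l.entries[after:] // indexes are contiguous and 1-based
	if len(out) > max {
		out = out[:max]
	}
	return out
}

// followerTable is a follower's replication state for one table.
type followerTable struct {
	cursor uint64            // index of the last entry applied locally
	store  map[string]string // local data that reads are served from
}

// pullOnce fetches one batch after the cursor and applies it in order.
// It returns how many entries were applied.
func (f *followerTable) pullOnce(l *leaderLog, batch int) int {
	es := l.since(f.cursor, batch)
	for _, e := range es {
		f.store[e.key] = e.value
		f.cursor = e.index
	}
	return len(es)
}

// lag reports how many committed entries the follower has not applied yet.
func (f *followerTable) lag(l *leaderLog) uint64 {
	return uint64(len(l.entries)) - f.cursor
}

func main() {
	l := &leaderLog{}
	l.put("a", "1")
	l.put("b", "2")
	l.put("a", "3")

	f := &followerTable{store: map[string]string{}}
	for f.pullOnce(l, 2) > 0 {
		fmt.Println("cursor:", f.cursor, "lag:", f.lag(l))
	}
	fmt.Println("local read a =", f.store["a"])
}
```

Tracking one cursor per table is what makes per-table lag cheap to measure in this sketch: the lag is just the gap between the leader's last committed index and the follower's cursor.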

Want the full technical details?

The Architecture doc covers Raft internals, the QUIC transport fork, MVCC versioning, and more.
