babel-plumtreekv-demo

PlumtreeKV — a replicated key-value store built on Babel

A software-only demo of the Babel distributed-protocols framework: a peer-to-peer, multi-writer replicated key-value store where every participant is a peer. Set a key in your browser and watch it spread to every node by gossip; because each key is last-writer-wins, all nodes converge to the identical registry no matter what order the updates arrive in — and each node reads its copy locally, with no coordinator. This is the canonical use of Plumtree (cf. Riak’s cluster-metadata store, Leapsight’s plum_db, the c12s edge platform).

Unlike the chat demo (babel-demo) which runs on simplified teaching protocols, PlumtreeKV composes the real ParadigmShift overlay stack — so it doubles as a live showcase of those protocols working together:

  1. HyParView — partial-view membership (the gossip overlay) with LAN auto-discovery;
  2. MultiPlumtree / Plumtree (push-lazy-push multicast tree dissemination): each update is eager-pushed along an embedded spanning tree and lazy-pushed (IHAVE) along the rest of the overlay, which both recovers losses and heals the tree;
  3. PlumtreeKVApp — the replicated registry + web UI on top.

A ParadigmShift artefact. PlumtreeKV is an internal ParadigmShift tech demo, free for non-commercial use — see License.


Quickstart

Requires Java 17+. Build it (mvn package) — the jar lands at target/babel-plumtree-demo.jar — or download it from the latest release (then drop the target/ from the commands below). Run one node — it auto-opens its UI:

java -jar target/babel-plumtree-demo.jar          # one node; auto-opens its UI at http://localhost:8000/

That gives you a registry you can write to locally. To connect more nodes, bootstrap the overlay one of two ways (a joining node only needs to reach one node that’s already in it):

Explicit contact — works on one machine, across networks, anywhere TCP connects; no discovery to configure:

# first node — the one others contact
java -jar target/babel-plumtree-demo.jar babel.address=127.0.0.1 babel.port=6000 HyParView.contact=none
# second node — dials the first to join
java -jar target/babel-plumtree-demo.jar babel.address=127.0.0.1 babel.port=6010 HyParView.contact=127.0.0.1:6000

Multicast auto-discovery — for peers on the same LAN (typically different machines); no addresses to type, but best-effort and opt-in (name the discovery protocol on the command line):

DISC=babel.discovery=pt.unl.fct.di.novasys.babel.core.protocols.discovery.MulticastDiscoveryProtocol
java -jar target/babel-plumtree-demo.jar $DISC                                                  # machine A
java -jar target/babel-plumtree-demo.jar $DISC babel.port=6010 babel.discovery.unicast.port=1027   # 2nd node, same host

Each node serves a web UI and opens your browser at it on startup (plumtreekv.ui.open=false suppresses that). To run several nodes on one machine, give each a distinct babel.port, spaced at least 10 apart; the dissemination protocol shares HyParView’s channel (no second port), and the web-UI port follows automatically (babel.port + 2000).

Discovery tip. For several nodes on one machine, use explicit contact (above) — multicast usually does not loop back between local processes, and is often blocked by VPNs, firewalls, multiple NICs, or macOS’ Local Network permission. Reach for multicast only across real LAN hosts; if peers still don’t find each other, fall back to explicit contact.

Open two nodes’ UIs side by side, write keys on each, and watch the registry converge — each row is colour-coded by the writer that owns it, so the per-writer trees are visible at a glance. Click any row to load its key and value back into the editor, for a quick edit or overwrite.


How it works

A write is one ConfigOp (key k becomes value v, or a delete) stamped with the writer’s wall-clock time, its origin id, and a unique op id. The flow:

  1. You set/delete a key in the UI (or the headless workload picks one). PlumtreeKVApp issues a BroadcastRequest to the dissemination protocol.
  2. The update is eager-pushed down the spanning tree to neighbours and lazy-announced (IHAVE) along the rest of the overlay; a node missing an announced update grafts the announcer to fetch it, repairing the tree.
  3. On every node (including the origin) the op arrives as a BroadcastDelivery and is applied to the Registry under last-writer-wins: a key keeps whichever op has the highest (timestamp, originId, opId); a delete is a tombstone that an older set cannot resurrect.
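The last-writer-wins rule in step 3 can be sketched as follows. This is a minimal illustration, not the demo's actual ConfigOp or Registry code: the nested Op and Registry types, the field types (a String origin id, a long op id), and the use of a null value to mark a delete are all assumptions made for the example.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class LwwSketch {
    /** Hypothetical stand-in for ConfigOp; value == null marks a delete (tombstone). */
    record Op(String key, String value, long timestamp, String originId, long opId) {
        /** Total order for last-writer-wins: timestamp, then originId, then opId. */
        static final Comparator<Op> ORDER = Comparator
                .comparingLong(Op::timestamp)
                .thenComparing(Op::originId)
                .thenComparingLong(Op::opId);

        boolean isDelete() { return value == null; }
    }

    /** Hypothetical stand-in for Registry: keeps the winning op per key, tombstones included. */
    static final class Registry {
        private final Map<String, Op> winners = new HashMap<>();

        /** Applies an op; returns true if it replaced the current winner for its key. */
        boolean apply(Op op) {
            Op current = winners.get(op.key());
            if (current != null && Op.ORDER.compare(op, current) <= 0) {
                return false; // older or duplicate op: the current winner stays
            }
            winners.put(op.key(), op);
            return true;
        }

        /** Live value for a key; empty if never set or if the winner is a tombstone. */
        Optional<String> get(String key) {
            Op op = winners.get(key);
            return (op == null || op.isDelete()) ? Optional.empty() : Optional.of(op.value());
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.apply(new Op("colour", null, 200, "nodeB", 1));   // delete arrives first
        registry.apply(new Op("colour", "red", 100, "nodeA", 1));  // older set, ignored
        registry.apply(new Op("colour", "blue", 150, "nodeA", 2)); // still older than the tombstone
        System.out.println(registry.get("colour").isPresent());    // prints "false"
    }
}
```

Keeping the tombstone in the map, rather than removing the key, is what stops a late-arriving older set from resurrecting a deleted key.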

Because that order is total and deterministic, any two nodes that have applied the same set of ops hold the identical value for every key. Convergence is therefore purely a function of dissemination completeness, which makes the registry a clean correctness oracle: each node summarises its state as a 64-bit digest over the live entries, and nodes that have converged report the same digest.
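One way to build a digest that depends only on the set of live entries, and not on the order in which they were written, is to XOR a 64-bit hash of each entry. The sketch below assumes that construction and uses FNV-1a as the per-entry hash; the demo's Registry may compute its digest differently, and the class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class DigestSketch {
    private static final long FNV_OFFSET = 0xcbf29ce484222325L;
    private static final long FNV_PRIME = 0x100000001b3L;

    /** Folds bytes into a running 64-bit FNV-1a hash. */
    static long fnv1a64(long hash, byte[] bytes) {
        for (byte b : bytes) {
            hash ^= (b & 0xff);
            hash *= FNV_PRIME;
        }
        return hash;
    }

    /** Hash of one live entry; the zero byte separates key from value. */
    static long entryHash(String key, String value) {
        long h = fnv1a64(FNV_OFFSET, key.getBytes(StandardCharsets.UTF_8));
        h = fnv1a64(h, new byte[] {0});
        return fnv1a64(h, value.getBytes(StandardCharsets.UTF_8));
    }

    /** XOR of all entry hashes: the result does not depend on iteration order. */
    static long digest(Map<String, String> liveEntries) {
        long d = 0;
        for (Map.Entry<String, String> e : liveEntries.entrySet()) {
            d ^= entryHash(e.getKey(), e.getValue());
        }
        return d;
    }

    public static void main(String[] args) {
        Map<String, String> nodeA = new LinkedHashMap<>();
        nodeA.put("x", "1");
        nodeA.put("y", "2");
        Map<String, String> nodeB = new LinkedHashMap<>();
        nodeB.put("y", "2"); // same entries, different arrival order
        nodeB.put("x", "1");
        System.out.println(digest(nodeA) == digest(nodeB)); // prints "true"
    }
}
```

A 64-bit digest can collide in principle, so equal digests are strong evidence of convergence rather than proof; differing digests do prove the registries differ.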

Two trees, one binary. plumtreekv.protocol selects the dissemination protocol: multi (MultiPlumtree, per-writer trees; the default) or single (Plumtree, one shared tree).

Self-healing, no anti-entropy. Plumtree’s lazy push is its recovery: a missed update is announced via IHAVE, and the receiver grafts it — so a late joiner, or the survivors after a node dies, recover what they missed and reach full convergence on a connected overlay without a separate reconciliation protocol.
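The recovery path can be illustrated with a toy, in-memory simulation of three nodes. It is a deliberately simplified sketch and not the plumtree library's code: messages are plain method calls, a missing update is grafted immediately on IHAVE (a real implementation waits on a timer first), and the Node class, the lossyLinks set, and all method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class PlumtreeSketch {
    /** Directed links ("A->B") on which eager payloads are dropped, to simulate loss. */
    static final Set<String> lossyLinks = new HashSet<>();

    static final class Node {
        final String id;
        final Set<Node> eager = new LinkedHashSet<>(); // tree links: full payload
        final Set<Node> lazy = new LinkedHashSet<>();  // other links: IHAVE only
        final Map<Integer, String> seen = new HashMap<>();

        Node(String id) { this.id = id; }

        void broadcast(int msgId, String payload) { onGossip(msgId, payload, null); }

        void onGossip(int msgId, String payload, Node from) {
            if (seen.containsKey(msgId)) { // duplicate: demote the redundant link
                if (from != null) { moveToLazy(from); from.onPrune(this); }
                return;
            }
            seen.put(msgId, payload); // first receipt: deliver
            if (from != null) moveToEager(from);
            for (Node p : new ArrayList<>(eager)) { // eager push down the tree
                boolean dropped = lossyLinks.contains(id + "->" + p.id);
                if (p != from && !dropped) p.onGossip(msgId, payload, this);
            }
            for (Node p : new ArrayList<>(lazy)) { // lazy announce elsewhere
                if (p != from) p.onIHave(msgId, this);
            }
        }

        void onIHave(int msgId, Node from) {
            if (seen.containsKey(msgId)) return; // already have it
            moveToEager(from);                   // graft the announcer into the tree
            from.onGraft(msgId, this);
        }

        void onGraft(int msgId, Node from) {
            moveToEager(from);
            from.onGossip(msgId, seen.get(msgId), this); // send the missing payload
        }

        void onPrune(Node from) { moveToLazy(from); }

        private void moveToEager(Node p) { lazy.remove(p); eager.add(p); }
        private void moveToLazy(Node p) { eager.remove(p); lazy.add(p); }
    }

    /** Connects two nodes with a symmetric eager or lazy link. */
    static void link(Node a, Node b, boolean eagerLink) {
        (eagerLink ? a.eager : a.lazy).add(b);
        (eagerLink ? b.eager : b.lazy).add(a);
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        link(a, b, true);  // tree: A - B - C
        link(b, c, true);
        link(a, c, false); // A - C is a lazy link
        lossyLinks.add("B->C"); // the eager push from B to C is lost
        a.broadcast(1, "colour=red");
        System.out.println(c.seen.containsKey(1) + " " + c.eager.contains(a)); // prints "true true"
    }
}
```

In this run C never receives the eager push from B, learns of the update from A's IHAVE, grafts A, and ends up with A as its eager parent, so the tree is repaired by the same mechanism that recovered the message.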

Snapshot sync. On joining, a node asks one neighbour (point-to-point over HyParView’s channel) for the current registry, so a late joiner sees existing state immediately instead of waiting for new writes.


Configuration

Every value can come from babel_config.properties (bundled) or be overridden on the command line as key=value — which also makes the demo easy to script for automated, headless runs.
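The override rule (bundled defaults first, then key=value arguments on top) can be sketched with java.util.Properties. This is an illustration of the layering only; it is not Babel's own configuration loader, and the class and method names are hypothetical. The derived UI port uses the babel.port + 2000 default described in this README.

```java
import java.util.Properties;

final class ConfigOverrideSketch {
    /** Returns a copy of the bundled properties with key=value arguments applied on top. */
    static Properties withOverrides(Properties bundled, String[] args) {
        Properties merged = new Properties();
        merged.putAll(bundled);
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (eq <= 0) continue; // not key=value: ignored in this sketch
            merged.setProperty(arg.substring(0, eq), arg.substring(eq + 1));
        }
        return merged;
    }

    /** Web UI port: explicit plumtreekv.ui.port if set, otherwise babel.port + 2000. */
    static int uiPort(Properties cfg) {
        int babelPort = Integer.parseInt(cfg.getProperty("babel.port"));
        return Integer.parseInt(cfg.getProperty("plumtreekv.ui.port", String.valueOf(babelPort + 2000)));
    }

    public static void main(String[] args) {
        Properties bundled = new Properties();
        bundled.setProperty("babel.port", "6000"); // default from the table in this README
        String[] cli = {"babel.port=6010", "HyParView.contact=127.0.0.1:6000"};
        Properties cfg = withOverrides(bundled, cli);
        System.out.println(cfg.getProperty("babel.port") + " " + uiPort(cfg)); // prints "6010 8010"
    }
}
```

Splitting on the first '=' only keeps values such as host:port pairs, or values that themselves contain '=', intact.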

Process-wide & overlay

| Property | Default | Description |
| --- | --- | --- |
| babel.port | 6000 | TCP port HyParView binds (the dissemination protocol shares it). Space local nodes by ≥ 10. |
| babel.interface / babel.address | auto / (none) | NIC to bind/announce on, or an explicit IP. No loopback default: use babel.address=127.0.0.1 for several nodes on one disconnected machine. |
| babel.discovery | (unset) | Opt-in multicast LAN auto-discovery; set on the command line to pt.unl.fct.di.novasys.babel.core.protocols.discovery.MulticastDiscoveryProtocol. Off by default; bootstrap is via HyParView.contact. |
| babel.discovery.unicast.port | 1026 | Per-process discovery socket; only used when multicast is enabled, and must be distinct per local node. |
| HyParView.contact | (absent) | Bootstrap: none = first node; host:port = dial that node to join; absent = wait for discovery (only useful with multicast on). |
| HyParView.ActiveView / PassiveView / … | 5 / 10 / … | HyParView view sizes (keep active ≥ 4 so the spanning tree stays connected) and walk lengths; see the config file. |
| MultiPlumtree.LazyTickPeriod | 1000 | Period (ms) at which lazy IHAVE announcements are flushed/retried; bounds tree-repair latency. |
| MultiPlumtree.PeerAddressResolution | shared | Disseminate over HyParView’s channel (shared) rather than opening a second one of its own. |

PlumtreeKV application (plumtreekv.*)

| Property | Default | Description |
| --- | --- | --- |
| plumtreekv.protocol | multi | Dissemination protocol: multi (MultiPlumtree, per-writer trees) or single (Plumtree, one shared tree). |
| plumtreekv.ui.enabled | true | Serve the web UI. |
| plumtreekv.ui.port | babel.port + 2000 | Web UI port. |
| plumtreekv.ui.open | true | Open the system browser at the UI on startup (best-effort; set false to suppress, e.g. when running many local nodes). |
| plumtreekv.snapshot.sync | true | Fetch the current registry from a neighbour on join. |
| plumtreekv.digest.interval | 5000 | Period (ms) of convergence-digest telemetry; ≤ 0 disables. |
| plumtreekv.workload.enabled | false | Headless random-write driver (no UI needed). |
| plumtreekv.workload.rate / .keyspace / .duration | 2 / 16 / 0 | Writes/sec, number of distinct keys, run length (ms; 0 = unbounded). |
| plumtreekv.workload.startDelay | 5000 | Warm-up delay (ms) before writing begins, so the overlay can form first. Used in non-control mode (ignored when a control file is set). |
| plumtreekv.workload.controlFile | (unset) | Path to a shared control file for scripted runs (overrides startDelay). The file holds one token: RUN makes every node begin writing together, STOP ends it; any other value (conventionally WAIT) means keep waiting. This lets an external driver coordinate a synchronized start and a clean drain of in-flight messages before results are read. |
| plumtreekv.workload.controlPollMs | 250 | How often (ms) the control file is polled (control mode only). |
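The control-file convention (one token: RUN, STOP, or anything else meaning wait) can be read with a few lines of Java. The sketch below is an assumption-laden illustration rather than the demo's ControlTimer logic: the class and enum names are hypothetical, tokens are matched exactly in upper case after trimming whitespace, and a missing or unreadable file is treated as "keep waiting".

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ControlFileSketch {
    enum Command { RUN, STOP, WAIT }

    /** Maps the file content to a command; any token other than RUN or STOP means WAIT. */
    static Command parse(String content) {
        String token = content.trim();
        if (token.equals("RUN")) return Command.RUN;
        if (token.equals("STOP")) return Command.STOP;
        return Command.WAIT;
    }

    /** Reads the control file once; an unreadable file is treated as WAIT (assumption). */
    static Command read(Path controlFile) {
        try {
            return parse(Files.readString(controlFile));
        } catch (IOException e) {
            return Command.WAIT;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("plumtreekv-control", ".txt");
        try {
            Files.writeString(file, "WAIT\n");
            System.out.println(read(file)); // prints "WAIT"
            Files.writeString(file, "RUN\n"); // what an external driver would write
            System.out.println(read(file)); // prints "RUN"
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A node in control mode would call something like read(...) once per polling period and start or stop its workload when the returned command changes.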

Telemetry & validation

Alongside the human-readable protocol log (babel-plumtree-demo-<port>.log), each node writes a small machine-readable telemetry file (babel-plumtree-demo-telemetry-<port>.log) — one structured event per line (START / SET / DELIVER / DIGEST / NEIGHBOR_* / WRITE_START / WORKLOAD_STOP / SYNC_MERGE). That’s enough for an external driver to verify, automatically, that the demo really does what it claims: that every write reaches every node (coverage), how fast (latency, last-delivery hop), and that all nodes converge to the same registry — including, after a node failure, that the survivors still converge (tree repair).
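As a rough illustration of the coverage check an external driver might run, the sketch below takes telemetry events that have already been parsed and reports every write that some node never delivered. The on-disk line format is not described here, so parsing is left out; the Event record, its fields, and the assumption that every node (the origin included) logs a DELIVER for each op are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class CoverageSketch {
    /** Hypothetical parsed telemetry event: which node logged it, its type, and the op id. */
    record Event(String node, String type, String opId) {}

    /** Returns "node:opId" for each write (SET) that a node never logged as DELIVER. */
    static Set<String> missing(Set<String> nodes, List<Event> events) {
        Set<String> written = new HashSet<>();
        Map<String, Set<String>> delivered = new HashMap<>();
        for (Event e : events) {
            if (e.type().equals("SET")) {
                written.add(e.opId());
            } else if (e.type().equals("DELIVER")) {
                delivered.computeIfAbsent(e.node(), n -> new HashSet<>()).add(e.opId());
            }
        }
        Set<String> gaps = new TreeSet<>(); // sorted for stable reporting
        for (String node : nodes) {
            Set<String> got = delivered.getOrDefault(node, Set.of());
            for (String op : written) {
                if (!got.contains(op)) gaps.add(node + ":" + op);
            }
        }
        return gaps;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("6000", "SET", "op-1"),
                new Event("6000", "DELIVER", "op-1"),
                new Event("6010", "DELIVER", "op-1"),
                new Event("6010", "SET", "op-2"),
                new Event("6010", "DELIVER", "op-2")); // node 6000 never delivers op-2
        System.out.println(missing(Set.of("6000", "6010"), events)); // prints "[6000:op-2]"
    }
}
```

An empty result means full coverage; comparing each node's final DIGEST value would be the complementary convergence check.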


Building from source

mvn package          # → target/babel-plumtree-demo.jar (fat JAR, mainClass Main)

Depends on the ParadigmShift Babel libraries — babel-core, babel-protocols-common, hyparview, and plumtree (the Plumtree / MultiPlumtree protocol library) — from the ParadigmShift Maven repository, which the local build and CI resolve directly.

Project layout

src/main/java/
  Main.java                              wiring: HyParView + MultiPlumtree|Plumtree + PlumtreeKVApp
  protocols/apps/registry/
    PlumtreeKVApp.java                    the application protocol (slot 300)
    ConfigOp.java                         one set/delete op + last-writer-wins order
    ConfigPayload.java                    ConfigOp ⇄ broadcast bytes
    Registry.java                         LWW key→value store + convergence digest + snapshot
    messages/RegistrySyncMessage.java     point-to-point snapshot request/reply
    telemetry/Telemetry.java              structured telemetry events
    timers/{DigestTimer,WorkloadTimer,ControlTimer}.java
    ui/WebUi.java                         embedded HTTP server (JDK built-in)
  utils/InterfaceToIp.java                bind-address resolution (shared with babel-demo)
src/main/resources/
  babel_config.properties, log4j2.xml, web/{index.html,app.js,style.css}

Distribution

PlumtreeKV is a runnable demo, not a library — it is never deployed to the ParadigmShift Maven repository. CI builds the fat JAR and attaches it to a GitHub Release on a v*.*.* tag, and publishes the API docs to GitHub Pages.

Credits & further reading

PlumtreeKV is a ParadigmShift tech demo. The Babel framework and the protocol implementations it builds on — Babel itself, the HyParView membership protocol, and the Plumtree epidemic-broadcast-tree — were originally developed at NOVA LINCS, in the TaRDIS European project, by the Computer Systems Group at NOVA FCT. The versions used here are ParadigmShift’s own, provided and evolved independently of that original work.

The protocols underpinning this demo are described in:

The MultiPlumtree (per-writer-tree) variant additionally adopts design elements (per-root trees and the lazy re-advertisement / acknowledgement model) from Riak’s riak_core_broadcast, originally implemented by Jordan West. No code from that project is included; see the plumtree library for the full acknowledgement.

License

ParadigmShift Proprietary License — non-commercial use permitted; commercial use requires a written licence from ParadigmShift, Lda. See LICENSE.