brood

Autonomous mesh networking fabric.
Self-registering nodes, zero config distribution, full-mesh WireGuard,
VXLAN/EVPN L2 overlay, BGP routing — all from a single DNS domain.

100 nodes / brood
< 1 s BFD failure detection
0 config files distributed
broods in federation

concepts

A small vocabulary — everything else follows from these definitions.

brood

A single mesh of up to 100 nodes sharing a WireGuard full-mesh and a common BGP AS number. All nodes run identical container images.

worker

A regular node inside a brood. Runs brood-agent and FRR. Participates in the WireGuard mesh and VXLAN/EVPN overlay.

scout

A border node. Also runs brood-api, serves as BGP route reflector, and holds trails to other broods. Identical image — role assigned at approval time.

trail

An inter-brood link. A WireGuard tunnel + eBGP session between two scouts from different broods. Symmetric — both sides opt in explicitly.

vlan

A named L2 segment within a brood. Each vlan has a VNI, subnet, and explicit member list. Isolation enforced at the EVPN layer — zero leakage.

DNS state

No central database. All brood state lives in Cloudflare DNS TXT and SRV records. Nodes read via DoH — brood-api writes via the Cloudflare API.

architecture

Two deployment modes — single mesh or federated multi-brood.

[Diagram: single brood, full WireGuard mesh]
Full WireGuard mesh — n(n−1)/2 tunnels, single wg0 interface per node. Scouts act as BGP route reflectors. BFD on every session.
[Diagram: multi-brood federation]
Each brood is a separate BGP AS. Scouts hold trails to scouts in other broods. Parallel trails give ECMP load-balancing or primary/backup redundancy.
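The n(n−1)/2 figure above is easy to sanity-check. The sketch below is only arithmetic on the stated formula; the function names are illustrative and not part of brood.

```python
def tunnel_count(nodes: int) -> int:
    """Number of point-to-point tunnels in a full mesh of `nodes` peers."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1) // 2


def peers_per_node(nodes: int) -> int:
    """Each node holds one WireGuard peer entry for every other node."""
    return max(nodes - 1, 0)


for n in (3, 10, 100):
    print(f"{n} nodes: {tunnel_count(n)} tunnels, {peers_per_node(n)} peers on wg0")
# 100 nodes: 4950 tunnels, 99 peers on wg0
```

At the 100-node ceiling each node carries 99 peers on its single wg0 interface, which is the per-node cost the brood size limit keeps bounded.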

how it works

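A worker needs exactly one environment variable, BROOD_DOMAIN, to join a brood. Scouts additionally need the API and Cloudflare credentials listed in the quick reference.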

01

identity

Node generates an Ed25519 keypair on first start. Curve25519 is derived from it for WireGuard — no separate key needed. Node ID is hex(sha256(pubkey))[:10].

node.key → wg private key (derived, no storage)
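A minimal sketch of the Node ID rule, hex(sha256(pubkey))[:10]. It assumes the hash is taken over the raw 32-byte Ed25519 public key; the source does not say whether an encoded form is hashed instead. The sample key is a placeholder, not a real key.

```python
import hashlib


def node_id(pubkey: bytes) -> str:
    """First 10 hex characters of SHA-256 over the raw public key (assumed input form)."""
    if len(pubkey) != 32:
        raise ValueError("an Ed25519 public key is 32 bytes")
    return hashlib.sha256(pubkey).hexdigest()[:10]


# Placeholder bytes standing in for a real Ed25519 public key.
example_pubkey = bytes(range(32))
print(node_id(example_pubkey))
```

Ten hex characters carry 40 bits, which leaves plenty of room for a 100-node brood and keeps the ID short enough to use as a DNS label.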
02

discovery

Node queries _brood._tcp.<domain> SRV — gets the brood-api endpoint(s), sorted by priority. Multiple scouts = automatic HA.

_brood._tcp.brood.example.com SRV 0 100 8443 a3f9c821b0.brood.example.com
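The ordering step can be sketched as follows. This assumes the SRV data arrives as the usual "priority weight port target" string and that lower priority values are tried first, as in standard SRV semantics. Sorting equal-priority records by weight is a simplification of the weighted random selection that SRV defines. The second scout hostname is made up for the example.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(rdata: str) -> SrvRecord:
    """Parse 'priority weight port target' SRV record data."""
    priority, weight, port, target = rdata.split()
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def order_endpoints(rdatas: list[str]) -> list[SrvRecord]:
    """Lowest priority value first; heavier weight first within a priority."""
    records = [parse_srv(r) for r in rdatas]
    return sorted(records, key=lambda r: (r.priority, -r.weight))


answers = [
    "10 100 8443 b7d2e14f33.brood.example.com",  # hypothetical backup scout
    "0 100 8443 a3f9c821b0.brood.example.com",
]
for rec in order_endpoints(answers):
    print(f"https://{rec.target}:{rec.port}")
```

A node would try the endpoints in this order and fall through to the next scout on failure.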
03

registration

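Node runs STUN (dual-stack IPv4+IPv6) to learn its public endpoints, then POSTs its pubkey, hostname, and endpoints to /register over HTTPS. No pre-shared secrets are involved: the Ed25519 signature on the request proves the node holds the private key for the pubkey it submits.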

POST /register · {pk, hostname, ep, ts, sig}
04

approval

Operator reviews pending nodes and approves with a role assignment. brood-api derives the overlay IP from the pubkey hash, writes the _n record, and delivers the brood_key via a sealed _k record.

POST /approve/a3f9c821b0 · {role: "worker"}
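The source says the overlay IP is derived from the pubkey hash but does not give the mapping. The sketch below shows one possible scheme, purely as an assumption: reduce the hash modulo the number of usable hosts in the brood subnet. The function name, the 10.0.0.0/24 subnet, and the choice of the first 8 digest bytes are all hypothetical.

```python
import hashlib
import ipaddress


def overlay_ip(pubkey: bytes, subnet: str) -> ipaddress.IPv4Address:
    """Hypothetical mapping from a pubkey hash to a host address in `subnet`."""
    net = ipaddress.IPv4Network(subnet)
    usable = net.num_addresses - 2  # skip network and broadcast addresses
    if usable < 1:
        raise ValueError("subnet has no usable host addresses")
    digest = hashlib.sha256(pubkey).digest()
    offset = int.from_bytes(digest[:8], "big") % usable
    return net.network_address + 1 + offset


print(overlay_ip(bytes(range(32)), "10.0.0.0/24"))
```

Any hash-based mapping like this can collide, so a real implementation would need a collision check against already admitted nodes before writing the _n record.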
05

sync loop

Every 30 seconds: read _n records, decrypt peer endpoints, diff WireGuard peers, diff VXLAN interfaces, regenerate FRR config if changed, send heartbeat. Stale nodes (90s silence) are automatically excluded.

_n → decrypt → wg0 diff → vxlan diff → frr reload
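The peer-diff part of the loop can be sketched in a few lines. This assumes each decrypted _n record reduces to a pubkey, an endpoint string, and a last-heartbeat timestamp; the dictionary layout and function names are illustrative, and only the 90-second staleness threshold comes from the text above.

```python
STALE_AFTER = 90  # seconds of silence before a node is excluded


def live_peers(records: dict, now: float) -> dict:
    """Keep only peers whose last heartbeat is within the staleness window."""
    return {
        pk: rec["endpoint"]
        for pk, rec in records.items()
        if now - rec["ts"] <= STALE_AFTER
    }


def diff_peers(current: dict, desired: dict):
    """Return (to_add, to_remove, to_update) between two pubkey -> endpoint maps."""
    to_add = {pk: ep for pk, ep in desired.items() if pk not in current}
    to_remove = sorted(pk for pk in current if pk not in desired)
    to_update = {
        pk: ep for pk, ep in desired.items()
        if pk in current and current[pk] != ep
    }
    return to_add, to_remove, to_update


records = {
    "A": {"endpoint": "198.51.100.1:51820", "ts": 1000},
    "B": {"endpoint": "198.51.100.2:51820", "ts": 900},   # silent for 100 s
    "C": {"endpoint": "198.51.100.3:51820", "ts": 995},
}
current = {"A": "198.51.100.9:51820", "B": "198.51.100.2:51820"}
print(diff_peers(current, live_peers(records, now=1000)))
```

Applying only the difference keeps existing tunnels untouched on every pass, so a sync with no changes is a no-op for wg0.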
06

key rotation

When a node is removed, brood-api generates a new brood_key, re-seals it for every surviving node, and bumps the _meta kv counter. Surviving nodes detect the bump and rotate automatically on the next sync.

DELETE /nodes/a3f9c821b0 → new brood_key → _k per survivor
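On the surviving nodes, detecting the bump comes down to comparing a stored counter with the kv field in _meta. A minimal sketch, assuming the _meta content is a whitespace-separated list of key=value fields and that kv is a monotonically increasing integer; the sample values are invented.

```python
def parse_kv(meta_txt: str) -> int:
    """Extract the kv counter from a _meta record's text content."""
    for field in meta_txt.split():
        key, _, value = field.partition("=")
        if key == "kv":
            return int(value)
    raise ValueError("no kv field in _meta record")


def needs_rotation(local_kv: int, meta_txt: str) -> bool:
    """True when the published counter is ahead of the one this node last saw."""
    return parse_kv(meta_txt) > local_kv


meta = "asn=65001 subnet=10.0.0.0/16 bgp=rr kv=4 keepalive=30"  # sample values
print(needs_rotation(3, meta))  # True
print(needs_rotation(4, meta))  # False
```

When the check returns true, the node would fetch its freshly sealed _k record and switch to the new brood_key before decrypting _n records again.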

cryptography

One keypair per node. Everything else is derived or ephemeral.

Ed25519 keypair · node.key · persisted
Curve25519 · WireGuard identity · derived, never stored
Node ID · sha256(pubkey)[:10] · DNS hostname prefix
Signatures · all DNS records · all API requests

brood_key

A 32-byte NaCl secretbox key shared among all admitted nodes. Encrypts endpoint blobs in _n records, hiding the mesh topology from outsiders. Rotated on every node removal.

transport

brood-api is served over HTTPS with a Let's Encrypt certificate obtained at bootstrap via Cloudflare DNS-01. No application-layer sealing on top — TLS handles it.

WireGuard

Peers derive each other's Curve25519 pubkey from the Ed25519 pubkey in the _n record — deterministically, no out-of-band exchange needed.

DNS as state store

No database. No etcd. No consul. Just TXT and SRV records.

record | name | content | purpose
SRV | _brood._tcp | 0 100 8443 <id10>.<domain> | API discovery · one per scout · HA + priority
_n | _n.<domain> | pk=… ep=<secretbox> ts=… sig=… | admitted nodes · ep encrypted with brood_key
_p | _p.<domain> | pk=… ep=<b64 JSON> ts=… sig=… | pending nodes · awaiting operator approval
_k | _k.<id10>._n | <NaCl sealed box> | brood_key delivery · short TTL · deleted after pickup
_vl | _vl.<domain> | pk=vni<N> vni=N name=… data=<secretbox> | vlan definitions · one per vlan
_m | _m.<vni>.<domain> | pk=… data=<secretbox> | vlan members · one per node/vlan pair
_t | _t.<domain> | remote=… scout=… ep=… overlay=… priority=N | inter-brood trails · one per remote brood
_meta | _meta.<domain> | asn=… subnet=… bgp=rr kv=… keepalive=… | brood-wide config · version counter for key rotation
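Most of the records above share a key=value layout, so reading them can be sketched with a small parser. This assumes fields are separated by whitespace and that values contain no spaces; splitting on the first "=" only keeps base64 padding intact. The field values in the example are placeholders.

```python
def parse_record(txt: str) -> dict[str, str]:
    """Split 'k1=v1 k2=v2 …' into a dict, splitting each field on its first '='."""
    fields = {}
    for token in txt.split():
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed field: {token!r}")
        fields[key] = value
    return fields


REQUIRED_N_FIELDS = ("pk", "ep", "ts", "sig")


def parse_n_record(txt: str) -> dict[str, str]:
    """Parse an _n record and check that the documented fields are present."""
    fields = parse_record(txt)
    missing = [k for k in REQUIRED_N_FIELDS if k not in fields]
    if missing:
        raise ValueError(f"_n record missing fields: {missing}")
    return fields


sample = "pk=QUJD ep=ZGF0YQ== ts=1700000000 sig=c2ln"  # placeholder values
print(parse_n_record(sample)["ep"])  # ZGF0YQ==
```

Signature verification and secretbox decryption of ep would follow this step; both need a NaCl binding and are left out of the sketch.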

L2 overlay · vlans

Named L2 segments with explicit membership. Isolated at the EVPN layer.

vlan 100 · prod · 10.100.0.0/24
web01 .1 · web02 .2 · db01 .4

vlan 200 · storage · 10.200.0.0/24
db01 .1 · stor01 .2

web01 ↔ stor01: WireGuard tunnel exists, zero EVPN routes exchanged

per-VNI isolation

Each vlan gets its own VNI. RT is auto-derived as <ASN>:<VNI>. A node only receives MAC routes for vlans it is a member of — zero leakage across segments.
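The route-target rule is simple enough to show directly. The sketch below builds the <ASN>:<VNI> strings a node would import for its memberships, using the ASN from the bootstrap example and the vlans from the diagram above. The function names are illustrative; the range check reflects the 24-bit width of a VXLAN VNI.

```python
MAX_VNI = 2**24 - 1  # VXLAN network identifiers are 24 bits wide


def route_target(asn: int, vni: int) -> str:
    """Route target for one vlan, formatted as <ASN>:<VNI>."""
    if not 1 <= vni <= MAX_VNI:
        raise ValueError(f"VNI out of range: {vni}")
    return f"{asn}:{vni}"


def import_targets(asn: int, memberships: list[int]) -> list[str]:
    """Route targets a node imports: one per vlan it belongs to, nothing else."""
    return [route_target(asn, vni) for vni in sorted(set(memberships))]


print(import_targets(65001, [100]))       # web01 -> ['65001:100']
print(import_targets(65001, [100, 200]))  # db01  -> ['65001:100', '65001:200']
```

Because web01 never imports 65001:200, it holds no MAC routes for the storage vlan even though a WireGuard tunnel to stor01 exists.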

ARP suppression

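FRR advertises SVI IPs via advertise-svi-ip. The VXLAN bridge ports have neigh_suppress enabled, so the Linux bridge answers ARP locally from the neighbor entries that zebra installs from EVPN MAC/IP routes. Nothing is flooded across the mesh and ARP traffic stays near zero.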

multi-vlan nodes

A node can belong to multiple vlans simultaneously, each with a different IP. The operator assigns IPs explicitly at membership time via POST /vlans/<vni>/members.
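Since member IPs are assigned by hand, a check before calling POST /vlans/<vni>/members can catch typos. This is a sketch of such a client-side check, not brood-api's own validation, which the source does not describe. The function name and the set of taken addresses are assumptions.

```python
import ipaddress


def validate_member_ip(subnet: str, ip: str, taken: set[str]):
    """Return the parsed address if it is a free host address inside `subnet`."""
    net = ipaddress.ip_network(subnet)
    addr = ipaddress.ip_address(ip)
    if addr not in net:
        raise ValueError(f"{ip} is outside {subnet}")
    if addr in (net.network_address, net.broadcast_address):
        raise ValueError(f"{ip} is not a usable host address in {subnet}")
    if addr in {ipaddress.ip_address(t) for t in taken}:
        raise ValueError(f"{ip} is already assigned in this vlan")
    return addr


# db01 from the example above: .4 in prod, .1 in storage.
print(validate_member_ip("10.100.0.0/24", "10.100.0.4", {"10.100.0.1", "10.100.0.2"}))
print(validate_member_ip("10.200.0.0/24", "10.200.0.1", set()))
```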

quick reference

Bootstrap, deploy, and operate a brood.

First scout — one-time bootstrap

# Set env and run bootstrap
BROOD_DOMAIN=brood.example.com \
BROOD_CF_TOKEN=$CF_TOKEN \
BROOD_CF_ZONE_ID=$CF_ZONE_ID \
BROOD_ASN=65001 \
brood-agent bootstrap

# Bootstrap will:
#  1. generate node.key + brood.key
#  2. STUN — discover public endpoint
#  3. create A record: <id10>.brood.example.com
#  4. obtain TLS cert via Let's Encrypt DNS-01
#  5. write SRV: _brood._tcp.brood.example.com
#  6. write _meta, _n, _k records
#  7. start brood-api + brood-agent + FRR

Additional scouts

# Additional scouts register normally, then get approved with role=scout.
# Distribute brood.key out-of-band (quadlet env / k8s secret).
curl -X POST https://brood.example.com:8443/approve/b7d2e14f33 \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"role": "scout"}'

Quadlet (systemd, recommended)

# /etc/brood/brood.env
BROOD_DOMAIN=brood.example.com
BROOD_API_TOKEN=secret            # scouts only
BROOD_CF_TOKEN=cf_token          # scouts only
BROOD_CF_ZONE_ID=zone_id         # scouts only

# Worker: 2 containers
/etc/containers/systemd/brood-agent.container
/etc/containers/systemd/brood-frr.container

# Scout: 3 containers (add brood-api)
/etc/containers/systemd/brood-api.container

Ansible playbook

# deploy worker
mise run play pb/brood.yml -i inv/put \
  -l myhost -e brood_role=worker

# deploy scout
mise run play pb/brood.yml -i inv/put \
  -l scout1 -e brood_role=scout

# nuke everything
mise run play pb/brood.yml -i inv/put \
  -l myhost --tags nuke

Kubernetes DaemonSet

kubectl apply -f deploy/k8s/daemonset.yaml

Node lifecycle

# List pending nodes
curl https://brood.example.com:8443/pending \
  -H "Authorization: Bearer $TOKEN"

# Approve a node
curl -X POST https://brood.example.com:8443/approve/a3f9c821b0 \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"role": "worker"}'

# Remove a node (triggers key rotation)
curl -X DELETE https://brood.example.com:8443/nodes/a3f9c821b0 \
  -H "Authorization: Bearer $TOKEN"

# List all members
curl https://brood.example.com:8443/members \
  -H "Authorization: Bearer $TOKEN"

Vlan management

# Create a vlan
curl -X POST https://brood.example.com:8443/vlans \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"vni": 100, "name": "prod", "subnet": "10.100.0.0/24"}'

# Add a member
curl -X POST https://brood.example.com:8443/vlans/100/members \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"id": "a3f9c821b0", "ip": "10.100.0.1"}'

# List members
curl https://brood.example.com:8443/vlans/100/members \
  -H "Authorization: Bearer $TOKEN"

Adding a trail between two broods

# Both sides must opt in. Run on brood-a:
curl -X POST https://brood-a.example.com:8443/trails \
  -H "Authorization: Bearer $TOKEN_A" \
  -d '{"remote_domain": "brood-b.example.com", "priority": 100}'

# Run on brood-b:
curl -X POST https://brood-b.example.com:8443/trails \
  -H "Authorization: Bearer $TOKEN_B" \
  -d '{"remote_domain": "brood-a.example.com", "priority": 100}'

# Trail becomes active on next agent sync (≤30s).
# WireGuard peer added, BFD + eBGP session established.

Parallel trails (ECMP / primary-backup)

# Equal priority → ECMP load balancing
curl -X POST .../trails -d '{"remote_domain": "brood-b.example.com", "priority": 100}'
curl -X POST .../trails -d '{"remote_domain": "brood-b.example.com", "priority": 100}'

# Different priority → primary/backup
curl -X POST .../trails -d '{"remote_domain": "brood-b.example.com", "priority": 100}'
curl -X POST .../trails -d '{"remote_domain": "brood-b.example.com", "priority": 50}'
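The rule behind the two cases above can be written as a small classifier over the priorities of all trails to one remote brood. This is a sketch of the stated rule only: the source does not say whether the higher or the lower number is the primary, so the code deliberately does not pick one. The function name and return labels are made up for the example.

```python
def trail_mode(priorities: list[int]) -> str:
    """Classify a set of parallel trails to the same remote brood."""
    if not priorities:
        return "none"
    if len(priorities) == 1:
        return "single"
    # All equal -> traffic is balanced; any difference -> one trail is preferred.
    return "ecmp" if len(set(priorities)) == 1 else "primary-backup"


print(trail_mode([100, 100]))  # ecmp
print(trail_mode([100, 50]))   # primary-backup
```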

Removing a trail

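# Remove from both sides for clean teardown (brood-a first, then the mirror call on brood-b)
curl -X DELETE https://brood-a.example.com:8443/trails/brood-b.example.com \
  -H "Authorization: Bearer $TOKEN_A"
curl -X DELETE https://brood-b.example.com:8443/trails/brood-a.example.com \
  -H "Authorization: Bearer $TOKEN_B"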