Architecture Maps

Fly.io Platform Architecture

Interactive architecture map of Fly.io's global application platform — Firecracker microVMs, Anycast networking, CRDT state replication, and the systems that make apps run close to users.

Public Sources Only · Firecracker / KVM · Rust + Go · Updated: Mar 2026
01

Platform Overview

Fly.io runs user applications inside Firecracker microVMs on bare-metal servers in 30+ regions worldwide. BGP Anycast routes traffic to the nearest datacenter, where a Rust proxy layer forwards requests to the right VM. The platform replaced HashiCorp Consul and Nomad with custom-built alternatives: Corrosion for distributed state, flyd for orchestration.

VM Boot Time: ~125ms
Per-VM Overhead: <5 MiB
VM Creation Rate: 150/s per host
Regions: 30+
Global State Sync: ~1-2s
End-to-End Request Flow
graph LR
    CLIENT["Client"] --> ANYCAST["BGP Anycast<br/>IP Routing"]
    ANYCAST --> PROXY["Fly Proxy<br/>(Rust)"]
    PROXY --> CORR["Corrosion<br/>(Service Discovery)"]
    PROXY --> VM["Firecracker<br/>MicroVM"]
    VM --> VOL["NVMe<br/>Volume"]
    VM --> SPN["6PN Private<br/>Network"]
    SPN --> PG["Fly Postgres"]

    style CLIENT fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style ANYCAST fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style PROXY fill:#1A1614,stroke:#D4220A,color:#E8DDD0
    style CORR fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style VM fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VOL fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style SPN fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style PG fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
02

Firecracker MicroVM Runtime

Every application workload runs inside a Firecracker microVM. Firecracker, a lightweight virtual machine monitor developed by AWS, uses Linux KVM for hardware-level isolation and exposes only 5 emulated devices to guests, a minimal attack surface compared to full QEMU virtualization.
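Firecracker is driven over a local API socket; its `PUT /machine-config` call takes fields like `vcpu_count` and `mem_size_mib`. A minimal sketch of building that request body in Go (field set varies by Firecracker version, and a real client would send this over the Unix domain socket):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MachineConfig mirrors the body of Firecracker's PUT /machine-config
// call. vcpu_count, mem_size_mib, and smt are Firecracker API fields;
// the exact option set depends on the Firecracker version.
type MachineConfig struct {
	VcpuCount  int  `json:"vcpu_count"`
	MemSizeMib int  `json:"mem_size_mib"`
	Smt        bool `json:"smt"`
}

// EncodeMachineConfig serializes a guest sizing request, as a host
// agent would before PUTting it to the Firecracker API socket.
func EncodeMachineConfig(vcpus, memMib int) (string, error) {
	b, err := json.Marshal(MachineConfig{VcpuCount: vcpus, MemSizeMib: memMib})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, _ := EncodeMachineConfig(2, 1024)
	fmt.Println(body) // {"vcpu_count":2,"mem_size_mib":1024,"smt":false}
}
```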

OCI Image to MicroVM Pipeline
graph TD
    subgraph Build["Image Source"]
        OCI["OCI / Docker<br/>Image"]
    end
    subgraph Convert["Conversion Pipeline"]
        REG["Docker Registry<br/>API Pull"]
        CTD["containerd<br/>Snapshotter"]
        LVM["LVM2 Thin<br/>Provisioning"]
        COW["Copy-on-Write<br/>Snapshot"]
    end
    subgraph Runtime["Firecracker Runtime"]
        INIT["/init (Rust)<br/>Custom Init Process"]
        NET["DNS + 6PN<br/>Configuration"]
        SSH["WireGuard<br/>SSH Server"]
        APP["Application<br/>Entrypoint"]
    end
    subgraph Devices["5 Emulated Devices"]
        VN["virtio-net"]
        VB["virtio-block"]
        VS["virtio-vsock"]
        SC["Serial Console"]
        KB["Keyboard Ctrl"]
    end
    OCI --> REG --> CTD --> LVM --> COW
    COW --> INIT
    INIT --> NET
    INIT --> SSH
    INIT --> APP
    INIT --> Devices

    style OCI fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style INIT fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style CTD fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style LVM fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style APP fill:#1A1614,stroke:#D4220A,color:#E8DDD0
Performance Profile

First deploy is slower due to image pull and snapshot creation. Subsequent boots reuse cached LVM snapshots and complete in ~125ms to user-space code, ~300ms for full Machine startup. Host hardware runs 8–32 physical CPU cores with 32–256 GB RAM. RootFS limit: 8 GB for non-GPU Machines.

KVM Isolation

Hardware-level virtualization via Linux KVM — each VM runs its own kernel with no shared userspace

Compute

Minimal Attack Surface

Only 5 emulated devices exposed to guests, compared to dozens in full QEMU

Security

Best-Effort CPU

Cores only do work for one microVM at a time — no CPU steal time between tenants

Compute
03

Anycast Networking & Fly Proxy

Fly.io announces global IPv4 and IPv6 address blocks from all datacenters using BGP Anycast. Standard internet routing delivers clients to the nearest datacenter. Fly Proxy, a Rust-based proxy on every server, matches incoming connections to customer apps and forwards traffic.

Anycast Routing & Proxy Layer
graph TD
    subgraph Internet["Internet"]
        C1["Client<br/>(US West)"]
        C2["Client<br/>(Europe)"]
        C3["Client<br/>(Asia)"]
    end
    subgraph DC1["Datacenter: LAX"]
        P1["Fly Proxy"]
        VM1["App Machine"]
    end
    subgraph DC2["Datacenter: AMS"]
        P2["Fly Proxy"]
        VM2["App Machine"]
    end
    subgraph DC3["Datacenter: NRT"]
        P3["Fly Proxy"]
        VM3["App Machine"]
    end
    C1 -->|"Anycast"| P1
    C2 -->|"Anycast"| P2
    C3 -->|"Anycast"| P3
    P1 --> VM1
    P2 --> VM2
    P3 --> VM3
    P1 <-->|"WireGuard<br/>Backhaul"| P2
    P2 <-->|"WireGuard<br/>Backhaul"| P3

    style C1 fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style C2 fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style C3 fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style P1 fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style P2 fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style P3 fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style VM1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM3 fill:#241F1C,stroke:#F0A500,color:#E8DDD0

Fly Proxy Responsibilities

TLS Termination

Automatic TLS for all public apps, with protocol handlers for HTTP, Postgres (pg_tls), and PROXY protocol

Load Balancing

Routes based on concurrency settings, current machine load, and geographic proximity via RTT measurements

Autostart / Autostop

Automatically starts Machines on incoming requests and stops them when idle — scale-to-zero support

fly-replay Header

Response header that redirects requests to other regions, Machines, or apps for write-leader routing patterns

04

6PN Private Networking

Every Fly.io organization gets a 6PN (IPv6 Private Network) — a full WireGuard mesh connecting all applications within the org. Each microVM receives a unique IPv6 address encoding the app ID, org ID, and host hardware identifier.

WireGuard Mesh Architecture
graph TD
    subgraph Org["Organization Private Network (6PN)"]
        subgraph H1["Host A"]
            VM1["App 1<br/>Machine"]
            VM2["App 2<br/>Machine"]
        end
        subgraph H2["Host B"]
            VM3["App 1<br/>Replica"]
            VM4["App 3<br/>Machine"]
        end
        WG["WireGuard<br/>Cryptokey Routing"]
        EBPF["eBPF<br/>Access Control"]
        DNS[".internal DNS<br/>Discovery"]
    end
    EXT["External Client<br/>(WireGuard VPN)"]
    VM1 <--> WG
    VM2 <--> WG
    VM3 <--> WG
    VM4 <--> WG
    WG --> EBPF
    DNS --> WG
    EXT -->|"VPN Peer"| WG

    style VM1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM3 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM4 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style WG fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style EBPF fill:#1A1614,stroke:#D4220A,color:#E8DDD0
    style DNS fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style EXT fill:#D4220A,stroke:#E8650A,color:#E8DDD0
No Dynamic Routing Protocol

6PN uses WireGuard's cryptokey routing to handle packet delivery without a dynamic routing protocol. eBPF programs enforce access control and statically route IPv6 packets along the mesh. DNS-based discovery via .internal addresses enables apps to locate other instances by region or app name.
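The `<app>.internal` and `<region>.<app>.internal` name forms are documented Fly.io DNS behavior; the app name below is a placeholder. A small helper that builds the discovery name an app would then resolve inside the mesh:

```go
package main

import "fmt"

// InternalHostname builds the .internal DNS name served by Fly's 6PN
// resolver for an app, optionally scoped to a region, e.g.
// "ams.myapp.internal". "myapp" is a placeholder app name.
func InternalHostname(app, region string) string {
	if region == "" {
		return app + ".internal" // all instances of the app
	}
	return region + "." + app + ".internal" // instances in one region
}

func main() {
	fmt.Println(InternalHostname("myapp", ""))    // myapp.internal
	fmt.Println(InternalHostname("myapp", "ams")) // ams.myapp.internal
	// Inside the mesh, resolution is then an ordinary lookup:
	//   addrs, err := net.LookupHost(InternalHostname("myapp", "ams"))
}
```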

05

Orchestration — flyd

flyd is Fly.io's Go-based orchestrator daemon running on every physical worker server. It replaced HashiCorp Nomad, which was abandoned because its bin-packing, asynchronous scheduling, and centralized consensus didn't fit Fly.io's geographic distribution and scale-from-zero requirements.

flyd Worker Architecture
graph TD
    subgraph Worker["Physical Worker Server"]
        FLYD["flyd<br/>(Go Daemon)"]
        BOLT["BoltDB<br/>(Append-Only Log)"]
        CTD["containerd<br/>(Image Conversion)"]
        subgraph VMs["Firecracker MicroVMs"]
            VM1["Machine A"]
            VM2["Machine B"]
            VM3["Machine C"]
        end
    end
    NATS["NATS<br/>(Coordination)"]
    FLAPS["flaps<br/>(API Proxy)"]
    FLAPS -->|"Schedule"| NATS
    NATS -->|"Reserve"| FLYD
    FLYD --> BOLT
    FLYD --> CTD
    FLYD --> VM1
    FLYD --> VM2
    FLYD --> VM3

    style FLYD fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style BOLT fill:#1A1614,stroke:#F0A500,color:#E8DDD0
    style NATS fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style FLAPS fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style VM1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM3 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style CTD fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0

Local Source of Truth

Each flyd instance is its own authority for local VM state — no consensus protocol between instances

Orchestration

State Machine Design

Rigidly structured as Go generics-based state machines for operations like "create a machine" or "delete a volume"

Orchestration
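The generics-based state-machine pattern described above can be sketched in a few lines. The types, state names, and transition table here are invented for illustration; flyd's actual state machines are not public:

```go
package main

import (
	"errors"
	"fmt"
)

// StateMachine is a generic, rigidly structured transition table, in
// the spirit of flyd's Go generics-based operation state machines.
type StateMachine[S comparable] struct {
	current S
	allowed map[S][]S
}

func NewStateMachine[S comparable](start S, allowed map[S][]S) *StateMachine[S] {
	return &StateMachine[S]{current: start, allowed: allowed}
}

// Transition advances only along edges listed in the table, so an
// operation can never skip or reorder its steps.
func (m *StateMachine[S]) Transition(next S) error {
	for _, s := range m.allowed[m.current] {
		if s == next {
			m.current = next
			return nil
		}
	}
	return errors.New("illegal transition")
}

func main() {
	// Hypothetical "create a machine" operation.
	create := NewStateMachine("reserved", map[string][]string{
		"reserved":     {"image_pulled"},
		"image_pulled": {"rootfs_built"},
		"rootfs_built": {"booted"},
	})
	fmt.Println(create.Transition("image_pulled")) // <nil>
	fmt.Println(create.Transition("booted"))       // illegal transition
}
```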

Market-Based Scheduling

Immediate-or-cancel orders: requests bid for resources, workers supply. Best-fit ranking via linear interpolation across utilization. Fails fast, never queues.

Orchestration
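An immediate-or-cancel placement can be sketched as below. The worker fields, the 50/50 blend of fit and utilization, and the specific scoring are guesses at what "best-fit ranking via linear interpolation" means; the real weighting inside flaps is not public:

```go
package main

import (
	"errors"
	"fmt"
)

// Worker is a capacity snapshot as a flyd host might report over NATS.
// The fields are invented for illustration.
type Worker struct {
	Name    string
	FreeMiB int     // memory headroom
	Util    float64 // overall utilization, 0.0 idle .. 1.0 full
}

// place implements an immediate-or-cancel order: rank the workers that
// can supply the request (blending waste and load by linear
// interpolation, lower score wins) and fail fast if none can. It never
// queues.
func place(workers []Worker, needMiB int) (string, error) {
	best, bestScore := "", 2.0
	for _, w := range workers {
		if w.FreeMiB < needMiB {
			continue // this worker cannot supply the bid
		}
		waste := float64(w.FreeMiB-needMiB) / float64(w.FreeMiB)
		score := 0.5*waste + 0.5*w.Util // lerp between fit and load
		if score < bestScore {
			best, bestScore = w.Name, score
		}
	}
	if best == "" {
		return "", errors.New("no capacity: order canceled")
	}
	return best, nil
}

func main() {
	ws := []Worker{
		{"edge-1", 2048, 0.9},
		{"edge-2", 4096, 0.2},
		{"edge-3", 512, 0.1},
	}
	host, _ := place(ws, 1024)
	fmt.Println(host) // edge-2: roomy and lightly loaded
}
```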
06

Corrosion — CRDT Cluster State

Corrosion is Fly.io's custom Rust service that replaced HashiCorp Consul. It propagates a SQLite database across all cluster nodes using CRDTs (Conflict-free Replicated Data Types), achieving eventual consistency without consensus, central servers, or locking.

Corrosion Replication Architecture
graph TD
    subgraph Node1["Node A"]
        SQL1["SQLite +<br/>cr-sqlite"]
        CHG1["crsql_changes<br/>Table"]
    end
    subgraph Node2["Node B"]
        SQL2["SQLite +<br/>cr-sqlite"]
        CHG2["crsql_changes<br/>Table"]
    end
    subgraph Node3["Node C"]
        SQL3["SQLite +<br/>cr-sqlite"]
        CHG3["crsql_changes<br/>Table"]
    end
    SWIM["SWIM Protocol<br/>(Membership)"]
    QUIC["QUIC Transport<br/>(Gossip + Sync)"]
    SQL1 --> CHG1
    SQL2 --> CHG2
    SQL3 --> CHG3
    CHG1 <-->|"Gossip"| QUIC
    CHG2 <-->|"Gossip"| QUIC
    CHG3 <-->|"Gossip"| QUIC
    QUIC --> SWIM
    PROXY["Fly Proxy"] -->|"Query"| SQL1
    PROXY -->|"Query"| SQL2

    style SQL1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style SQL2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style SQL3 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style QUIC fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style SWIM fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style PROXY fill:#D4220A,stroke:#E8650A,color:#E8DDD0
Design Philosophy

Inspired by link-state routing protocols (like OSPF) rather than consensus protocols (like Raft). Nodes randomly ping subsets of peers; failed heartbeats are marked "suspect" and validated by other random nodes for rapid convergence. Last-write-wins with logical (causal) timestamps resolves conflicts.
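The SWIM-style suspicion step described above reduces to one decision per probe round. This sketch collapses the protocol's timers and gossip into a single pure function so the logic is visible; names are invented:

```go
package main

import "fmt"

// probeVerdict decides one SWIM probe round for a peer. A failed
// direct ping does not fail a node outright: randomly chosen members
// probe it indirectly, and only if they also fail is the node marked
// "suspect" (and declared faulty after a gossip timeout, omitted here).
func probeVerdict(directAck bool, indirectAcks []bool) string {
	if directAck {
		return "alive"
	}
	for _, ack := range indirectAcks {
		if ack {
			return "alive" // a peer reached it; our network path was the problem
		}
	}
	return "suspect"
}

func main() {
	fmt.Println(probeVerdict(false, []bool{false, true, false})) // alive
	fmt.Println(probeVerdict(false, []bool{false, false}))       // suspect
}
```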

cr-sqlite (SQLite extension): CRDT-aware change tracking in the crsql_changes table
SWIM (membership protocol): Cluster membership management and failure detection
QUIC (transport): Broadcasting changes and reconciling state for new nodes
LWW registers (CRDT type): Last-write-wins resolution with causal timestamps
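A last-write-wins register is small enough to show whole. This is a generic sketch of the CRDT named above, not Corrosion's actual Rust types; the value and node names are invented:

```go
package main

import "fmt"

// LWW is a last-write-wins register with a causal timestamp.
type LWW struct {
	Value  string
	Clock  uint64 // logical (causal) timestamp
	NodeID string // tie-breaker so merges are deterministic
}

// Merge keeps the write with the higher clock; ties break on NodeID.
// Merge is commutative, associative, and idempotent, so every node
// converges to the same value regardless of gossip order, with no
// consensus round required.
func Merge(a, b LWW) LWW {
	if b.Clock > a.Clock || (b.Clock == a.Clock && b.NodeID > a.NodeID) {
		return b
	}
	return a
}

func main() {
	a := LWW{"10.0.0.1:8080", 7, "node-a"}
	b := LWW{"10.0.0.2:8080", 9, "node-b"}
	fmt.Println(Merge(a, b).Value)          // newer write wins
	fmt.Println(Merge(b, a) == Merge(a, b)) // gossip order does not matter
}
```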
07

Machines API — flaps

flaps is the stateless API proxy that brokers between client requests and regional flyd instances. It implements the Machines REST API for full VM lifecycle control: create, start, stop, destroy.

Machine Creation Flow
graph LR
    CLI["flyctl
(CLI)"] --> FLAPS["flaps
(API Proxy)"] FLAPS --> AUTH["Central DB
(Virginia)"] FLAPS --> NATS["NATS
(Worker Query)"] NATS --> W1["Worker 1
flyd"] NATS --> W2["Worker 2
flyd"] NATS --> W3["Worker 3
flyd"] FLAPS -->|"Best-fit
Ranking"| W2 W2 --> VM["New
Machine"] style CLI fill:#D4220A,stroke:#E8650A,color:#E8DDD0 style FLAPS fill:#1A1614,stroke:#E8650A,color:#E8DDD0 style AUTH fill:#1A1614,stroke:#F0A500,color:#E8DDD0 style NATS fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0 style W1 fill:#241F1C,stroke:#BFA58A,color:#E8DDD0 style W2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0 style W3 fill:#241F1C,stroke:#BFA58A,color:#E8DDD0 style VM fill:#D4220A,stroke:#F0A500,color:#E8DDD0

Machine Lifecycle States

State Transitions
graph LR
    CREATED["created"] --> STARTED["started"]
    STARTED --> STOPPED["stopped"]
    STOPPED --> STARTED
    STOPPED --> DESTROYED["destroyed"]
    STARTED --> DESTROYED

    style CREATED fill:#1A1614,stroke:#BFA58A,color:#E8DDD0
    style STARTED fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style STOPPED fill:#1A1614,stroke:#F0A500,color:#E8DDD0
    style DESTROYED fill:#1A1614,stroke:#D4220A,color:#E8DDD0
                    
Machine Pinning

Machines are pinned to specific physical hardware. For start/stop operations, flaps knows exactly which host to contact — no capacity query needed. Creation involves reserving space, fetching the container image, and building a root filesystem (low double-digit seconds). Subsequent starts are subsecond.

08

Volume & Persistent Storage

Fly Volumes are slices of NVMe drives on the same physical server as the Machine. They use Linux LVM thin provisioning, providing local (not network-attached) storage with one-to-one Machine mapping.

Volume Architecture
graph TD
    subgraph Server["Physical Server"]
        subgraph NVMe["NVMe Drive Pool"]
            POOL["LVM Thin Pool"]
            LV1["Logical Vol 1"]
            LV2["Logical Vol 2"]
        end
        FLYD["flyd<br/>(Orchestrator)"]
        subgraph Jails["Firecracker Jails"]
            VM1["Machine A"] --> LV1
            VM2["Machine B"] --> LV2
        end
    end
    SNAP["Object Storage<br/>(Daily Snapshots)"]
    POOL --> LV1
    POOL --> LV2
    LV1 -->|"Daily"| SNAP
    LV2 -->|"Daily"| SNAP
    FLYD --> VM1
    FLYD --> VM2

    style POOL fill:#1A1614,stroke:#F0A500,color:#E8DDD0
    style LV1 fill:#241F1C,stroke:#E8650A,color:#E8DDD0
    style LV2 fill:#241F1C,stroke:#E8650A,color:#E8DDD0
    style FLYD fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style VM1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VM2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style SNAP fill:#1A1614,stroke:#BFA58A,color:#E8DDD0
Storage Type: Local NVMe — not network-attached, not replicated
Mapping: One volume per Machine, one Machine per volume
Performance: Slightly faster than AWS EBS (local NVMe)
Snapshots: Daily to object storage, retained 5 days
Migration: Block-level cloning enables fast volume migration to new hosts
Mount Process: flyd looks up the logical volume and recreates the block device node inside the Firecracker jail
09

Fly Postgres

Fly Postgres runs PostgreSQL inside Fly Machines with volume-backed persistent storage. Positioned as "unmanaged" — Fly provisions it but users handle maintenance. HA is achieved via streaming replication with automatic failover.

HA Postgres Architecture
graph TD
    subgraph Cluster["Postgres Cluster"]
        subgraph Primary["Primary Region"]
            PG1["Primary Postgres<br/>(Read/Write)"]
            VOL1["NVMe Volume"]
        end
        subgraph Replica1["Replica Region A"]
            PG2["Replica Postgres<br/>(Read-Only)"]
            VOL2["NVMe Volume"]
        end
        subgraph Replica2["Replica Region B"]
            PG3["Replica Postgres<br/>(Read-Only)"]
            VOL3["NVMe Volume"]
        end
    end
    MGR["repmgr<br/>(Cluster Manager)"]
    WAL["WAL Streaming<br/>Replication"]
    PG1 --> VOL1
    PG2 --> VOL2
    PG3 --> VOL3
    PG1 -->|"WAL"| PG2
    PG1 -->|"WAL"| PG3
    MGR --> PG1
    MGR --> PG2
    MGR --> PG3
    PROXY["Fly Proxy<br/>(pg_tls handler)"] --> PG1

    style PG1 fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style PG2 fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style PG3 fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style MGR fill:#1A1614,stroke:#F0A500,color:#E8DDD0
    style PROXY fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style VOL1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style VOL2 fill:#241F1C,stroke:#BFA58A,color:#E8DDD0
    style VOL3 fill:#241F1C,stroke:#BFA58A,color:#E8DDD0
Legacy vs Current

Legacy deployments used Stolon (Go-based manager) with three components per VM: sentinel, keeper, and proxy, backed by Consul KV. Newer deployments use repmgr for simpler replication cluster management. Fly Proxy provides a specialized pg_tls handler for Postgres connections.

10

Builder Infrastructure

When users run fly deploy, images are built remotely using dedicated builder Machines. The builder is an independent Fly app auto-created per organization on first use, with attached NVMe volumes for layer caching.

Deploy Pipeline
graph LR
    subgraph Client["Developer"]
        FLYCTL["flyctl"]
    end
    subgraph Build["Build Infrastructure"]
        BUILDER["Builder Machine<br/>(BuildKit)"]
        CACHE["NVMe Cache<br/>(Layers)"]
    end
    subgraph Registry["Image Distribution"]
        REG["registry.fly.io<br/>(Custom Registry)"]
        MAC["Macaroon<br/>Auth"]
    end
    subgraph Deploy["Deployment"]
        FLAPS2["flaps API"]
        FLYD2["flyd Worker"]
        VM2["New Machine"]
    end
    FLYCTL --> BUILDER
    BUILDER --> CACHE
    BUILDER --> REG
    FLYCTL --> MAC
    MAC --> REG
    FLYCTL --> FLAPS2
    FLAPS2 --> FLYD2
    FLYD2 -->|"Pull Image"| REG
    FLYD2 --> VM2

    style FLYCTL fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style BUILDER fill:#1A1614,stroke:#F0A500,color:#E8DDD0
    style REG fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style MAC fill:#1A1614,stroke:#D4220A,color:#E8DDD0
    style FLAPS2 fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style VM2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style FLYD2 fill:#241F1C,stroke:#E8650A,color:#E8DDD0

Custom Registry Features

Fly.io's registry (registry.fly.io) is built by importing Docker's registry code as a Go library, with custom extensions:

Macaroon Auth

Bearer token authorization via Fly API using Macaroon-based tokens

Security

Multi-Tenant

Repository name rewriting for multi-tenancy, scoped per organization

Build

Cross-Repo Mounts

Cross-repository blob mount handling for efficient layer sharing across apps in the same org

Build
11

Security & LiteFS

Macaroon Tokens

Fly.io uses Macaroon tokens — chained-HMAC bearer tokens based on Google Research's Macaroons paper — for API authentication and authorization. Users can locally attenuate (scope down) any existing token without server involvement.

Client-Side Attenuation

Scope any token from org-wide down to a specific command on a single Machine — no server round-trip needed

Distributed Verification

Signature verification runs on a globally distributed LiteFS-backed cluster of verifiers

Open Source

Available at github.com/superfly/macaroon — separation of authentication and authorization tokens
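The chained-HMAC construction is from the Macaroons paper the text cites; this sketch omits third-party caveats and the wire format of Fly's actual tokens, and the identifier and caveat strings are invented:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"fmt"
)

// chain is the macaroon primitive: each signature keys the HMAC of the
// next caveat, forming a one-way chain from the issuer's root key.
func chain(key []byte, msg string) []byte {
	m := hmac.New(sha256.New, key)
	m.Write([]byte(msg))
	return m.Sum(nil)
}

// Attenuate appends a caveat client-side. Because the new signature is
// keyed by the old one, a holder can only narrow a token, never widen
// it, and no server round-trip is needed.
func Attenuate(sig []byte, caveat string) []byte {
	return chain(sig, caveat)
}

// Verify recomputes the chain from the root key and the caveat list
// carried with the token.
func Verify(rootKey []byte, id string, caveats []string, sig []byte) bool {
	s := chain(rootKey, id)
	for _, c := range caveats {
		s = chain(s, c)
	}
	return hmac.Equal(s, sig)
}

func main() {
	root := []byte("server-secret") // never leaves the issuer
	sig := chain(root, "token-for-org-123")
	sig = Attenuate(sig, "app == my-app")       // done locally by the holder
	sig = Attenuate(sig, "action == read-only") // narrowed further
	ok := Verify(root, "token-for-org-123",
		[]string{"app == my-app", "action == read-only"}, sig)
	fmt.Println(ok) // true
}
```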

LiteFS — Distributed SQLite

LiteFS is Fly.io's open-source distributed file system for replicating SQLite databases across a cluster of Machines. It extends Litestream with fine-grained transactional control.
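The "changed pages per transaction" idea behind LTX files can be illustrated with a byte-level diff. LiteFS actually captures writes at the FUSE layer as SQLite makes them; this function is only a sketch of the page-set concept, with 4096 matching SQLite's default page size:

```go
package main

import (
	"bytes"
	"fmt"
)

// changedPages returns the indexes of pages that differ between two
// database images, the kind of per-transaction page set an LTX file
// carries for replication.
func changedPages(before, after []byte, pageSize int) []int {
	var pages []int
	for off, i := 0, 0; off < len(after); off, i = off+pageSize, i+1 {
		end := off + pageSize
		if end > len(after) {
			end = len(after)
		}
		if off >= len(before) || !bytes.Equal(before[off:end], after[off:end]) {
			pages = append(pages, i)
		}
	}
	return pages
}

func main() {
	before := make([]byte, 3*4096)
	after := append([]byte(nil), before...)
	after[4096] = 0xFF // dirty one byte in page 1
	fmt.Println(changedPages(before, after, 4096)) // [1]
}
```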

LiteFS Replication Architecture
graph TD
    subgraph Primary["Primary Node"]
        APP1["Application"]
        FUSE1["FUSE Filesystem<br/>(LiteFS)"]
        DB1["SQLite DB"]
    end
    subgraph Replica1["Replica Node"]
        APP2["Application"]
        FUSE2["FUSE Filesystem<br/>(LiteFS)"]
        DB2["SQLite DB"]
    end
    LTX["LTX Files<br/>(Changed Pages)"]
    CLOUD["LiteFS Cloud<br/>(Backups)"]
    APP1 -->|"Write"| FUSE1
    FUSE1 --> DB1
    DB1 -->|"LTX"| LTX
    LTX -->|"Replicate"| FUSE2
    FUSE2 --> DB2
    APP2 -->|"Read"| FUSE2
    LTX --> CLOUD

    style APP1 fill:#D4220A,stroke:#E8650A,color:#E8DDD0
    style DB1 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style DB2 fill:#241F1C,stroke:#F0A500,color:#E8DDD0
    style FUSE1 fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style FUSE2 fill:#1A1614,stroke:#3A7D7B,color:#E8DDD0
    style LTX fill:#1A1614,stroke:#E8650A,color:#E8DDD0
    style CLOUD fill:#1A1614,stroke:#BFA58A,color:#E8DDD0
    style APP2 fill:#241F1C,stroke:#BFA58A,color:#E8DDD0
12

Technology Reference

Technology Stack

Rust: Fly Proxy, Corrosion, VM init process (/init)
Go: flyd (orchestrator), flaps (API), flyctl (CLI), registry, Stolon
SQLite: Corrosion (cluster state), LiteFS (app databases)
BoltDB: flyd local state (append-only log)
WireGuard: Inter-datacenter backhaul, 6PN private networking, client VPN
Linux KVM: Firecracker microVM hardware isolation
NATS: Internal messaging, coordination, log streaming (JetStream)
Linux LVM: Volume thin provisioning, copy-on-write snapshots
eBPF: Network access control, static routing on 6PN mesh
QUIC: Corrosion node-to-node transport
Macaroons: API token authentication and authorization

Acronyms & Abbreviations

6PN: IPv6 Private Network
BGP: Border Gateway Protocol
CRDT: Conflict-free Replicated Data Type
eBPF: Extended Berkeley Packet Filter
FUSE: Filesystem in Userspace
HA: High Availability
HMAC: Hash-based Message Authentication Code
KVM: Kernel-based Virtual Machine
LTX: Lite Transaction File
LVM: Logical Volume Manager
LWW: Last-Write-Wins
NATS: Neural Autonomic Transport System
NVMe: Non-Volatile Memory Express
OCI: Open Container Initiative
OSPF: Open Shortest Path First
QUIC: Quick UDP Internet Connections
RTT: Round-Trip Time
SWIM: Scalable Weakly-consistent Infection-style Membership
TLS: Transport Layer Security
WAL: Write-Ahead Log