Architecture Maps

Discord Architecture

Interactive architecture map of Discord's real-time messaging infrastructure — WebSocket gateway, voice servers, database migrations, and the Elixir-Rust performance story, compiled from engineering blog posts and public sources.

Public Sources Only · 12M+ Concurrent Users · 26M WS Events/sec · Updated: Mar 2026
01

Technology Stack Overview

Discord's backend is a polyglot architecture built on Elixir/BEAM for real-time messaging, Python for the HTTP API, Rust for performance-critical data services, and C++ for voice/video media processing. A 5-person chat infrastructure team manages ~20 Elixir microservices across 400-500 machines.

12M+
Concurrent Users
26M
WS Events / sec
6.7M
Active Guilds
~20
Elixir Services
850+
Voice Servers
High-Level System Architecture
graph TD
    subgraph Clients["Client Layer"]
        WEB["Web Client<br/>(Browser WebRTC)"]
        DESK["Desktop Client<br/>(C++ Engine)"]
        MOB["Mobile Client<br/>(C++ Engine)"]
    end

    subgraph Gateway["Gateway Layer · Elixir"]
        GW["WebSocket Gateway<br/>(Cowboy + GenStage)"]
        PUBSUB["Pub/Sub Event Bus"]
    end

    subgraph Services["Application Services"]
        API["REST API<br/>(Python Monolith)"]
        RS["Read States<br/>(Rust + Tokio)"]
        DS["Data Services<br/>(Rust + Coalescing)"]
        VS["Voice Signaling<br/>(Elixir)"]
    end

    subgraph Media["Media Layer"]
        SFU["Voice SFU<br/>(C++ / DAVE E2EE)"]
        MP["Media Proxy<br/>(Rust + Lilliput)"]
    end

    subgraph Data["Data Layer"]
        SCYLLA["ScyllaDB<br/>(Messages)"]
        ES["Elasticsearch<br/>(Search Index)"]
        CASS["Cassandra<br/>(Read States)"]
        REDIS["Redis<br/>(Cache)"]
    end

    Clients --> GW
    GW --> PUBSUB
    Clients --> API
    API --> DS
    DS --> SCYLLA
    DS --> ES
    RS --> CASS
    GW --> VS
    VS --> SFU
    MP --> REDIS
    style WEB fill:#1a1a28,stroke:#5865F2,color:#fff
    style DESK fill:#1a1a28,stroke:#5865F2,color:#fff
    style MOB fill:#1a1a28,stroke:#5865F2,color:#fff
    style GW fill:#12121c,stroke:#aa00ff,color:#fff
    style PUBSUB fill:#12121c,stroke:#aa00ff,color:#fff
    style API fill:#12121c,stroke:#ffee00,color:#fff
    style RS fill:#12121c,stroke:#ff6600,color:#fff
    style DS fill:#12121c,stroke:#ff6600,color:#fff
    style VS fill:#12121c,stroke:#aa00ff,color:#fff
    style SFU fill:#12121c,stroke:#0066ff,color:#fff
    style MP fill:#12121c,stroke:#ff6600,color:#fff
    style SCYLLA fill:#12121c,stroke:#00ff66,color:#fff
    style ES fill:#12121c,stroke:#00ff66,color:#fff
    style CASS fill:#12121c,stroke:#00ff66,color:#fff
    style REDIS fill:#12121c,stroke:#00ff66,color:#fff

Elixir / BEAM

Core real-time backbone. ~20 microservices using Distributed Erlang with partial mesh topology. etcd for service discovery.

Elixir

Rust

Performance-critical services: Read States, Data Services (DB proxy), Media Proxy, game SDK, Go Live video capture, and Elixir NIFs.

Rust

Python

Powers the HTTP REST API monolith handling CRUD operations for guilds, channels, users, and messages.

Python

C++

Voice/video SFU media engine, native client audio engine, and ScyllaDB itself. Custom engine bypasses OS audio ducking.

C++
02

WebSocket Gateway

The gateway is the backbone of Discord's real-time event system. Every active client maintains a persistent WebSocket connection. The gateway pushes messages, presence updates, and typing indicators without polling, using GenStage for back-pressure and load-shedding.

Gateway Event Flow
graph LR
    subgraph Clients["Connected Clients"]
        C1["Client Shard 0"]
        C2["Client Shard 1"]
        C3["Client Shard N"]
    end

    subgraph GW["Gateway Servers · Elixir"]
        COW["Cowboy<br/>WebSocket/TCP"]
        GS["GenStage<br/>Back-Pressure"]
    end

    subgraph Events["Event Sources"]
        MSG["New Messages"]
        PRES["Presence Changes"]
        TYPE["Typing Indicators"]
    end

    subgraph Push["Push Pipeline"]
        PC["Push Collector<br/>(1 proc/machine)"]
        PUSHER["Pusher Consumers<br/>(demand: 100)"]
        XMPP["Firebase XMPP"]
    end

    Events --> GS
    GS --> COW
    COW --> Clients
    Events --> PC
    PC --> PUSHER
    PUSHER --> XMPP
    style C1 fill:#1a1a28,stroke:#5865F2,color:#fff
    style C2 fill:#1a1a28,stroke:#5865F2,color:#fff
    style C3 fill:#1a1a28,stroke:#5865F2,color:#fff
    style COW fill:#12121c,stroke:#aa00ff,color:#fff
    style GS fill:#12121c,stroke:#aa00ff,color:#fff
    style MSG fill:#12121c,stroke:#00f0ff,color:#fff
    style PRES fill:#12121c,stroke:#00f0ff,color:#fff
    style TYPE fill:#12121c,stroke:#00f0ff,color:#fff
    style PC fill:#12121c,stroke:#ff00aa,color:#fff
    style PUSHER fill:#12121c,stroke:#ff00aa,color:#fff
    style XMPP fill:#12121c,stroke:#ff00aa,color:#fff
Compression Optimization

Discord migrated gateway compression from zlib to Zstandard, achieving a 40% reduction in bandwidth usage across all WebSocket connections.

Push Notification Architecture

Push notifications use a two-stage GenStage pipeline. The Push Collector (1 Erlang process per machine) buffers requests, while Pusher consumers demand exactly 100 at a time. Firebase XMPP is used instead of HTTP because XMPP enforces a 100-pending-request limit per connection, providing natural backpressure. The system handles bursts of 1M+ push requests per minute via load-shedding when the buffer fills.
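The two-stage collector/consumer shape described above can be sketched in Python (the real pipeline is Elixir GenStage; the buffer size, shed policy, and class names here are illustrative assumptions, not Discord's code):

```python
from collections import deque

class PushCollector:
    """Buffers push requests and sheds the oldest when full; a stand-in
    for the GenStage Push Collector described above (illustrative only)."""

    def __init__(self, max_buffer=10_000):
        self.buffer = deque()
        self.max_buffer = max_buffer
        self.shed = 0  # count of load-shed requests

    def submit(self, request):
        if len(self.buffer) >= self.max_buffer:
            self.buffer.popleft()  # load-shedding: drop the oldest request
            self.shed += 1
        self.buffer.append(request)

    def demand(self, n=100):
        """Consumer side: take at most n requests, mirroring the Pusher's
        fixed demand of 100 (matching XMPP's 100-pending-request limit)."""
        return [self.buffer.popleft() for _ in range(min(n, len(self.buffer)))]
```

The key property is that the consumer pulls (demand) rather than the producer pushing, so a slow Firebase connection slows the pipeline instead of overflowing it; only the collector's bounded buffer ever drops work.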

03

Guild Sharding

Each Discord guild (server) is represented as a stateful Elixir GenServer process distributed across the cluster using a hash ring. Guilds are the atomic unit and cannot be further partitioned. BEAM supervision handles process crashes and node failures.
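The hash-ring placement can be sketched as follows (a generic consistent-hash ring for illustration; Discord's actual ring lives in their Elixir cluster tooling, and the vnode count and hash choice here are assumptions):

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable, well-distributed hash for ring positions (not for security).
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    """Minimal consistent-hash ring mapping guild IDs to cluster nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets several virtual points on the ring for balance.
        self.ring = sorted(
            (_hash(f"{node}:{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.keys = [h for h, _ in self.ring]

    def node_for(self, guild_id: int) -> str:
        # First ring point clockwise from the guild's hash owns the guild.
        idx = bisect.bisect(self.keys, _hash(str(guild_id))) % len(self.ring)
        return self.ring[idx][1]
```

A ring like this remaps only ~1/N of guilds when a node is added or removed, but as noted below, moving a live guild process still disconnects its users.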

Guild Process Model
graph TD
    subgraph Cluster["Elixir Cluster (Hash Ring)"]
        N1["Node A"]
        N2["Node B"]
        N3["Node C"]
    end

    subgraph Guilds["Guild GenServers"]
        G1["Guild 1<br/>(GenServer)"]
        G2["Guild 2<br/>(GenServer)"]
        G3["Guild 3<br/>(GenServer)"]
        G4["Guild 4<br/>(GenServer)"]
    end

    subgraph Bot["Bot Client Shards"]
        S1["Shard 0<br/>(guilds 0-2499)"]
        S2["Shard 1<br/>(guilds 2500-4999)"]
    end

    N1 --> G1
    N1 --> G2
    N2 --> G3
    N3 --> G4
    S1 --> N1
    S1 --> N2
    S2 --> N3
    style N1 fill:#12121c,stroke:#aa00ff,color:#fff
    style N2 fill:#12121c,stroke:#aa00ff,color:#fff
    style N3 fill:#12121c,stroke:#aa00ff,color:#fff
    style G1 fill:#1a1a28,stroke:#5865F2,color:#fff
    style G2 fill:#1a1a28,stroke:#5865F2,color:#fff
    style G3 fill:#1a1a28,stroke:#5865F2,color:#fff
    style G4 fill:#1a1a28,stroke:#5865F2,color:#fff
    style S1 fill:#12121c,stroke:#00f0ff,color:#fff
    style S2 fill:#12121c,stroke:#00f0ff,color:#fff
Sharding Limitation

Socket connections are held by processes, so moving guild processes between nodes means disconnecting users. This makes flexible repartitioning nearly impossible without user disruption.

| Parameter | Value | Notes |
| --- | --- | --- |
| Shard formula | (guild_id >> 22) % num_shards | Derives from Snowflake ID structure |
| Bot shard limit | 2,500 guilds/shard | Enforced by Discord |
| Recommended | ~1,000 guilds/shard | For optimal performance |
| Connection model | 1 WS per shard | Each shard maintains its own gateway connection |
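The shard formula is documented by Discord and depends on the Snowflake layout: the low 22 bits hold worker/process/increment fields, so shifting them off leaves the timestamp portion, which spreads guilds evenly:

```python
def shard_for(guild_id: int, num_shards: int) -> int:
    """Discord's documented bot sharding formula. Shifting off the low
    22 Snowflake bits (worker, process, increment) leaves the creation
    timestamp, so consecutive guilds land on different shards."""
    return (guild_id >> 22) % num_shards

# The example Snowflake from Discord's docs, on a hypothetical 16-shard bot:
shard = shard_for(175928847299117063, 16)
```

Because the mapping is pure arithmetic, every bot process can compute it locally with no coordination; it only has to agree on num_shards.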
04

Elixir + Rust NIFs

Discord uses Rust NIFs (Native Implemented Functions) via Rustler to accelerate hot paths within the BEAM VM. The member list sorted insertion problem drove the first major adoption, achieving a 160x improvement over pure Elixir at scale.

Rust NIF Integration via Rustler
graph LR
    subgraph BEAM["BEAM VM (Elixir)"]
        GUILD["Guild GenServer"]
        MOD["SortedSet Module<br/>(Elixir Interface)"]
    end

    subgraph NIF["Rust NIF Layer"]
        RUSTLER["Rustler<br/>(Safe Bindings)"]
        SORTED["SortedSet<br/>(Rust BTreeSet)"]
    end

    subgraph Perf["Performance at 1M Items"]
        BEST["Best: 0.61 us"]
        WORST["Worst: 3.68 us"]
    end

    GUILD --> MOD
    MOD --> RUSTLER
    RUSTLER --> SORTED
    SORTED --> Perf
    style GUILD fill:#12121c,stroke:#aa00ff,color:#fff
    style MOD fill:#12121c,stroke:#aa00ff,color:#fff
    style RUSTLER fill:#12121c,stroke:#ff6600,color:#fff
    style SORTED fill:#12121c,stroke:#ff6600,color:#fff
    style BEST fill:#0a0a0f,stroke:#00ff66,color:#00ff66
    style WORST fill:#0a0a0f,stroke:#00ff66,color:#00ff66

The Member List Problem

Guilds with 100,000+ members need sorted member lists. Updating the list when a member joins requires a sorted insertion that also reports the resulting index. Even the best pure Elixir attempt (:ordsets, after trying MapSet and a custom skip-list of Cells) still took roughly 27,000 microseconds worst-case for 250K items.

The Rust SortedSet NIF handles 1,000,000 items with sub-4 microsecond worst-case latency. All operations stay under 1ms, eliminating the need for BEAM reductions or yielding. The NIF appears as a regular Elixir module to callers and powers every single Discord guild's member list.

| Solution (item count) | Best Case | Worst Case | Language |
| --- | --- | --- | --- |
| MapSet (250K) | 31,644 us | 57,580 us | Elixir |
| :ordsets (250K) | 20,438 us | 27,390 us | Elixir |
| Rust SortedSet (250K) | 0.4 us | 1.2 us | Rust |
| Rust SortedSet (1M) | 0.61 us | 3.68 us | Rust |
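The operation the NIF accelerates, insert into a sorted set and report where the item landed, can be sketched with Python's bisect (this shows only the semantics, not the performance; the real implementation is Rust behind Rustler):

```python
import bisect

def sorted_insert(lst, item):
    """Insert item keeping lst sorted; return (index, inserted?).
    The index is what the member list UI needs to splice in the new row."""
    idx = bisect.bisect_left(lst, item)
    if idx < len(lst) and lst[idx] == item:
        return idx, False          # already present (set semantics)
    lst.insert(idx, item)
    return idx, True

members = ["alice", "carol"]
sorted_insert(members, "bob")      # lands between alice and carol
```

Binary search makes finding the index O(log n); the costly part at BEAM scale was the insertion and immutable-copy overhead, which is what the Rust BTreeSet-based NIF removes.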
05

Read States: Go to Rust

Read States tracks which channels and messages each user has read. Accessed on every connection, message send, and read action. The Go implementation suffered from garbage collector latency spikes every 2 minutes; the Rust rewrite eliminated them entirely.

Read States Service Architecture
graph TD
    subgraph Clients["Incoming Requests"]
        CONN["Connection Events"]
        SEND["Message Sends"]
        READ["Read Actions"]
    end

    subgraph Service["Read States Service · Rust"]
        TOKIO["Tokio Async Runtime"]
        LRU["BTreeMap LRU Cache<br/>(8M states/node)"]
    end

    subgraph Persist["Persistence"]
        EVICT["Immediate Eviction<br/>Commit"]
        SCHED["Scheduled Commit<br/>(30s window)"]
        CASS["Cassandra"]
    end

    Clients --> TOKIO
    TOKIO --> LRU
    LRU --> EVICT
    LRU --> SCHED
    EVICT --> CASS
    SCHED --> CASS
    style CONN fill:#1a1a28,stroke:#5865F2,color:#fff
    style SEND fill:#1a1a28,stroke:#5865F2,color:#fff
    style READ fill:#1a1a28,stroke:#5865F2,color:#fff
    style TOKIO fill:#12121c,stroke:#ff6600,color:#fff
    style LRU fill:#12121c,stroke:#ff6600,color:#fff
    style EVICT fill:#12121c,stroke:#00ff66,color:#fff
    style SCHED fill:#12121c,stroke:#00ff66,color:#fff
    style CASS fill:#12121c,stroke:#00ff66,color:#fff
The Go GC Problem

Go's garbage collector ran every 2 minutes, scanning the entire LRU cache to check for unreferenced memory. This caused periodic latency spikes proportional to cache size. Reducing cache size lowered spike magnitude but increased cache misses -- a lose-lose tradeoff.

Rust Solution

Rust's ownership-based memory model means evicted items are immediately freed with no GC scanning. Average response time dropped to microseconds, capacity increased to 8 million Read States per node, and all latency spikes were eliminated. Built on Tokio async runtime with BTreeMap for memory efficiency.
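The eviction-commit path can be sketched in Python (names, capacity, and the commit callback are invented for illustration; Discord's service is Rust on Tokio):

```python
from collections import OrderedDict

class ReadStateCache:
    """LRU cache that persists entries the moment they are evicted,
    mirroring the immediate-eviction commit path described above."""

    def __init__(self, capacity, commit):
        self.capacity = capacity
        self.commit = commit            # e.g. an async write to Cassandra
        self.states = OrderedDict()     # key -> last-read message id

    def update(self, key, last_read_id):
        self.states[key] = last_read_id
        self.states.move_to_end(key)    # mark as most recently used
        if len(self.states) > self.capacity:
            evicted_key, value = self.states.popitem(last=False)
            self.commit(evicted_key, value)   # persist on eviction
```

In Rust, popping the entry frees its memory immediately through ownership; in Go the same pop only made the memory collectible, leaving the periodic GC scan of the whole cache that caused the 2-minute spikes.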

06

Message Storage

Discord's message storage evolved from MongoDB (2015) to Cassandra (2017) to ScyllaDB (post-2022). The Rust Data Services layer sits between the API and database, providing request coalescing and consistent hash routing for cache locality.

Storage Evolution Timeline
graph LR
    M2015["2015<br/>MongoDB<br/>(Initial)"]
    C2017["2017<br/>Cassandra<br/>12 nodes"]
    C2022["2022<br/>Cassandra<br/>177 nodes"]
    S2023["Post-2022<br/>ScyllaDB<br/>72 nodes"]
    M2015 --> C2017
    C2017 --> C2022
    C2022 --> S2023
    style M2015 fill:#1a1a28,stroke:#6a6a80,color:#b8b8cc
    style C2017 fill:#1a1a28,stroke:#ffee00,color:#fff
    style C2022 fill:#1a1a28,stroke:#ff0044,color:#fff
    style S2023 fill:#1a1a28,stroke:#00ff66,color:#fff
Data Services Layer (Rust)
graph TD
    subgraph API["API Layer"]
        REST["Python REST API"]
    end

    subgraph DS["Data Services · Rust"]
        ROUTER["Consistent Hash Router<br/>(channel_id routing)"]
        COAL["Request Coalescing<br/>(deduplicate concurrent reads)"]
    end

    subgraph DB["ScyllaDB Cluster"]
        N1["Node 1<br/>(9TB)"]
        N2["Node 2<br/>(9TB)"]
        N3["Node 3<br/>(9TB)"]
    end

    REST --> ROUTER
    ROUTER --> COAL
    COAL --> N1
    COAL --> N2
    COAL --> N3
    style REST fill:#12121c,stroke:#ffee00,color:#fff
    style ROUTER fill:#12121c,stroke:#ff6600,color:#fff
    style COAL fill:#12121c,stroke:#ff6600,color:#fff
    style N1 fill:#12121c,stroke:#00ff66,color:#fff
    style N2 fill:#12121c,stroke:#00ff66,color:#fff
    style N3 fill:#12121c,stroke:#00ff66,color:#fff
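Request coalescing means that when many callers ask for the same hot row at once, only one database query is issued and everyone awaits its result. A minimal asyncio sketch of the idea (Discord's version is Rust/Tokio; this class and its names are illustrative):

```python
import asyncio

class Coalescer:
    """Deduplicate concurrent identical reads: the first caller issues the
    query, later callers for the same key await the same in-flight future."""

    def __init__(self, fetch):
        self.fetch = fetch      # async fn: key -> row
        self.inflight = {}      # key -> running task

    async def get(self, key):
        if key not in self.inflight:
            self.inflight[key] = asyncio.ensure_future(self._run(key))
        return await self.inflight[key]

    async def _run(self, key):
        try:
            return await self.fetch(key)
        finally:
            # Clear the slot so later requests re-fetch fresh data.
            del self.inflight[key]
```

Combined with consistent hash routing on channel_id, all requests for one channel land on the same Data Services instance, so its in-flight map actually sees the duplicates; that locality is what makes coalescing effective for "@everyone in a huge guild" hot spots.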
60%
Node Reduction
15ms
P99 Read Latency
9 Days
Migration Time
3.2M/s
Migration Throughput

Data Model

Messages are partitioned by channel_id combined with static time buckets. Each message uses a Snowflake ID (chronologically sortable, embeds timestamp) and is replicated across 3 nodes. The migration from 177 Cassandra nodes to 72 ScyllaDB nodes was executed using a custom Rust migrator with SQLite checkpointing, completing in 9 days instead of the estimated 3 months.
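Because a Snowflake embeds its creation timestamp, the partition key can be computed from the message ID alone. A sketch of that derivation (the Discord epoch is documented; the 10-day bucket width is an assumption for illustration, not a confirmed production value):

```python
DISCORD_EPOCH_MS = 1420070400000            # 2015-01-01 UTC, per Discord docs
BUCKET_MS = 10 * 24 * 60 * 60 * 1000        # assumed 10-day static bucket

def partition_key(channel_id: int, message_id: int):
    """Partition key sketch: channel_id plus a static time bucket derived
    from the Snowflake's embedded millisecond timestamp."""
    timestamp_ms = (message_id >> 22) + DISCORD_EPOCH_MS
    return (channel_id, timestamp_ms // BUCKET_MS)
```

Bucketing by time keeps any one partition from growing without bound in a busy channel, while keeping a channel's recent messages physically adjacent for fast history reads.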

Metric Cassandra ScyllaDB
P99 Read Latency 40-125ms 15ms
P99 Write Latency 5-70ms 5ms (steady)
Cluster Size 177 nodes 72 nodes
Disk per Node -- 9 TB
08

Voice Architecture

Three backend services power voice: the Discord Gateway (WebSocket events), Discord Guilds (voice server assignment and state), and Discord Voice (signaling + SFU). The homegrown C++ SFU handles 2.6M concurrent voice users across 850+ servers in 13 regions.

Voice Connection Flow
graph TD
    subgraph Client["Client"]
        CWEB["Browser<br/>(Native WebRTC)"]
        CNAT["Desktop/Mobile<br/>(Custom C++ Engine)"]
    end

    subgraph Control["Control Plane"]
        GW["Discord Gateway<br/>(WebSocket)"]
        GUILDS["Discord Guilds<br/>(Voice State)"]
        SIG["Voice Signaling<br/>(Keys + Stream IDs)"]
    end

    subgraph Media["Media Plane"]
        SFU["C++ SFU<br/>(Selective Forwarding)"]
        ENC["Salsa20 Encryption<br/>+ DAVE E2EE"]
        OPUS["Opus Codec<br/>48kHz Stereo"]
    end

    Client --> GW
    GW --> GUILDS
    GUILDS --> SIG
    SIG --> SFU
    CWEB --> SFU
    CNAT --> SFU
    SFU --> ENC
    SFU --> OPUS
    style CWEB fill:#1a1a28,stroke:#5865F2,color:#fff
    style CNAT fill:#1a1a28,stroke:#5865F2,color:#fff
    style GW fill:#12121c,stroke:#aa00ff,color:#fff
    style GUILDS fill:#12121c,stroke:#aa00ff,color:#fff
    style SIG fill:#12121c,stroke:#aa00ff,color:#fff
    style SFU fill:#12121c,stroke:#0066ff,color:#fff
    style ENC fill:#12121c,stroke:#ff0044,color:#fff
    style OPUS fill:#12121c,stroke:#00f0ff,color:#fff

Protocol Optimizations

Discord replaces standard SDP signaling (~10KB) with a minimal ~1000 byte payload containing only server address, encryption method, codec, and stream ID. ICE negotiation is skipped entirely since all clients connect through relay servers, which also hides user IPs. DTLS/SRTP encryption is replaced with Salsa20 for performance. During silent periods, no audio is transmitted, requiring sequence number rewriting.

DAVE Protocol (E2EE, 2024-2026)

End-to-end encryption for DMs, group DMs, voice channels, and Go Live streams, enforced for all non-Stage voice calls since March 2, 2026. Uses WebRTC Encoded Transforms + Messaging Layer Security (MLS) for group key exchange with epoch-based rotation when participants join or leave. The protocol is open-source and externally audited by Trail of Bits.
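The epoch-rotation pattern can be illustrated with a toy sketch. This is emphatically NOT MLS: real DAVE uses MLS commits to deliver fresh entropy securely to the remaining members; here os.urandom merely stands in for that entropy, and the class is invented for the example:

```python
import hashlib
import os

class GroupKeys:
    """Toy illustration of epoch-based group key rotation: every membership
    change bumps the epoch and derives a fresh key, so departed members
    cannot decrypt future media and new joiners cannot decrypt past media."""

    def __init__(self, secret: bytes):
        self.epoch = 0
        self.secret = secret

    def rotate(self):
        # MLS distributes this fresh entropy to current members via an
        # encrypted commit; os.urandom is only a placeholder for it here.
        self.epoch += 1
        self.secret = hashlib.sha256(self.secret + os.urandom(32)).digest()

    def on_join(self):
        self.rotate()

    def on_leave(self):
        self.rotate()
```

The hard problem, getting that fresh entropy to every remaining member without the server or the departed member learning it, is exactly what the MLS group key agreement solves; the epoch counter just names each key generation.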

850+
Voice Servers
13
Regions
2.6M
Concurrent Users
220+
Gbps Egress
09

Bot & API Platform

Discord exposes two API surfaces: the HTTP REST API (Python monolith for CRUD) and the WebSocket Gateway (Elixir for real-time events). Bots can receive interactions via persistent gateway connection or outgoing webhooks to a configured URL, enabling serverless architectures.

API Interaction Models
graph TD
    subgraph Discord["Discord Platform"]
        REST["HTTP REST API<br/>(Python)"]
        GWAPI["WebSocket Gateway<br/>(Elixir)"]
    end

    subgraph Gateway["Gateway Bot"]
        GBOT["Bot Process<br/>(Persistent WS)"]
        GINT["INTERACTION_CREATE<br/>Event"]
    end

    subgraph Webhook["Webhook Bot"]
        WURL["Interactions Endpoint<br/>(Configured URL)"]
        LAMB["Lambda / Serverless"]
    end

    subgraph Limits["Rate Limiting"]
        ROUTE["Per-Route<br/>(X-RateLimit-Bucket)"]
        GLOBAL["Global<br/>(50 req/sec)"]
        INVALID["Invalid Request<br/>(10K/10min ban)"]
    end

    REST --> Gateway
    GWAPI --> GBOT
    GBOT --> GINT
    REST --> Webhook
    REST --> WURL
    WURL --> LAMB
    REST --> Limits
    style REST fill:#12121c,stroke:#ffee00,color:#fff
    style GWAPI fill:#12121c,stroke:#aa00ff,color:#fff
    style GBOT fill:#1a1a28,stroke:#5865F2,color:#fff
    style GINT fill:#1a1a28,stroke:#5865F2,color:#fff
    style WURL fill:#1a1a28,stroke:#00f0ff,color:#fff
    style LAMB fill:#1a1a28,stroke:#00f0ff,color:#fff
    style ROUTE fill:#0a0a0f,stroke:#ff6600,color:#ff6600
    style GLOBAL fill:#0a0a0f,stroke:#ff6600,color:#ff6600
    style INVALID fill:#0a0a0f,stroke:#ff0044,color:#ff0044
| Rate Limit Tier | Scope | Limit |
| --- | --- | --- |
| Per-Route | Endpoint + method, scoped by guild/channel/webhook | Varies (X-RateLimit-Bucket header) |
| Global | All endpoints per bot | 50 requests/second |
| Invalid Requests | 401/403/429 responses per IP | 10,000 per 10 minutes (then temp ban) |
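A bot that wants to stay under the global 50 req/s limit client-side can use a standard token bucket (a generic sketch; a production client should also honor the per-route X-RateLimit-* response headers, which this example ignores):

```python
class TokenBucket:
    """Client-side limiter for the 50 req/s global bot limit."""

    def __init__(self, rate=50.0, capacity=50):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = 0.0             # timestamp of the previous check

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Caller should queue and retry, not fire anyway: each 429 from
        # Discord counts toward the 10K/10min invalid-request ban.
        return False
```

Pass a monotonic clock (e.g. time.monotonic()) as now; injecting the timestamp keeps the limiter deterministic and testable.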
10

CDN & Media Pipeline

Discord serves media through two domains: cdn.discordapp.com for static originals and media.discordapp.net for the Rust Media Proxy that inspects, converts, and resizes every attachment and embedded image on the fly. The open-source Lilliput library handles image processing with WebP, AVIF, and GIF support.
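On-the-fly resizing is driven by query parameters on the media domain. A small helper showing the shape of such a URL (the parameter names here are illustrative assumptions, not a documented contract; consult Discord's API documentation for the actual supported parameters):

```python
from urllib.parse import urlencode

def resized_url(base: str, width: int, height: int, fmt: str = "webp") -> str:
    """Build a hypothetical Media Proxy resize URL. The width/height/format
    parameter names are assumptions for illustration only."""
    return f"{base}?{urlencode({'width': width, 'height': height, 'format': fmt})}"

resized_url("https://media.discordapp.net/attachments/1/2/cat.png", 320, 240)
```

The point of the split domains is cacheability: every distinct parameter combination is a distinct cache key on media.discordapp.net, while cdn.discordapp.com serves the untouched original exactly once per asset.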

Media Processing Pipeline
graph LR
    subgraph Upload["Upload"]
        CLIENT["Client Upload"]
    end

    subgraph Process["Media Proxy · Rust"]
        DETECT["Detection<br/>(format, animation)"]
        TRANSFORM["Transformation<br/>(resize, convert)"]
        OPTIMIZE["Optimization<br/>(WebP/AVIF)"]
    end

    subgraph Delivery["Delivery"]
        CDN["cdn.discordapp.com<br/>(Static Originals)"]
        MEDIA["media.discordapp.net<br/>(Processed Assets)"]
    end

    CLIENT --> DETECT
    DETECT --> TRANSFORM
    TRANSFORM --> OPTIMIZE
    OPTIMIZE --> CDN
    OPTIMIZE --> MEDIA
    style CLIENT fill:#1a1a28,stroke:#5865F2,color:#fff
    style DETECT fill:#12121c,stroke:#ff6600,color:#fff
    style TRANSFORM fill:#12121c,stroke:#ff6600,color:#fff
    style OPTIMIZE fill:#12121c,stroke:#ff6600,color:#fff
    style CDN fill:#12121c,stroke:#00f0ff,color:#fff
    style MEDIA fill:#12121c,stroke:#00f0ff,color:#fff

WebP

8-bit color (16M colors), superior compression, near-universal browser support. 29% median size reduction over GIF for animated emojis.

Primary format

AVIF

Up to 12-bit color, HDR support, advanced compression. HDR content is tone-mapped to SDR when converting to 8-bit formats.

Supported

GIF

Legacy format retained for compatibility. 95%+ animated emoji requests now served as WebP instead.

Legacy
29%
Median Size Reduction
42.5%
P95 Size Reduction
95%+
Emoji as WebP
60fps
Animated Emoji Display
Accessibility

The is_animated flag is propagated throughout all API systems and respects the user's Reduced Motion accessibility setting, ensuring animated content can be paused for users who need it.

11

Acronym Reference

AVIF: AV1 Image File Format
BEAM: Bogdan/Björn's Erlang Abstract Machine
BFG: Big Freaking Guild
CDN: Content Delivery Network
DAVE: Discord Audio/Video End-to-End Encryption
DM: Direct Message
DTLS: Datagram Transport Layer Security
E2EE: End-to-End Encryption
ECK: Elastic Cloud on Kubernetes
ES: Elasticsearch
GC: Garbage Collector
HDR: High Dynamic Range
ICE: Interactive Connectivity Establishment
K8s: Kubernetes
LRU: Least Recently Used
MLS: Messaging Layer Security
NAT: Network Address Translation
NIF: Native Implemented Function
RTCP: RTP Control Protocol
SDP: Session Description Protocol
SDR: Standard Dynamic Range
SFU: Selective Forwarding Unit
SRTP: Secure Real-time Transport Protocol
WebP: Web Picture Format (Google)
WebRTC: Web Real-Time Communication
WS: WebSocket
XMPP: Extensible Messaging and Presence Protocol