Interactive architecture map of Stripe's payment infrastructure — API design, idempotency, fraud detection, and the Ruby monolith compiled from publicly available engineering sources.
Stripe processes payments for millions of businesses worldwide. The platform is built on a Ruby monolith maintained by approximately 3,000 engineers across 360 teams, and it targets 99.99995% availability. Every API version since 2011 remains backwards compatible.
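As a rough worked figure, assuming a 365-day year and reading the availability target as a fraction of wall-clock time, 99.99995% leaves a yearly downtime budget of about 16 seconds:

$$
(1 - 0.9999995) \times 365 \times 24 \times 3600\ \text{s} = 5 \times 10^{-7} \times 31{,}536{,}000\ \text{s} \approx 15.8\ \text{s per year}
$$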
graph TD
subgraph Clients["Client Layer"]
SDK["SDKs
(Ruby, Python, JS, Go)"]
ELEM["Stripe Elements
& Checkout"]
DASH["Dashboard
& CLI"]
end
subgraph Core["Core Platform"]
GW["API Gateway"]
MONO["Ruby Monolith
(Sorbet-typed)"]
PAY["Payment
Processor"]
LEDGER["Ledger
(double-entry)"]
end
subgraph Services["Supporting Services"]
RADAR["Radar
(fraud ML)"]
WEBHOOK["Webhook
Delivery"]
CONNECT["Connect
(multi-party)"]
CDV["Card Data Vault
(PCI Level 1)"]
end
subgraph Infra["Infrastructure"]
AWS["AWS"]
DOCDB["DocDB
(MongoDB)"]
PROM["Prometheus
& Grafana"]
end
Clients --> GW
GW --> MONO
MONO --> PAY
MONO --> LEDGER
MONO --> RADAR
MONO --> WEBHOOK
MONO --> CONNECT
PAY --> CDV
Core --> Infra
style GW fill:#000,stroke:#ff0000,color:#fff
style MONO fill:#000,stroke:#ff0000,color:#fff
style PAY fill:#171717,stroke:#525252,color:#fff
style LEDGER fill:#171717,stroke:#525252,color:#fff
style RADAR fill:#171717,stroke:#dc2626,color:#fff
style WEBHOOK fill:#171717,stroke:#ca8a04,color:#fff
style CONNECT fill:#171717,stroke:#ea580c,color:#fff
style CDV fill:#171717,stroke:#0891b2,color:#fff
style DOCDB fill:#262626,stroke:#9333ea,color:#fff
style AWS fill:#262626,stroke:#9333ea,color:#fff
style PROM fill:#262626,stroke:#9333ea,color:#fff
Stripe has maintained backwards compatibility with every API version since 2011. The versioning system uses date-based rolling versions rather than semver. Major releases ship twice a year; monthly updates between majors are always backwards-compatible.
Engineers write code only against the latest API version. A separate versioning layer transforms responses backwards through time by applying version change modules sequentially.
graph LR
REQ["Incoming
Request"] --> AUTH["Authentication
& Rate Limiting"]
AUTH --> VER["Determine
Target Version"]
VER --> LOGIC["Core Business
Logic"]
LOGIC --> RESP["Generate Response
(latest schema)"]
RESP --> VCM["Version Change
Module Chain"]
VCM --> DELIVER["Deliver
Response"]
subgraph Versioning["Version Change Modules"]
VCM
V1["2024-09-30
acacia"]
V2["2024-04-10
..."]
V3["2023-10-16
..."]
VN["2017-05-24
..."]
end
VCM --> V1
V1 --> V2
V2 --> V3
V3 --> VN
style REQ fill:#000,stroke:#ff0000,color:#fff
style LOGIC fill:#000,stroke:#ff0000,color:#fff
style VCM fill:#171717,stroke:#2563eb,color:#fff
style V1 fill:#262626,stroke:#525252,color:#e5e5e5
style V2 fill:#262626,stroke:#525252,color:#e5e5e5
style V3 fill:#262626,stroke:#525252,color:#e5e5e5
style VN fill:#262626,stroke:#525252,color:#e5e5e5
style AUTH fill:#171717,stroke:#525252,color:#fff
style VER fill:#171717,stroke:#525252,color:#fff
style RESP fill:#171717,stroke:#525252,color:#fff
style DELIVER fill:#171717,stroke:#525252,color:#fff
The first time a user makes an API request, their account is pinned to the current version. Users can override per-request via the Stripe-Version header or upgrade through the dashboard.
Each module specifies: documentation of the change, a transformation function, and which API resource types it applies to.
New API resources, new optional request parameters, and new properties on existing responses can ship without versioning.
Version modules power automatic changelog generation, personalized API docs, and field-level change tracking.
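The backwards transformation described above can be sketched in a few lines. This is a minimal illustration, not Stripe's implementation: the `VersionChange` class, the two example changes, and the field names are all hypothetical, and only the version dates come from the diagram.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Response = Dict[str, object]


@dataclass(frozen=True)
class VersionChange:
    """One hypothetical version change module."""
    version: str                                # version that introduced the change
    description: str                            # documentation of the change
    resource_types: FrozenSet[str]              # resources the change applies to
    downgrade: Callable[[Response], Response]   # newer shape -> older shape


def merge_name(resp: Response) -> Response:
    # Hypothetical change: the newer schema split "name" into two fields.
    out = dict(resp)
    out["name"] = f"{out.pop('first_name')} {out.pop('last_name')}"
    return out


def rename_active(resp: Response) -> Response:
    # Hypothetical change: the newer schema renamed "is_active" to "active".
    out = dict(resp)
    out["is_active"] = out.pop("active")
    return out


CHANGES: List[VersionChange] = [
    VersionChange("2024-09-30", "Split name into first_name and last_name.",
                  frozenset({"customer"}), merge_name),
    VersionChange("2023-10-16", "Renamed is_active to active.",
                  frozenset({"customer"}), rename_active),
]


def render_for_version(response: Response, resource_type: str,
                       target_version: str) -> Response:
    """Walk backwards in time, undoing every change newer than the target."""
    for change in sorted(CHANGES, key=lambda c: c.version, reverse=True):
        # ISO dates compare correctly as strings.
        if change.version > target_version and resource_type in change.resource_types:
            response = change.downgrade(response)
    return response
```

Business logic only ever builds the latest shape. A caller pinned to `2017-05-24` would receive the same data with both changes undone, while a caller on `2024-09-30` gets the response untouched.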
All mutating endpoints (POST) support idempotency via the Idempotency-Key HTTP header. The system uses an atomic phases pattern that divides request lifecycles into discrete phases separated by recovery points.
graph TD
START["Client Request
+ Idempotency Key"] --> P1["Phase 1 (ACID tx)
Upsert idempotency
key record"]
P1 --> P2["Phase 2 (ACID tx)
Create local records
(ride, audit logs)"]
P2 --> FM["Foreign Mutation
Call payment
processor"]
FM --> P3["Phase 3 (ACID tx)
Update with charge ID
Stage background jobs"]
P3 --> DONE["Recovery Point:
finished"]
P1 -.->|"recovery: started"| RP1["RP: started"]
P2 -.->|"recovery: ride_created"| RP2["RP: ride_created"]
FM -.->|"recovery: charge_created"| RP3["RP: charge_created"]
style START fill:#000,stroke:#ff0000,color:#fff
style DONE fill:#000,stroke:#ff0000,color:#fff
style P1 fill:#171717,stroke:#525252,color:#fff
style P2 fill:#171717,stroke:#525252,color:#fff
style P3 fill:#171717,stroke:#525252,color:#fff
style FM fill:#171717,stroke:#dc2626,color:#fff
style RP1 fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style RP2 fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style RP3 fill:#262626,stroke:#a3a3a3,color:#e5e5e5
On retry, the server jumps to the last recovery point and resumes. Finished requests return the cached response immediately. Keys expire after 24 hours; the Reaper deletes records after ~72 hours.
Three background processes support this flow. A completer finds abandoned incomplete requests and pushes them to completion automatically. The Reaper deletes idempotency key records after approximately 72 hours of inactivity. An enqueuer moves jobs from the staged_jobs table to worker queues only after the transaction commits.
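The resume-from-recovery-point behavior can be sketched with an in-memory stand-in. This is a simplified illustration under stated assumptions: the `RideService` class and `charge_fn` callback are hypothetical, a plain dictionary replaces the database, and each `if` branch stands in for what would be a separate ACID transaction.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class IdempotencyKey:
    key: str
    recovery_point: str = "started"
    ride_id: Optional[int] = None
    charge_id: Optional[str] = None
    response: Optional[dict] = None


class RideService:
    def __init__(self, charge_fn: Callable[[int], str]):
        self.keys: Dict[str, IdempotencyKey] = {}
        self.charge_fn = charge_fn   # stands in for the foreign mutation
        self.next_ride_id = 1

    def handle(self, key: str, amount: int) -> dict:
        # Phase 1: upsert the idempotency key record.
        record = self.keys.setdefault(key, IdempotencyKey(key))
        # Resume from whatever recovery point the last attempt reached.
        while record.recovery_point != "finished":
            if record.recovery_point == "started":
                # Phase 2: create local records.
                record.ride_id = self.next_ride_id
                self.next_ride_id += 1
                record.recovery_point = "ride_created"
            elif record.recovery_point == "ride_created":
                # Foreign mutation: may raise, leaving the recovery point as is.
                record.charge_id = self.charge_fn(amount)
                record.recovery_point = "charge_created"
            elif record.recovery_point == "charge_created":
                # Phase 3: store the result and cache the response.
                record.response = {"ride_id": record.ride_id,
                                   "charge_id": record.charge_id}
                record.recovery_point = "finished"
        return record.response
```

If the processor call times out, a retry with the same key skips ride creation and goes straight back to the charge step. Once the key is finished, further retries return the cached response without touching the processor.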
| Error Type | Action | HTTP Code |
|---|---|---|
| Recoverable (timeout) | Unlock key, allow retries | 5xx |
| Unrecoverable (invalid card) | Set recovery point to finished with error | 4xx |
| Concurrent conflict | SERIALIZABLE isolation rejects duplicate | 409 |
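The error-handling table can be read as a small decision procedure. The sketch below is an assumption-laden illustration: the exception classes and `run_with_key` helper are hypothetical, a boolean lock flag stands in for the SERIALIZABLE-isolation conflict check, and 503 and 402 are just example 5xx and 4xx codes.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class RecoverableError(Exception):
    """For example, a timeout talking to the processor."""


class UnrecoverableError(Exception):
    """For example, an invalid card."""


@dataclass
class KeyRecord:
    locked: bool = False
    recovery_point: str = "started"
    response_code: Optional[int] = None


def run_with_key(record: KeyRecord, work: Callable[[], None]) -> int:
    if record.recovery_point == "finished":
        return record.response_code      # replay the stored outcome
    if record.locked:
        return 409                       # another request holds the key
    record.locked = True
    try:
        work()
    except RecoverableError:
        record.locked = False            # unlock so the client can retry
        return 503
    except UnrecoverableError:
        record.locked = False
        record.recovery_point = "finished"   # retries will see the same error
        record.response_code = 402
        return 402
    record.locked = False
    record.recovery_point = "finished"
    record.response_code = 200
    return 200
```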
The PaymentIntent is the central stateful object tracking a customer's attempt to pay. It combines a PaymentMethod (the "how") with the payment amount and currency (the "what"), progressing through seven states.
graph TD
RPM["requires_payment
_method"] --> RC["requires_
confirmation"]
RC --> RA["requires_
action"]
RC --> PROC["processing"]
RA --> PROC
PROC --> SUC["succeeded"]
PROC --> RPM
RC --> RCAP["requires_
capture"]
RCAP --> PROC
RCAP --> RPM
RPM --> CAN["canceled"]
RC --> CAN
RA --> CAN
RCAP --> CAN
style RPM fill:#000,stroke:#ff0000,color:#fff
style SUC fill:#171717,stroke:#16a34a,color:#fff
style CAN fill:#171717,stroke:#dc2626,color:#fff
style RC fill:#262626,stroke:#525252,color:#e5e5e5
style RA fill:#262626,stroke:#ca8a04,color:#e5e5e5
style PROC fill:#262626,stroke:#525252,color:#e5e5e5
style RCAP fill:#262626,stroke:#525252,color:#e5e5e5
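The diagram's edges can be encoded as a transition table that rejects illegal moves. This sketch mirrors only the arrows drawn above; the `advance` helper is hypothetical and the real API may allow transitions the diagram omits.

```python
TRANSITIONS = {
    "requires_payment_method": {"requires_confirmation", "canceled"},
    "requires_confirmation": {"requires_action", "processing",
                              "requires_capture", "canceled"},
    "requires_action": {"processing", "canceled"},
    "processing": {"succeeded", "requires_payment_method"},
    "requires_capture": {"processing", "requires_payment_method", "canceled"},
    "succeeded": set(),   # terminal
    "canceled": set(),    # terminal
}


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

A failed payment attempt, for example, moves from `processing` back to `requires_payment_method`, while any attempt to leave `succeeded` or `canceled` raises.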
Behind the state machine, the payment pipeline coordinates multiple internal services from gateway to ledger.
graph LR
GW["API Gateway
Auth, rate limit,
idempotency"] --> PS["Payments
Service
State machine"]
PS --> TP["Transaction
Processor
Business logic"]
TP --> RADAR["Radar
Fraud scoring"]
TP --> CDV["Card Data
Vault"]
TP --> LED["Ledger
Double-entry
bookkeeping"]
PS --> WH["Notification
System
Webhooks"]
style GW fill:#000,stroke:#ff0000,color:#fff
style PS fill:#171717,stroke:#525252,color:#fff
style TP fill:#171717,stroke:#525252,color:#fff
style RADAR fill:#171717,stroke:#dc2626,color:#fff
style CDV fill:#171717,stroke:#0891b2,color:#fff
style LED fill:#171717,stroke:#16a34a,color:#fff
style WH fill:#171717,stroke:#ca8a04,color:#fff
Stripe delivers event notifications via webhooks, retrying with exponential backoff for up to 3 days in live mode. Each delivery is signed with HMAC-SHA256 over a timestamped payload, which lets receivers verify authenticity and reject replayed requests.
graph TD
EVT["Payment Event
Triggered"] --> SIG["Generate HMAC-SHA256
Signature + Timestamp"]
SIG --> DEL["Deliver to
Endpoint"]
DEL -->|"2xx"| OK["Delivery
Confirmed"]
DEL -->|"Error/Timeout"| RETRY["Exponential
Backoff Queue"]
RETRY --> RESIG["New Signature
+ Timestamp"]
RESIG --> DEL
RETRY -->|"After 3 days"| FAIL["Mark
Failed"]
style EVT fill:#000,stroke:#ff0000,color:#fff
style SIG fill:#171717,stroke:#ca8a04,color:#fff
style DEL fill:#171717,stroke:#525252,color:#fff
style OK fill:#171717,stroke:#16a34a,color:#fff
style RETRY fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style RESIG fill:#262626,stroke:#ca8a04,color:#e5e5e5
style FAIL fill:#171717,stroke:#dc2626,color:#fff
Deliveries are attempted immediately, then after 5 min, 30 min, 2 hours, 5 hours, and 10 hours, then every 12 hours for up to 3 days. Sandbox webhooks retry only 3 times over a few hours. Each retry generates a new signature, because the signature is specific to each delivery attempt.
Handlers must be idempotent because duplicate deliveries are possible during retries.
The Stripe-Signature header contains a timestamp and HMAC-SHA256 hash. The timestamp is part of the signed payload, preventing replay attacks.
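A receiver-side check might look like the sketch below. It follows the scheme in Stripe's public documentation, where the header has the form `t=<timestamp>,v1=<hex digest>` and the signed payload is the timestamp, a dot, and the raw body. The function names and the 300-second tolerance are assumptions for illustration, and in production the official Stripe libraries should do this work.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign(payload: bytes, secret: str, timestamp: int) -> str:
    """Build a header value the way a sender would (used here for testing)."""
    signed = f"{timestamp}.".encode() + payload
    digest = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    return f"t={timestamp},v1={digest}"


def verify(payload: bytes, header: str, secret: str,
           tolerance: int = 300, now: Optional[float] = None) -> bool:
    pairs = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamps = [value for key, value in pairs if key == "t"]
    signatures = [value for key, value in pairs if key == "v1"]
    if not timestamps or not signatures:
        return False
    try:
        timestamp = int(timestamps[0])
    except ValueError:
        return False
    # Reject stale deliveries: this is what blocks replayed requests.
    current = time.time() if now is None else now
    if abs(current - timestamp) > tolerance:
        return False
    signed = f"{timestamp}.".encode() + payload
    expected = hmac.new(secret.encode(), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking digest prefixes via timing.
    return any(hmac.compare_digest(expected.encode(), candidate.encode())
               for candidate in signatures)
```

Because the timestamp is inside the signed bytes, an attacker cannot refresh it on a captured request without invalidating the digest.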
Handlers should return 2xx quickly and process the event asynchronously to avoid timeout retries.
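The two handler guidelines, idempotency and fast acknowledgement, combine naturally. The following is a minimal in-process sketch: the `WebhookReceiver` class is hypothetical, and a real deployment would use a durable store for seen event IDs and a real job queue rather than in-memory structures.

```python
import queue
from typing import Callable


class WebhookReceiver:
    def __init__(self) -> None:
        self.seen_event_ids = set()
        self.pending: "queue.Queue[dict]" = queue.Queue()

    def receive(self, event: dict) -> int:
        """Acknowledge immediately; enqueue only events not seen before."""
        event_id = event["id"]
        if event_id not in self.seen_event_ids:
            self.seen_event_ids.add(event_id)
            self.pending.put(event)
        return 200  # duplicates are acknowledged too, so retries stop

    def drain(self, handler: Callable[[dict], None]) -> int:
        """Process queued events outside the request path."""
        processed = 0
        while not self.pending.empty():
            handler(self.pending.get())
            processed += 1
        return processed
```

A duplicate delivery of the same event is acknowledged with 200 but never reaches the handler a second time.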
Stripe's core product runs in a single Ruby monolith — over 15 million lines of code across 150,000 files. The Sorbet type system, written in C++, made it possible to maintain this monolith at scale rather than being forced into premature microservice decomposition.
graph TD
subgraph Codebase["Ruby Monolith (15M LOC)"]
SRC["150,000
Ruby Files"]
end
subgraph Sorbet["Sorbet (C++ Binary)"]
PARSE["Parser
& Resolver"]
LOCAL["Local Type
Inference"]
MT["Multithreaded
Checker"]
PRESER["Pre-serialized
stdlib defs"]
end
subgraph Output["Results"]
ERR["Error Reports
(ms for 80% of edits)"]
IDE["IDE Integration
(LSP)"]
end
SRC --> PARSE
PARSE --> LOCAL
LOCAL --> MT
PRESER --> PARSE
MT --> ERR
MT --> IDE
style SRC fill:#000,stroke:#ff0000,color:#fff
style PARSE fill:#171717,stroke:#4f46e5,color:#fff
style LOCAL fill:#171717,stroke:#4f46e5,color:#fff
style MT fill:#171717,stroke:#4f46e5,color:#fff
style PRESER fill:#262626,stroke:#525252,color:#e5e5e5
style ERR fill:#262626,stroke:#525252,color:#e5e5e5
style IDE fill:#262626,stroke:#525252,color:#e5e5e5
| Strictness Level | Behavior |
|---|---|
| typed: false | Reports only syntax, constant-resolution, and signature errors; does not type-check method bodies |
| typed: true | Checks method calls and types, allows untyped signatures |
| typed: strict | Requires explicit type signatures on all methods |
By 2017, the most common production failure was NoMethodError. Hundreds of engineers were working in millions of lines of untyped Ruby. Sorbet typechecks at ~100,000 lines/sec per core and reports errors in milliseconds for 80% of edits. Development started November 2017; open-sourced June 2019.
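To put the throughput figure in perspective, here is a rough estimate of a full single-core pass over the codebase, taking 15 million lines as a lower bound and assuming the quoted per-core rate holds across the whole monolith:

$$
\frac{15{,}000{,}000\ \text{lines}}{100{,}000\ \text{lines/s}} = 150\ \text{s}
$$

With ideal scaling across 16 cores (an assumption, since real speedup is rarely linear), the same pass would take roughly $150 / 16 \approx 9.4$ seconds.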
Stripe is certified as PCI Service Provider Level 1 — the most stringent level in the payments industry. The Card Data Vault (CDV) runs in a completely separate hosting environment with no shared credentials.
graph TD
subgraph ClientSide["Client-Side Capture"]
CHECKOUT["Stripe Checkout
(hosted page)"]
ELEMENTS["Stripe Elements
(embedded UI)"]
MOBILE["Mobile SDKs
(iOS, Android)"]
TERMINAL["Terminal SDKs
(in-person)"]
end
subgraph CDVZone["Card Data Vault (Isolated Environment)"]
TOK["Tokenization
Engine"]
ENC["AES-256
Encryption"]
KEYS["Decryption Keys
(separate machines)"]
ALLOW["Provider
Allowlist"]
end
subgraph Main["Primary Stripe Services"]
API["API Server"]
BIZ["Business Logic"]
end
ClientSide -->|"PAN"| TOK
TOK --> ENC
ENC -.-> KEYS
TOK -->|"token only"| API
API --> BIZ
TOK -->|"PAN via allowlist"| ALLOW
ALLOW --> PROC["Payment
Processors"]
style TOK fill:#000,stroke:#0891b2,color:#fff
style ENC fill:#171717,stroke:#0891b2,color:#fff
style KEYS fill:#171717,stroke:#dc2626,color:#fff
style ALLOW fill:#171717,stroke:#0891b2,color:#fff
style API fill:#262626,stroke:#525252,color:#e5e5e5
style BIZ fill:#262626,stroke:#525252,color:#e5e5e5
style PROC fill:#262626,stroke:#525252,color:#e5e5e5
style CHECKOUT fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style ELEMENTS fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style MOBILE fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style TERMINAL fill:#262626,stroke:#a3a3a3,color:#e5e5e5
No internal servers or daemons can obtain plaintext card numbers. Cards can only be sent to service providers on a static allowlist. Decryption keys are stored on separate machines from encrypted data. The CDV has no shared credentials with primary Stripe services.
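The access rules above can be illustrated with a toy vault. This sketch is deliberately simplified and every name in it is hypothetical: it stores card numbers in a plain dictionary, whereas the real vault encrypts them with AES-256 and keeps decryption keys on separate machines.

```python
import secrets
from typing import Callable, Dict, Iterable


class CardDataVault:
    def __init__(self, allowlist: Iterable[str]):
        self._pans: Dict[str, str] = {}
        self._allowlist = frozenset(allowlist)   # static set of providers

    def tokenize(self, pan: str) -> str:
        """Swap a card number for an opaque token with no relation to it."""
        token = "tok_" + secrets.token_hex(12)
        self._pans[token] = pan
        return token

    def forward(self, token: str, provider: str,
                send: Callable[[str, str], object]) -> object:
        """Release the card number only to an allowlisted provider."""
        if provider not in self._allowlist:
            raise PermissionError(f"{provider} is not on the allowlist")
        return send(provider, self._pans[token])
```

There is intentionally no method that hands a card number back to the caller. The only way out of the vault is `forward`, and only toward a provider on the allowlist.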
Radar is Stripe's ML-powered fraud detection system that scores every transaction in under 100 milliseconds using 1,000+ characteristics. The model evolved from a Wide & Deep ensemble to a pure DNN architecture, reducing training time by 85%.
graph LR
TX["Transaction
Enters Pipeline"] --> FE["Feature
Extraction
1,000+ signals"]
FE --> DNN["DNN Model
(ResNeXt-inspired
multi-branch)"]
DNN --> SCORE["Fraud
Probability
Score"]
SCORE -->|"Low risk"| APPROVE["Approve"]
SCORE -->|"Medium risk"| THREEDS["3D Secure
Challenge"]
SCORE -->|"High risk"| BLOCK["Block"]
FE --> EXPLAIN["Feature
Attribution
& Risk Insights"]
style TX fill:#000,stroke:#ff0000,color:#fff
style FE fill:#171717,stroke:#dc2626,color:#fff
style DNN fill:#171717,stroke:#dc2626,color:#fff
style SCORE fill:#171717,stroke:#525252,color:#fff
style APPROVE fill:#171717,stroke:#16a34a,color:#fff
style THREEDS fill:#262626,stroke:#ca8a04,color:#e5e5e5
style BLOCK fill:#262626,stroke:#dc2626,color:#e5e5e5
style EXPLAIN fill:#262626,stroke:#a3a3a3,color:#e5e5e5
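The three-way decision at the end of the pipeline reduces to two thresholds on the fraud probability. The cutoff values below are invented for illustration; the source does not give actual thresholds.

```python
def route(score: float, challenge_threshold: float = 0.3,
          block_threshold: float = 0.8) -> str:
    """Map a fraud probability in [0, 1] to an action.

    Both thresholds are hypothetical placeholders, not real Radar settings.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be a probability between 0 and 1")
    if score >= block_threshold:
        return "block"
    if score >= challenge_threshold:
        return "3d_secure"
    return "approve"
```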
Forensic analysis of past fraud attacks, pattern recognition across the Stripe network, and dark web monitoring with weekly team meetings on emerging trends.
Radar migrated from an XGBoost + DNN ensemble to a DNN-only "Network-in-Neuron" architecture in mid-2022. The team is currently testing 10x-100x increases in training data.
More Stripe merchants means more training data, which means better fraud detection for everyone on the platform.
Stripe Connect enables platforms and marketplaces to route payments between multiple parties. Three charge types support different fund flow topologies, from simple SaaS fee collection to multi-vendor payment splitting.
graph TD
subgraph Direct["Direct Charges"]
DC_CUST["Customer"] -->|"pays"| DC_CA["Connected
Account"]
DC_CA -->|"app fee"| DC_PLAT["Platform"]
end
subgraph Destination["Destination Charges"]
DEST_CUST["Customer"] -->|"pays"| DEST_PLAT["Platform"]
DEST_PLAT -->|"transfers"| DEST_CA["Connected
Account"]
end
subgraph Separate["Separate Charges & Transfers"]
SEP_CUST["Customer"] -->|"pays"| SEP_PLAT["Platform"]
SEP_PLAT -->|"split"| SEP_CA1["Account A"]
SEP_PLAT -->|"split"| SEP_CA2["Account B"]
end
style DC_CUST fill:#262626,stroke:#525252,color:#e5e5e5
style DC_CA fill:#171717,stroke:#ea580c,color:#fff
style DC_PLAT fill:#000,stroke:#ff0000,color:#fff
style DEST_CUST fill:#262626,stroke:#525252,color:#e5e5e5
style DEST_PLAT fill:#000,stroke:#ff0000,color:#fff
style DEST_CA fill:#171717,stroke:#ea580c,color:#fff
style SEP_CUST fill:#262626,stroke:#525252,color:#e5e5e5
style SEP_PLAT fill:#000,stroke:#ff0000,color:#fff
style SEP_CA1 fill:#171717,stroke:#ea580c,color:#fff
style SEP_CA2 fill:#171717,stroke:#ea580c,color:#fff
| Account Type | Control | Dashboard |
|---|---|---|
| Standard | Account holder controls everything | Full Stripe dashboard |
| Express | Platform manages payouts and fund flows | Limited dashboard |
| Custom | Fully white-labeled by platform | None (API only) |
Platform fees are deducted before fund transfer to connected accounts.
Automatic clawback from each party on refunds and chargebacks.
Built-in identity verification flows, either Stripe-hosted or API-based.
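The fund flows for destination charges and for separate charges and transfers come down to integer arithmetic on the smallest currency unit. The helpers below are a hypothetical sketch that ignores Stripe's own processing fees; the function names and account IDs are made up.

```python
from typing import Dict


def destination_charge(amount: int, application_fee: int) -> Dict[str, int]:
    """Platform keeps the fee; the rest is transferred to the connected account."""
    if not 0 <= application_fee <= amount:
        raise ValueError("fee must be between 0 and the charge amount")
    return {"platform": application_fee,
            "connected_account": amount - application_fee}


def split_transfers(amount: int, platform_fee: int,
                    weights: Dict[str, int]) -> Dict[str, int]:
    """Split what remains after the platform fee across accounts by weight."""
    distributable = amount - platform_fee
    total_weight = sum(weights.values())
    payouts = {account: distributable * weight // total_weight
               for account, weight in weights.items()}
    # Floor division can leave a few cents over; hand them out in a
    # deterministic order so the payouts always sum to the distributable amount.
    leftover = distributable - sum(payouts.values())
    for account in sorted(weights)[:leftover]:
        payouts[account] += 1
    return payouts
```

Working in integer cents avoids floating-point rounding, and the leftover handling guarantees that no cent is created or lost in the split.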
Stripe runs on AWS, generating 500 million metrics every 10 seconds across 360 teams. The Core Infrastructure organization abstracts cloud primitives to maintain high reliability at internet scale.
DocDB is built on top of MongoDB to handle petabytes of data at 99.999% uptime. It provides sharding, replication, and zero-downtime migrations.
graph TD
APP["Application
Services"] --> PROXY["Database Proxy
Access control,
validation, routing"]
PROXY --> CMS["Chunk Metadata
Service
Data-to-shard map"]
CMS --> S1["Shard 1"]
CMS --> S2["Shard 2"]
CMS --> SN["Shard N"]
S1 --> R1["Replicas
(CDC sync)"]
S2 --> R2["Replicas
(CDC sync)"]
SN --> RN["Replicas
(CDC sync)"]
style APP fill:#000,stroke:#ff0000,color:#fff
style PROXY fill:#171717,stroke:#9333ea,color:#fff
style CMS fill:#171717,stroke:#9333ea,color:#fff
style S1 fill:#262626,stroke:#525252,color:#e5e5e5
style S2 fill:#262626,stroke:#525252,color:#e5e5e5
style SN fill:#262626,stroke:#525252,color:#e5e5e5
style R1 fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style R2 fill:#262626,stroke:#a3a3a3,color:#e5e5e5
style RN fill:#262626,stroke:#a3a3a3,color:#e5e5e5
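The chunk metadata lookup in the diagram above can be sketched as a sorted range table. Range-based chunking, the class name, and the key format are all assumptions made for illustration; the source only says the service maps data to shards.

```python
import bisect
from typing import List, Tuple


class ChunkMetadataService:
    def __init__(self, chunks: List[Tuple[str, str]]):
        # Each entry is (lower_bound_key, shard). A chunk covers keys from
        # its lower bound up to, but not including, the next lower bound.
        self._chunks = sorted(chunks)
        self._bounds = [lower for lower, _ in self._chunks]

    def shard_for(self, key: str) -> str:
        index = bisect.bisect_right(self._bounds, key) - 1
        if index < 0:
            raise KeyError(f"no chunk covers key {key!r}")
        return self._chunks[index][1]

    def move_chunk(self, lower_bound: str, new_shard: str) -> None:
        """Repoint one chunk, as a migration's traffic switchover would."""
        index = self._bounds.index(lower_bound)
        self._chunks[index] = (lower_bound, new_shard)
```

The proxy would consult `shard_for` on every request, so moving a chunk is a metadata update rather than an application change.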
graph LR
S1["1. Register
& Index"] --> S2["2. Bulk
Import"]
S2 --> S3["3. Async
Replication"]
S3 --> S4["4. Correctness
Verification"]
S4 --> S5["5. Traffic
Switchover"]
S5 --> S6["6. Finalize
& Cleanup"]
style S1 fill:#262626,stroke:#525252,color:#e5e5e5
style S2 fill:#262626,stroke:#525252,color:#e5e5e5
style S3 fill:#262626,stroke:#525252,color:#e5e5e5
style S4 fill:#171717,stroke:#ca8a04,color:#fff
style S5 fill:#171717,stroke:#dc2626,color:#fff
style S6 fill:#262626,stroke:#525252,color:#e5e5e5
Step 4 uses point-in-time snapshot comparison to verify correctness before the switch. Step 5 then blocks source writes, replicates pending changes, and updates chunk metadata; the resulting write pause is measured in seconds.
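The switchover sequence can be modeled with two dictionaries and a replication backlog. This is a toy sketch under stated assumptions: the `Migration` class is hypothetical, it compares full contents instead of point-in-time snapshots, and it folds the verification into the switchover for brevity.

```python
from typing import Dict, List, Tuple


class Migration:
    def __init__(self, source: Dict[str, int], routing: Dict[str, str], chunk: str):
        self.source = source
        self.target: Dict[str, int] = {}
        self.routing = routing            # chunk -> "source" or "target"
        self.chunk = chunk
        self.pending: List[Tuple[str, int]] = []   # writes not yet replicated
        self.writes_blocked = False

    def bulk_import(self) -> None:
        self.target = dict(self.source)   # step 2: copy existing data

    def write(self, key: str, value: int) -> None:
        if self.writes_blocked:
            raise RuntimeError("writes paused for switchover")
        if self.routing[self.chunk] == "source":
            self.source[key] = value
            self.pending.append((key, value))   # step 3: replicate later
        else:
            self.target[key] = value

    def replicate(self) -> None:
        while self.pending:
            key, value = self.pending.pop(0)
            self.target[key] = value

    def verify(self) -> bool:
        return self.source == self.target   # stand-in for step 4

    def switchover(self) -> None:
        self.writes_blocked = True          # step 5: brief write pause
        try:
            self.replicate()
            if not self.verify():
                raise RuntimeError("target diverged from source")
            self.routing[self.chunk] = "target"
        finally:
            self.writes_blocked = False
```

The pause lasts only as long as it takes to drain the backlog and flip the routing entry, which is why keeping replication lag small before step 5 matters.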
Metrics are stored in Amazon Managed Prometheus and visualized in Amazon Managed Grafana. Sharded and tiered storage separates hot data from cold data.
Core Infrastructure covers OS components, databases (MongoDB, PostgreSQL, MySQL), HA/DR, container orchestration, mesh networking, service discovery, and change management.
Engineers are expected to own their observability tooling, building dashboards and alerts for their own services.