Interactive architecture map of Uber's engineering platform — from Domain-Oriented Microservices to real-time dispatch, Kafka backbone, and ML pipelines, compiled from publicly available sources.
Uber grew to approximately 2,200 critical microservices, creating massive complexity where engineers sometimes traced through 50 services across 12 teams for a single investigation. DOMA organizes these into ~70 domains with a strict layered hierarchy.
graph TD
subgraph Edge["Edge Layer"]
GW["API Gateway
Mobile-aware routing"]
end
subgraph Presentation["Presentation Layer"]
RIDER["Rider App
Screens"]
DRIVER["Driver App
Screens"]
EATER["Eater App
Screens"]
end
subgraph Product["Product Layer"]
RIDES["Rides
Domain"]
EATS["Eats
Domain"]
FREIGHT["Freight
Domain"]
end
subgraph Business["Business Layer"]
PAY["Payments"]
IDENT["Identity"]
PRICE["Pricing"]
SAFETY["Safety"]
end
subgraph Infra["Infrastructure Layer"]
KAFKA["Kafka"]
SCHEMA["Schemaless"]
RING["Ringpop"]
OBS["Observability"]
end
GW --> Presentation
Presentation --> Product
Product --> Business
Business --> Infra
style GW fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style RIDER fill:#2A3550,stroke:#5B9EC4,color:#F5E6C8
style DRIVER fill:#2A3550,stroke:#5B9EC4,color:#F5E6C8
style EATER fill:#2A3550,stroke:#5B9EC4,color:#F5E6C8
style RIDES fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style EATS fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style FREIGHT fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style PAY fill:#1B2838,stroke:#C9A84C,color:#F5E6C8
style IDENT fill:#1B2838,stroke:#C9A84C,color:#F5E6C8
style PRICE fill:#1B2838,stroke:#C9A84C,color:#F5E6C8
style SAFETY fill:#1B2838,stroke:#C9A84C,color:#F5E6C8
style KAFKA fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style SCHEMA fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style RING fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style OBS fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
Each domain exposes a single gateway service as the entry point. Upstream consumers call only the gateway, which abstracts internal services. Teams can restructure, rename, or migrate internal services without forcing upstream migrations.
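The gateway pattern above can be sketched as a simple facade (all class and method names here are illustrative, not Uber's actual services): upstream callers depend only on the gateway, so the domain team can restructure internals freely.

```python
# Hypothetical sketch of a domain gateway facade: upstream consumers
# depend only on PaymentsGateway, so internal services can be renamed
# or restructured without forcing upstream migrations.

class PaymentProcessing:
    def charge(self, user_id: str, amount_cents: int) -> dict:
        return {"user": user_id, "amount": amount_cents, "status": "charged"}

class FraudDetection:
    def screen(self, user_id: str, amount_cents: int) -> bool:
        # Toy rule: flag unusually large charges.
        return amount_cents < 1_000_000

class PaymentsGateway:
    """Single entry point for the Payments domain."""
    def __init__(self):
        self._processing = PaymentProcessing()
        self._fraud = FraudDetection()

    def charge(self, user_id: str, amount_cents: int) -> dict:
        if not self._fraud.screen(user_id, amount_cents):
            return {"user": user_id, "status": "rejected"}
        return self._processing.charge(user_id, amount_cents)

gw = PaymentsGateway()
print(gw.charge("u1", 2500)["status"])  # charged
```

Swapping `PaymentProcessing` for a rewritten service changes nothing for callers of `PaymentsGateway`, which is the whole point of the single-gateway rule.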
graph LR
subgraph Upstream["Upstream Consumers"]
S1["Service A"]
S2["Service B"]
end
subgraph Domain["Payments Domain"]
DGW["Domain
Gateway"]
subgraph Internal["Internal Services"]
I1["Payment
Processing"]
I2["Billing
Engine"]
I3["Fraud
Detection"]
I4["Ledger"]
end
end
S1 --> DGW
S2 --> DGW
DGW --> I1
DGW --> I2
DGW --> I3
DGW --> I4
style DGW fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style S1 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style S2 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style I1 fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
style I2 fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
style I3 fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
style I4 fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
DOMA provides two extension mechanisms to prevent tight coupling:
Plugin interfaces let teams register handlers: driver "go online" safety checks, for example, iterate through registered validators without coupling to any specific implementation.
The Protobuf Any type allows attaching arbitrary context to core data models without bloating them; each domain can enrich messages without modifying shared schemas.
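The plugin-interface mechanism can be sketched in a few lines (registry and check names are illustrative): the core flow iterates registered validators without knowing any concrete implementation.

```python
# Minimal sketch of the plugin-interface idea: domain teams register
# "go online" validators; the core flow runs them all without coupling
# to specific implementations. Names are hypothetical.
from typing import Callable

GoOnlineCheck = Callable[[dict], bool]
_checks: list[GoOnlineCheck] = []

def register_check(check: GoOnlineCheck) -> None:
    _checks.append(check)

def can_go_online(driver: dict) -> bool:
    # Core logic only knows the interface, never the concrete checks.
    return all(check(driver) for check in _checks)

# Each team plugs in its own rule independently:
register_check(lambda d: d.get("documents_valid", False))
register_check(lambda d: d.get("safety_training", False))

print(can_go_online({"documents_valid": True, "safety_training": True}))  # True
```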
DISCO (Dispatch Optimization) matches riders to drivers in real time. It uses geospatial sharding with Google S2 cells (~3km), GPS updates every 4-5 seconds, and combines greedy matching with batch optimization.
graph TD
subgraph Clients["Mobile Clients"]
RAPP["Rider App
(WebSocket)"]
DAPP["Driver App
(GPS every 4-5s)"]
end
subgraph Core["DISCO Core"]
DEM["Demand
Service"]
SUP["Supply
Service"]
DISCO["Dispatch
(DISCO)"]
LOC["Location
Service"]
end
subgraph Data["Data Layer"]
KFK["Kafka
REST API"]
MEM["In-Memory
Driver State"]
RP["Ringpop
Hash Ring"]
end
RAPP -->|"ride request"| DEM
DAPP -->|"GPS coords"| LOC
LOC --> KFK
KFK --> MEM
DEM --> DISCO
SUP --> DISCO
DISCO -->|"match"| RAPP
DISCO -->|"assignment"| DAPP
MEM --> SUP
RP --> DISCO
style DISCO fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style DEM fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style SUP fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style LOC fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style RAPP fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style DAPP fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style KFK fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style MEM fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style RP fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
DISCO combines greedy matching for initial assignments with batch optimization that considers multiple pending requests simultaneously. For UberPOOL, it can revise routes on in-progress trips. Objectives: minimize extra driving, minimize rider wait time, minimize overall ETA.
Uses Google S2 library to divide the map into ~3km cells, each with a unique cell ID used as a shard key. When a request comes in, DISCO draws a radius around the rider, filters nearby drivers meeting requirements (distance, ETA, ratings, car type), calculates ETAs, and selects the best match. Driver locations are stored in worker node memory for sub-millisecond access.
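The cell-keyed driver lookup can be illustrated with a toy grid (real DISCO uses S2 cell IDs as shard keys; here a coarse lat/lng bucket stands in for an S2 cell, and all IDs are invented):

```python
# Toy sketch of cell-sharded driver lookup: drivers are indexed by the
# cell containing their last GPS fix; a match query scans the rider's
# cell plus a surrounding ring. A degree-based grid approximates S2 cells.
from collections import defaultdict

CELL_DEG = 0.03  # roughly 3 km of latitude per cell

def cell_id(lat: float, lng: float) -> tuple[int, int]:
    return (int(lat // CELL_DEG), int(lng // CELL_DEG))

drivers_by_cell: dict[tuple[int, int], list[str]] = defaultdict(list)

def report_location(driver_id: str, lat: float, lng: float) -> None:
    drivers_by_cell[cell_id(lat, lng)].append(driver_id)

def nearby_drivers(lat: float, lng: float, ring: int = 1) -> list[str]:
    """Collect drivers from the rider's cell and its surrounding ring."""
    cx, cy = cell_id(lat, lng)
    found = []
    for dx in range(-ring, ring + 1):
        for dy in range(-ring, ring + 1):
            found.extend(drivers_by_cell.get((cx + dx, cy + dy), []))
    return found

report_location("d1", 37.7749, -122.4194)
report_location("d2", 37.7755, -122.4200)
print(nearby_drivers(37.7750, -122.4195))
```

The candidate list returned here would then be filtered by requirements and ranked by ETA, as the paragraph describes.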
Built on Node.js for async/event-driven WebSocket handling. Uses Ringpop for data distribution and cluster membership. Driver locations flow through Kafka REST APIs and are stored in worker node memory.
Uber operates one of the world's largest Kafka deployments: trillions of messages and multiple petabytes per day, organized in a two-tier regional/aggregate topology.
graph TD
subgraph Region1["Region A"]
P1["Producers"]
RC1["Regional
Cluster"]
end
subgraph Region2["Region B"]
P2["Producers"]
RC2["Regional
Cluster"]
end
subgraph Agg["Aggregate Layer"]
AGG["Aggregate
Cluster"]
UREP["uReplicator
(cross-region)"]
end
subgraph Consumers["Consumer Ecosystem"]
UFWD["uForwarder
(push-based)"]
FLINK["Apache Flink"]
SAMZA["Samza"]
HDFS["HDFS
Data Lake"]
PRESTO["Presto SQL"]
end
P1 --> RC1
P2 --> RC2
RC1 -->|"async replicate"| UREP
RC2 -->|"async replicate"| UREP
UREP --> AGG
AGG --> UFWD
AGG --> FLINK
AGG --> SAMZA
AGG --> HDFS
AGG --> PRESTO
style AGG fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style RC1 fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style RC2 fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style UREP fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style P1 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style P2 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style UFWD fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style FLINK fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style SAMZA fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style HDFS fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style PRESTO fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
Active-active consumers: independent consumers in each region process identical topic data; used by surge pricing for seamless regional failover.
Active-passive consumers: a single consumer runs in the primary region, while an offset management service syncs progress across regions so failover resumes at an accurate position.
uForwarder: a push-based Kafka consumer proxy with over 1,000 services onboarded, featuring context-aware routing, head-of-line-blocking mitigation, and adaptive auto-rebalancing.
H3 is Uber's open-source hierarchical hexagonal geospatial indexing system. It uses gnomonic projections on an icosahedron (20-sided polyhedron) with 64-bit cell identifiers across 16 resolution levels.
graph TD
subgraph Grid["Icosahedron Grid"]
ICO["20 Faces
122 Base Cells"]
PEN["12 Pentagons
(ocean-positioned)"]
end
subgraph Resolution["Resolution Levels 0-15"]
R0["Level 0
Coarsest"]
R7["Level 7
~5 km² cells"]
R15["Level 15
Finest"]
end
subgraph API["Key Operations"]
GEO["geoToH3 /
h3ToGeo"]
KRING["kRing(idx, k)
Circular region"]
COMPACT["compact /
uncompact"]
EDGE["Directed
Edges"]
end
subgraph Uses["Uber Applications"]
SURGE["Surge Pricing
Zones"]
POOL["UberPOOL
Matching"]
MARKET["Marketplace
Analytics"]
end
ICO --> R0
PEN --> R0
R0 -->|"1/7 area"| R7
R7 -->|"1/7 area"| R15
R0 --> API
API --> Uses
style ICO fill:#1B2838,stroke:#6B8E4E,color:#F5E6C8
style PEN fill:#1B2838,stroke:#6B8E4E,color:#F5E6C8
style R0 fill:#2A3550,stroke:#C9A84C,color:#F5E6C8
style R7 fill:#2A3550,stroke:#C9A84C,color:#F5E6C8
style R15 fill:#2A3550,stroke:#C9A84C,color:#F5E6C8
style GEO fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style KRING fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style COMPACT fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style EDGE fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style SURGE fill:#0F1B2A,stroke:#E8A317,color:#F5E6C8
style POOL fill:#0F1B2A,stroke:#E8A317,color:#F5E6C8
style MARKET fill:#0F1B2A,stroke:#E8A317,color:#F5E6C8
Each hexagon has 6 neighbors all at equal distance from center. Squares have two distances (edge vs diagonal), complicating gradient analysis. Uniform neighbor distance simplifies smoothing, clustering, and density calculations across geographically diverse urban areas.
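The neighbor-distance claim is easy to verify numerically: on an axial hexagonal grid all six neighbor centers sit at one distance, while a square grid's eight neighbors split into edge and diagonal distances.

```python
# Verify: hexagon neighbors have one center-to-center distance,
# square-grid neighbors have two (edge vs diagonal).
import math

def hex_center(q: int, r: int) -> tuple[float, float]:
    # Axial hex coordinates -> Cartesian, for unit pointy-top hexagons.
    return (math.sqrt(3) * (q + r / 2), 1.5 * r)

hex_neighbors = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]
hex_dists = {round(math.dist(hex_center(0, 0), hex_center(q, r)), 6)
             for q, r in hex_neighbors}

square_neighbors = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]
square_dists = {round(math.hypot(dx, dy), 6) for dx, dy in square_neighbors}

print(len(hex_dists), len(square_dists))  # 1 2
```

One distance class means a single smoothing kernel works uniformly across neighbors, which is why the hexagonal grid simplifies gradient and density analysis.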
Built in 2014 when Uber outgrew its Postgres setup, Schemaless is a MySQL-backed distributed datastore. The core unit is an immutable JSON cell referenced by row key, column name, and ref key. Updates create new versions (append-only, never overwrite).
graph TD
subgraph Clients["Client Applications"]
C1["Service A"]
C2["Service B"]
end
subgraph Workers["Worker Nodes (Stateless)"]
W1["Worker 1
HTTP Handler"]
W2["Worker 2
HTTP Handler"]
end
subgraph Storage["Storage Nodes"]
subgraph Shard1["Shard Group"]
M1["Master"]
MIN1["Minion 1"]
MIN2["Minion 2"]
end
subgraph Shard2["Shard Group"]
M2["Master"]
MIN3["Minion 1"]
MIN4["Minion 2"]
end
end
C1 --> W1
C2 --> W2
W1 -->|"writes"| M1
W1 -->|"reads"| MIN1
W2 -->|"writes"| M2
W2 -->|"reads"| MIN3
M1 -->|"async repl"| MIN1
M1 -->|"async repl"| MIN2
M2 -->|"async repl"| MIN3
M2 -->|"async repl"| MIN4
style M1 fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style M2 fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style W1 fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style W2 fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style C1 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style C2 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style MIN1 fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style MIN2 fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style MIN3 fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
style MIN4 fill:#0F1B2A,stroke:#3A7CA5,color:#F5E6C8
| Column | Type | Purpose |
|---|---|---|
| added_id | Auto-increment PK | Total ordering within shard |
| row_key | UUID | Shard routing (hash-based) |
| column_name | String | Logical column within entity |
| ref_key | Integer | Version identifier (higher = newer) |
| body | Compressed JSON | Immutable cell payload |
Circuit breakers detect node failures. If a master is down, writes are buffered to a randomly chosen alternate master via hinted handoff until replication catches up. The keyspace is fixed at 4,096 shards mapped onto storage nodes. Schemaless later evolved toward distributed SQL, migrating from InnoDB to MyRocks (RocksDB-based) for better compression and write performance.
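The append-only cell model can be sketched with an in-memory store (illustrative only; the real system sits on sharded MySQL): cells are addressed by row key, column name, and ref key, writes never overwrite, and reads return the highest version.

```python
# Sketch of the Schemaless cell model: immutable cells keyed by
# (row_key, column_name, ref_key); updates append a new ref_key.
import json

_cells: dict[tuple[str, str, int], str] = {}

def put_cell(row_key: str, column: str, ref_key: int, body: dict) -> None:
    key = (row_key, column, ref_key)
    if key in _cells:
        raise ValueError("cells are immutable; bump ref_key instead")
    _cells[key] = json.dumps(body)  # stand-in for the compressed JSON body

def get_cell(row_key: str, column: str):
    """Return the latest version of a cell, or None if absent."""
    versions = [rk for (r, c, rk) in _cells if r == row_key and c == column]
    if not versions:
        return None
    return json.loads(_cells[(row_key, column, max(versions))])

put_cell("trip-uuid-1", "BASE", 1, {"status": "requested"})
put_cell("trip-uuid-1", "BASE", 2, {"status": "completed"})
print(get_cell("trip-uuid-1", "BASE"))  # latest version wins
```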
Ringpop is an application-layer sharding library providing consistent hash ring membership, SWIM gossip protocol, and request forwarding. Available in Go and Node.js.
graph LR
REQ["Incoming
Request"]
HASH["Hash to
Ring Position"]
subgraph Ring["Consistent Hash Ring (Red-Black Tree)"]
N1["Node A
(replica points)"]
N2["Node B
(replica points)"]
N3["Node C
(replica points)"]
end
REQ --> HASH
HASH -->|"owns key"| N1
HASH -->|"forward via
TChannel"| N2
N1 -.->|"gossip / SWIM"| N2
N2 -.->|"gossip / SWIM"| N3
N3 -.->|"gossip / SWIM"| N1
style REQ fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style HASH fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style N1 fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style N2 fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style N3 fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
Hash ring: red-black tree with O(log n) lookups, FarmHash for fast, even distribution, and uniform virtual replica points per physical node.
SWIM membership: nodes randomly ping each other over TCP; failed direct pings trigger indirect ping-req probes. Member statuses are alive, suspect, and faulty, with incarnation numbers serving as logical clocks.
Flap damping: identifies unstable nodes with erratic state transitions. Penalties accumulate, and exceeding suppression limits triggers removal from the ring, validated across multiple nodes via the damp-req protocol.
Full sync: membership checksums are compared on contact; mismatches trigger bidirectional full syncs exchanging complete membership data. Ringpop uniquely retains "down" members in the list for partition recovery.
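A minimal consistent hash ring with virtual replica points can be sketched as follows (illustrative: Ringpop uses FarmHash and a red-black tree; here MD5 and a sorted list with binary search stand in):

```python
# Consistent hash ring sketch: each node contributes many replica
# points; a key is owned by the first replica point clockwise of
# its hash. Adding/removing a node only remaps nearby keys.
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes: list[str], replicas: int = 100):
        self._ring: list[tuple[int, str]] = []
        for node in nodes:
            for i in range(replicas):
                self._ring.append((self._hash(f"{node}#{i}"), node))
        self._ring.sort()
        self._points = [p for p, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def lookup(self, key: str) -> str:
        """Owner is the first replica point clockwise of the key's hash."""
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("driver-42")
assert owner == ring.lookup("driver-42")  # lookups are deterministic
print(owner)
```

In Ringpop, a node that receives a request it does not own forwards it over TChannel to the owner returned by this kind of lookup.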
Peloton co-schedules mixed workload types (batch, stateless, stateful, daemon) in a single cluster. Designed for 50,000+ hosts with millions of containers. Built atop Apache Mesos with Zookeeper for service discovery.
graph TD
subgraph API["API Layer"]
CLIENT["Peloton
Client"]
end
subgraph Daemons["Core Daemons"]
JM["Job Manager
Lifecycle + upgrades"]
RM["Resource Manager
Pool hierarchy"]
PE["Placement Engine
Task-to-host mapping"]
HM["Host Manager
Mesos abstraction"]
end
subgraph Backend["Backend Infrastructure"]
ZK["Zookeeper
Discovery + election"]
MESOS["Apache Mesos
Cluster manager"]
end
CLIENT --> JM
JM --> RM
RM --> PE
PE --> HM
HM --> MESOS
JM --> ZK
RM --> ZK
style CLIENT fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style JM fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style RM fill:#1B2838,stroke:#C9A84C,color:#F5E6C8
style PE fill:#1B2838,stroke:#E8A317,color:#F5E6C8
style HM fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style ZK fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style MESOS fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
The resource pool hierarchy divides the cluster by org/team, with four dimensions per pool:
| Dimension | Description |
|---|---|
| Reservation | Minimum guaranteed resources for the pool |
| Limit | Maximum consumable resources (hard ceiling) |
| Share | Relative weight for free capacity allocation |
| Entitlement | Dynamically adjusted current usable resources |
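One plausible reading of those four dimensions (a hedged sketch, not Peloton's actual algorithm): entitlement starts at each pool's reservation, free capacity is split in proportion to share, and the result is clamped by the pool's limit.

```python
# Hedged sketch of entitlement computation from the four pool
# dimensions above. Pool names and numbers are invented.

def entitlements(pools: dict, cluster_capacity: float) -> dict:
    # Start from guaranteed reservations.
    ent = {name: p["reservation"] for name, p in pools.items()}
    free = cluster_capacity - sum(ent.values())
    total_share = sum(p["share"] for p in pools.values())
    for name, p in pools.items():
        # Split free capacity by relative share, capped by the hard limit.
        bonus = free * p["share"] / total_share
        ent[name] = min(p["limit"], ent[name] + bonus)
    return ent

pools = {
    "batch":    {"reservation": 10, "limit": 60, "share": 1},
    "services": {"reservation": 30, "limit": 80, "share": 3},
}
print(entitlements(pools, cluster_capacity=100))
```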
Long-running services with rolling upgrades and health checks.
Cassandra, MySQL, Redis with local disk requirements and careful placement.
Hadoop, Spark, TensorFlow jobs. Preemptible for elastic resource sharing.
Per-host agents like Kafka brokers and HAProxy for system-level services.
Uber's routing engine handles hundreds of thousands of ETA requests per second at single-digit millisecond latency. The DeepETA ML model refines base routing estimates using an encoder-decoder with self-attention.
graph LR
subgraph Request["Client Request"]
REQ["Ride Request
(origin, destination)"]
end
subgraph Routing["Routing Engine (Gurafu)"]
UROUTE["uRoute
Service"]
GRAPH["Road Graph
(layered cells)"]
end
subgraph ML["ML Prediction"]
MICHEL["Michelangelo
Online Serving"]
DEEP["DeepETA Model
(encoder-decoder)"]
end
subgraph Output["Response"]
ETA["Refined ETA
+ Route Line"]
end
REQ --> UROUTE
UROUTE --> GRAPH
GRAPH -->|"base ETA
+ route"| MICHEL
MICHEL --> DEEP
DEEP -->|"residual
correction"| ETA
style REQ fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style UROUTE fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style GRAPH fill:#1B2838,stroke:#6B8E4E,color:#F5E6C8
style MICHEL fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style DEEP fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
style ETA fill:#0F1B2A,stroke:#E8A317,color:#F5E6C8
Evolved from open-source OSRM to an in-house engine. The graph model uses nodes (intersections) and edges (road segments with turn restrictions, speed limits, one-way constraints). Originally used contraction hierarchies (12-hour global rebuilds), but production now divides the graph into layers of small cells that can be preprocessed in parallel when traffic changes.
Hybrid approach: the routing engine predicts a base ETA, then the ML model predicts the residual between the routing estimate and real-world outcomes. A linear attention variant (O(Kd^2) vs. the standard O(K^2d)) keeps production latency in budget, and only ~0.25% of parameters are touched per request.
Uses asymmetric Huber loss with separate underprediction vs overprediction costs, tunable between squared and absolute error modes.
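An asymmetric Huber loss of this shape can be written directly (parameter names are illustrative): quadratic near zero, linear in the tails, with separate weights for underprediction and overprediction.

```python
# Sketch of an asymmetric Huber loss: `delta` tunes between the
# squared-error and absolute-error regimes; `under_w`/`over_w` price
# underprediction vs overprediction differently. Names are illustrative.

def asymmetric_huber(y_true: float, y_pred: float, delta: float = 1.0,
                     under_w: float = 2.0, over_w: float = 1.0) -> float:
    r = y_true - y_pred
    w = under_w if r > 0 else over_w   # penalize underprediction more
    a = abs(r)
    if a <= delta:
        base = 0.5 * a * a                 # squared-error regime
    else:
        base = delta * (a - 0.5 * delta)   # absolute-error regime
    return w * base

# With these weights, underpredicting an ETA by 3 minutes costs twice
# as much as overpredicting it by 3 minutes:
print(asymmetric_huber(10, 7), asymmetric_huber(10, 13))  # 5.0 2.5
```

Pricing underprediction higher matches the product intuition that an ETA which proves too optimistic frustrates riders more than one that proves slightly pessimistic.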
graph TD
subgraph Control["Control Plane"]
API["APIs &
Lifecycle Mgmt"]
CANVAS["Canvas
Auto-Retrain"]
end
subgraph Offline["Offline Data Plane"]
FEAT["Feature
Computation"]
TRAIN["Distributed
Training (Spark)"]
BATCH["Batch
Inference"]
end
subgraph Online["Online Data Plane"]
SERVE["Real-Time
Inference"]
FSERVE["Feature
Serving"]
JOB["Job Controller
(Ray + Spark)"]
end
API --> Offline
API --> Online
CANVAS --> TRAIN
TRAIN --> SERVE
FEAT --> FSERVE
JOB --> TRAIN
JOB --> BATCH
style API fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style CANVAS fill:#1B2838,stroke:#E8A317,color:#F5E6C8
style FEAT fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style TRAIN fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style BATCH fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style SERVE fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style FSERVE fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style JOB fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
Surge pricing dynamically adjusts ride prices based on real-time supply/demand imbalance. The system is event-driven, prioritizing data freshness and availability over strict consistency (AP over CP).
graph TD
subgraph Events["Event Sources"]
RIDE["Ride
Requests"]
GPS["Driver
Locations"]
TRIP["Trip
Completions"]
end
subgraph Ingestion["Ingestion"]
KFK["Apache
Kafka"]
end
subgraph Processing["Stream Processing"]
FLINK["Apache Flink
(geospatial streams)"]
H3C["H3 Cell
Aggregation"]
end
subgraph Storage["Analytics Storage"]
PINOT["Apache Pinot
(OLAP)"]
CACHE["Redis /
Memcached"]
PRSTO["Presto
(ad-hoc)"]
end
subgraph Output["Pricing Output"]
ALGO["Surge
Algorithm"]
PRED["Demand
Forecast"]
end
RIDE --> KFK
GPS --> KFK
TRIP --> KFK
KFK --> FLINK
FLINK --> H3C
H3C --> PINOT
H3C --> CACHE
PINOT --> PRSTO
CACHE --> ALGO
PINOT --> ALGO
PRED --> ALGO
style KFK fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style FLINK fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style H3C fill:#1B2838,stroke:#6B8E4E,color:#F5E6C8
style RIDE fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style GPS fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style TRIP fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style PINOT fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style CACHE fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style PRSTO fill:#0F1B2A,stroke:#5B9EC4,color:#F5E6C8
style ALGO fill:#1B2838,stroke:#E8A317,color:#F5E6C8
style PRED fill:#1B2838,stroke:#E8A317,color:#F5E6C8
Beyond reactive pricing, the system incorporates demand forecasting to anticipate surge conditions before they fully materialize. Supply/demand is measured per H3 hexagonal cell in each city, incorporating real-time factors including traffic, weather, and local events.
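A per-cell surge computation can be sketched as a ratio with a floor and a cap (illustrative only, not Uber's actual pricing formula; the H3 cell IDs are hypothetical):

```python
# Toy per-cell surge multiplier: grows with the demand/supply ratio,
# floored at 1.0 and clamped to a cap. Not the production formula.

def surge_multiplier(demand: int, supply: int, cap: float = 3.0) -> float:
    if supply == 0:
        return cap  # no drivers available: maximum surge
    ratio = demand / supply
    return round(min(cap, max(1.0, ratio)), 2)

# Hypothetical per-H3-cell (demand, supply) counts:
cells = {"8928308280fffff": (12, 4), "8928308280bffff": (3, 5)}
for cell, (demand, supply) in cells.items():
    print(cell, surge_multiplier(demand, supply))
```

In the real pipeline these counts would arrive from the Flink/H3 aggregation layer described above, refreshed continuously per cell.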
Cross-cutting systems that power Uber's microservice ecosystem: workflow orchestration, RPC protocols, distributed tracing, and a robust open-source portfolio.
graph TD
subgraph Workers["Application Workers"]
W1["Worker 1"]
W2["Worker 2"]
end
subgraph Cadence["Cadence Engine"]
FE["Front End
(stateless)"]
HIST["History Service
(workflow steps)"]
MATCH["Matching Service
(task assignment)"]
IW["Internal
Worker"]
end
subgraph Backend["Storage Backend"]
CASS["Cassandra /
MySQL / Postgres"]
ES["Elasticsearch"]
KFK2["Kafka
(optional)"]
end
W1 --> FE
W2 --> FE
FE --> MATCH
MATCH --> HIST
HIST --> IW
HIST --> CASS
FE --> ES
HIST --> KFK2
style FE fill:#A68B3C,stroke:#C9A84C,color:#0F1B2A
style HIST fill:#1B2838,stroke:#C2703E,color:#F5E6C8
style MATCH fill:#1B2838,stroke:#3A7CA5,color:#F5E6C8
style IW fill:#1B2838,stroke:#5B9EC4,color:#F5E6C8
style W1 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style W2 fill:#2A3550,stroke:#D4BC82,color:#F5E6C8
style CASS fill:#0F1B2A,stroke:#A0522D,color:#F5E6C8
style ES fill:#0F1B2A,stroke:#A0522D,color:#F5E6C8
style KFK2 fill:#0F1B2A,stroke:#A0522D,color:#F5E6C8
TChannel: Uber's RPC framing protocol. It supports out-of-order responses (preventing head-of-line blocking), has been benchmarked at 20,000-40,000 ops/sec, and is used by Ringpop for request forwarding.
YARPC: Uber's RPC framework supporting multiple transports (TChannel, HTTP, and gRPC), used by Peloton and other services for cross-service communication via Protocol Buffers.
Jaeger: distributed tracing, essential for debugging across 2,200+ microservices. It implements the OpenTracing/OpenTelemetry standards and is now a CNCF graduated project.
| Project | Purpose | Status |
|---|---|---|
| H3 | Hexagonal geospatial indexing (C library, 64-bit cells) | Open Source |
| Cadence | Workflow orchestration engine | Open Source |
| Jaeger | Distributed tracing (OpenTelemetry) | CNCF Graduated |
| Peloton | Unified resource scheduling (Mesos-based) | Open Source |
| Ringpop | Consistent hash ring + gossip protocol | Open Source |
| uReplicator | Kafka cross-region replication | Open Source |
| uForwarder | Push-based Kafka consumer proxy | Open Source |
| TChannel | RPC framing protocol | Evolved to gRPC |