Architecture Maps

Spotify Platform Architecture

Interactive architecture map of Spotify's engineering infrastructure — microservices, data pipelines, recommendation systems, and developer tooling compiled from publicly available sources.

Public Sources Only · 810+ Microservices · 1T+ Events/Day · Updated: Mar 2026
01

Platform Overview

Spotify's architecture is built on event-driven microservices running on Google Cloud Platform. Over 90 teams and 600+ developers manage 810+ active services, processing more than 1 trillion events per day across 1,800+ distinct event types.

810+ Microservices · 1T+ Events/Day · 1,800+ Event Types · 600+ Developers · 200M+ Monthly Users
High-Level Platform Architecture
graph TD
    subgraph Clients["Client Layer"]
        MOB["Mobile Apps (iOS / Android)"]
        WEB["Web Player"]
        DSK["Desktop App"]
    end
    subgraph Edge["Edge & Delivery"]
        CDN["Multi-CDN (Akamai, CloudFront, Fastly)"]
        API["API Gateway"]
    end
    subgraph Services["Microservices (810+)"]
        STREAM["Audio Streaming"]
        SEARCH["Search"]
        REC["Recommendations"]
        USER["User & Auth"]
        PAY["Payments"]
        POD["Podcast Pipeline"]
    end
    subgraph Platform["Platform Layer"]
        K8S["Kubernetes (GCP)"]
        EDI["Event Delivery Infrastructure"]
        BACK["Backstage Dev Portal"]
    end
    Clients --> Edge
    Edge --> Services
    Services --> Platform
    style MOB fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style CDN fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style API fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style REC fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style K8S fill:#1A1530,stroke:#5DADE2,color:#E8E0F0
    style EDI fill:#1A1530,stroke:#8B44AC,color:#E8E0F0
    style BACK fill:#1A1530,stroke:#F1C40F,color:#E8E0F0
02

Event-Driven Microservices

Each microservice owns its own database and handles a single domain concern. Services communicate via gRPC with Protobuf for synchronous calls and Google Cloud Pub/Sub for asynchronous event processing.

Service Communication Patterns
graph LR
    subgraph Sync["Synchronous (gRPC)"]
        SA["Service A"] -->|"gRPC / Protobuf"| SB["Service B"]
        SB -->|"gRPC / Protobuf"| SC["Service C"]
    end

    subgraph Async["Asynchronous (Events)"]
        SD["Service D"] -->|"publish"| PS["Cloud Pub/Sub Topic"]
        PS -->|"subscribe"| SE["Service E"]
        PS -->|"subscribe"| SF["Service F"]
    end
    subgraph Discovery["Service Discovery"]
        NL["Nameless (registry)"]
    end
    SA -.->|"register"| NL
    SB -.->|"register"| NL
    style SA fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SB fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SC fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SD fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style PS fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style SE fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style SF fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style NL fill:#E67E22,stroke:#E67E22,color:#E8E0F0

Event Delivery Infrastructure (EDI)

Spotify's event delivery system was originally built on Kafka 0.7 and migrated to Google Cloud Pub/Sub in 2016-2017. The Event Service parses incoming events and routes them to per-type Pub/Sub topics. When teams define event schemas, Kubernetes operators automatically deploy the corresponding queues, anonymization pipelines, and streaming jobs.
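Spotify has not published the Event Service's code; as an illustration only, the parse-validate-route flow can be sketched in a few lines of Python (all class, field, and event-type names here are hypothetical):

```python
from collections import defaultdict

class EventRouter:
    """Toy sketch of an event service: validate an event against a
    registered schema, then fan it out to a per-type topic."""

    def __init__(self):
        self.schemas = {}                # event_type -> required field names
        self.topics = defaultdict(list)  # event_type -> delivered events

    def register_schema(self, event_type, required_fields):
        self.schemas[event_type] = set(required_fields)

    def publish(self, event):
        event_type = event["type"]
        required = self.schemas.get(event_type)
        if required is None:
            raise ValueError(f"unknown event type: {event_type}")
        missing = required - event.keys()
        if missing:
            raise ValueError(f"event missing fields: {missing}")
        self.topics[event_type].append(event)  # route to the per-type topic

router = EventRouter()
router.register_schema("track_played", ["type", "user_id", "track_id"])
router.publish({"type": "track_played", "user_id": "u1", "track_id": "t42"})
print(len(router.topics["track_played"]))  # 1
```

In the real system the per-type topics are Cloud Pub/Sub topics and the schema registry is a separate service; here both are collapsed into one in-process object to show the routing idea.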

Event Delivery Pipeline
graph LR
    subgraph Producers["Event Producers"]
        APP["Client Apps"]
        SVC["Backend Services"]
    end
    subgraph EDI["Event Delivery Infrastructure"]
        ES["Event Service (parser + router)"]
        SCHEMA["Schema Registry"]
    end
    subgraph Transport["Transport"]
        PUBSUB["Cloud Pub/Sub (per-type topics)"]
    end
    subgraph Processing["Processing"]
        ANON["Anonymization Pipeline"]
        ETL["ETL / Dataflow Jobs"]
    end
    subgraph Storage["Storage"]
        BQ["BigQuery"]
        GCS["Cloud Storage"]
    end
    APP --> ES
    SVC --> ES
    ES --> SCHEMA
    ES --> PUBSUB
    PUBSUB --> ANON
    ANON --> ETL
    ETL --> BQ
    ETL --> GCS
    style ES fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0
    style PUBSUB fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style BQ fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style GCS fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style ANON fill:#C0392B,stroke:#C0392B,color:#E8E0F0
Scale

The Event Delivery Infrastructure processes 1+ trillion events per day (~70 TB compressed daily) across 1,800+ distinct event types. The largest single service handles ~10 million requests/second.
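These two headline numbers imply each other; a quick back-of-envelope check (averages only, since real traffic is bursty):

```python
# Sanity-check the published scale figures.
events_per_day = 1_000_000_000_000   # 1 trillion events
bytes_per_day = 70 * 10**12          # ~70 TB compressed (decimal TB)
seconds_per_day = 24 * 60 * 60

avg_events_per_sec = events_per_day / seconds_per_day
avg_bytes_per_event = bytes_per_day / events_per_day

print(f"{avg_events_per_sec:,.0f} events/s on average")    # ~11,574,074
print(f"{avg_bytes_per_event:.0f} bytes/event compressed")  # ~70
```

So the fleet-wide average is roughly 11.6 million events per second at about 70 compressed bytes per event, which makes the ~10 million requests/second figure for the single largest service plausible as a peak.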

03

Squad / Tribe / Chapter / Guild

Described in Henrik Kniberg and Anders Ivarsson's 2012 whitepaper "Scaling Agile @ Spotify", Spotify's organizational model maps directly to how microservices are owned and operated. Each squad functions as a mini-startup with full autonomy — a textbook case of Conway's Law.

Organizational Matrix
graph TD
    subgraph Tribe1["Tribe: Mobile Player (<100 people)"]
        SQ1["Squad: Playback (6-12 people)"]
        SQ2["Squad: Queue Management"]
        SQ3["Squad: Offline Mode"]
    end
    subgraph Tribe2["Tribe: Content Platform"]
        SQ4["Squad: Search Experience"]
        SQ5["Squad: Catalog Ingestion"]
    end
    subgraph Chapters["Chapters (within tribe)"]
        CH1["Backend Engineers"]
        CH2["iOS Engineers"]
    end
    subgraph Guilds["Guilds (cross-tribe)"]
        G1["AI Guild"]
        G2["DevOps Guild"]
    end
    SQ1 -.-> CH1
    SQ2 -.-> CH1
    SQ3 -.-> CH2
    SQ4 -.-> CH1
    SQ1 -.-> G1
    SQ4 -.-> G1
    SQ5 -.-> G2
    style SQ1 fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SQ2 fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SQ3 fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style SQ4 fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style SQ5 fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style CH1 fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style CH2 fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style G1 fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style G2 fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0

Squads

6-12 people, cross-functional, full autonomy. Owns one or more microservices end-to-end with a long-term mission.

Core Unit

Tribes

Collection of squads grouped by business area, capped at ~100 people (Dunbar's number) with a Tribe Lead.

Business Area

Chapters

Same-skill engineers within a tribe. Led by a Chapter Lead who serves as line manager.

Skill Group

Guilds

Informal, cross-tribe communities of interest. Volunteer-led, no formal hierarchy.

Community
04

Backstage Developer Portal

Built internally to manage 800+ microservices across 500+ engineering teams. Open-sourced in 2020, now a CNCF incubating project. Three-layer architecture: React frontend, Node.js backend, PostgreSQL database.

Backstage Architecture
graph TD
    subgraph Frontend["Frontend (React)"]
        UI["Plugin-based UI (extension tree)"]
    end
    subgraph Backend["Backend (Node.js)"]
        CAT["Software Catalog"]
        SCAFF["Software Templates"]
        TDOC["TechDocs (Markdown)"]
        K8P["Kubernetes Plugin"]
        SRCH["Search"]
    end
    subgraph Data["Database Layer"]
        PG["PostgreSQL (production)"]
        SQLITE["SQLite (development)"]
    end
    subgraph External["External Integrations"]
        GH["GitHub / GitLab"]
        K8S["Kubernetes Clusters"]
        CACHE["Redis / Memcache"]
    end
    UI --> Backend
    Backend --> PG
    Backend --> SQLITE
    CAT --> GH
    SCAFF --> GH
    K8P --> K8S
    Backend --> CACHE
    style UI fill:#F1C40F,stroke:#F1C40F,color:#0D0B14
    style CAT fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0
    style SCAFF fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0
    style TDOC fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0
    style PG fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style GH fill:#1A1530,stroke:#5DADE2,color:#E8E0F0

Core Plugins

Plugin | Purpose | Type
Software Catalog | Central metadata repository for all services, APIs, libraries, and teams (YAML descriptors) | Core
Software Templates | Standardized project scaffolding: code skeletons, variable injection, VCS publish | Core
TechDocs | Docs-like-code: Markdown kept alongside code, rendered and searchable inside Backstage | Core
Kubernetes | View pod status, deployments, and logs for cataloged services | Plugin
Search | Unified search across catalog entities and documentation | Core
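Services register in the Software Catalog through a catalog-info.yaml descriptor checked into their repository. A minimal example of the format (the component and owner names below are made up for illustration):

```yaml
# catalog-info.yaml: hypothetical descriptor for one service
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: playback-service
  description: Streams audio to client players
  annotations:
    backstage.io/techdocs-ref: dir:.   # TechDocs sources live next to the code
spec:
  type: service
  lifecycle: production
  owner: squad-playback
```

The `spec.owner` field is what ties a catalog entry back to a squad, so service ownership stays visible in the portal.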
Commercial Offering

Spotify Portal for Backstage adds no-code setup, service maturity scoring, incident management integration, and advanced analytics on top of the open-source platform.

05

Audio Streaming & CDN

Spotify serves 50+ million tracks plus images and assets to 200+ million monthly active users through a multi-CDN strategy with adaptive bitrate streaming.

Content Delivery Architecture
graph LR
    subgraph Origin["Origin Storage"]
        S3["AWS S3"]
        GCS["Google Cloud Storage"]
    end
    subgraph CDN["Multi-CDN Layer"]
        AK["Akamai (audio primary)"]
        CF["AWS CloudFront (audio secondary)"]
        FAST["Fastly (metadata, images)"]
    end
    subgraph Quality["Audio Formats"]
        OGG["Ogg Vorbis (primary)"]
        AAC["AAC"]
    end
    subgraph Bitrate["Adaptive Bitrate"]
        B96["96 kbps"]
        B160["160 kbps"]
        B320["320 kbps"]
    end
    subgraph Client["Clients"]
        PLAY["Player"]
    end
    Origin --> CDN
    OGG --> B96
    OGG --> B160
    OGG --> B320
    CDN --> PLAY
    Bitrate -.-> CDN
    style S3 fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style GCS fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style AK fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style CF fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style FAST fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style PLAY fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0

Akamai + CloudFront

Business-critical audio streaming with low latency and high bandwidth. Primary CDN tier for music playback.

Streaming

Fastly (VCL Edge Logic)

Images, client updates, metadata. Uses Varnish Configuration Language for intelligent caching.

Edge

SquadCDN

Internal self-service CDN configuration tool combining Fastly APIs with VCL for team-managed rules.

DevEx
06

Personalization & Recommendations

Spotify's recommendation system powers Discover Weekly, Release Radar, Daily Mixes, and Wrapped. It fuses three core techniques: collaborative filtering, NLP-based content analysis, and deep learning on raw audio.

Recommendation Pipeline
graph TD
    subgraph Signals["Input Signals"]
        LISTEN["Listening History"]
        PLAYLISTS["Playlist Co-occurrence"]
        SOCIAL["Web Crawl (blogs, reviews)"]
        AUDIO["Raw Audio Spectrograms"]
    end
    subgraph Models["ML Models"]
        CF["Collaborative Filtering"]
        NLP["NLP Content Analysis"]
        CNN["CNN Audio Analysis"]
    end
    subgraph Ranking["Ranking Layer"]
        RANK["ML Ranking (GBDT / Neural)"]
        CONTEXT["Context Features (time, device)"]
    end
    subgraph Output["Output"]
        DW["Discover Weekly"]
        RR["Release Radar"]
        DM["Daily Mixes"]
    end
    LISTEN --> CF
    PLAYLISTS --> CF
    SOCIAL --> NLP
    AUDIO --> CNN
    CF --> RANK
    NLP --> RANK
    CNN --> RANK
    CONTEXT --> RANK
    RANK --> Output
    style CF fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style NLP fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style CNN fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style RANK fill:#C0392B,stroke:#C0392B,color:#E8E0F0
    style DW fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style RR fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style DM fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0

Three Pillars of Recommendation

Collaborative Filtering

Analyzes billions of user-created playlists for co-occurrence patterns. Matrix factorization at massive scale.

ML Playlist data
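At its simplest, playlist co-occurrence can be counted directly. Production systems factorize a huge interaction matrix rather than using raw counts, so this is only a toy sketch of the underlying signal:

```python
from collections import Counter

def co_occurrence_recs(playlists, seed_track, k=2):
    """Count how often each track shares a playlist with seed_track and
    return the k most frequent co-occurring tracks."""
    counts = Counter()
    for tracks in playlists:
        if seed_track in tracks:
            for t in tracks:
                if t != seed_track:
                    counts[t] += 1
    return [t for t, _ in counts.most_common(k)]

playlists = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c", "d"],
    ["b", "d"],
]
print(co_occurrence_recs(playlists, "a"))  # b and c each co-occur twice
```

Matrix factorization learns dense track vectors from exactly this kind of co-occurrence signal, which scales to billions of playlists where raw counting cannot.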

Content-Based NLP

Powered by The Echo Nest (acquired 2014). Crawls music blogs and reviews to build "cultural vectors" for artists.

NLP Web crawl

CNN Audio Analysis

4 convolutional + 3 fully-connected layers analyze spectrograms. Extracts tempo, key, energy, danceability directly from audio.

Deep Learning Cold start
Taste Profiles

Each user's musical preferences are represented as a multi-dimensional vector, continuously updated via real-time Kafka streams of listening events. These profiles feed all ranking models with user-specific affinity scores.
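One plausible way to maintain such a profile is an exponentially-weighted update per listening event, with cosine similarity as the affinity score. The vectors, dimensionality, and alpha value below are illustrative assumptions, not Spotify's actual parameters:

```python
def update_taste(profile, track_vec, alpha=0.1):
    """Exponentially-weighted update of a taste vector from one listening
    event: old taste decays, the new track's features blend in."""
    return [(1 - alpha) * p + alpha * t for p, t in zip(profile, track_vec)]

def affinity(profile, track_vec):
    """Cosine similarity between a taste profile and a candidate track."""
    dot = sum(p * t for p, t in zip(profile, track_vec))
    norm_p = sum(x * x for x in profile) ** 0.5
    norm_t = sum(x * x for x in track_vec) ** 0.5
    return dot / ((norm_p * norm_t) or 1.0)

profile = [0.0, 1.0]                         # current taste (2-D for clarity)
profile = update_taste(profile, [1.0, 0.0])  # user plays a different-style track
print(profile)                               # [0.1, 0.9]
```

The decay factor controls how fast the profile chases recent listening; real systems maintain much higher-dimensional vectors and stream the updates from listening events.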

07

Data Infrastructure

Spotify's data platform handles 1+ trillion events/day (~70 TB compressed daily). After migrating from on-premise Hadoop to Google Cloud Platform in 2016, most pipelines are now written in Scio, Spotify's open-source Scala API for Apache Beam.

Data Platform Evolution (On-Prem to GCP)
graph LR
    subgraph Legacy["Era 1: On-Premise (Pre-2016)"]
        KAFKA["Apache Kafka"]
        HADOOP["Hadoop (~2,500 nodes)"]
        LUIGI["Luigi (Python orchestrator)"]
        HIVE["Apache Hive"]
        CASS["Cassandra"]
    end
    subgraph Current["Era 2: Google Cloud (2016+)"]
        PUBSUB["Cloud Pub/Sub"]
        DATAFLOW["Cloud Dataflow (Apache Beam)"]
        BIGQ["BigQuery"]
        BIGTABLE["Cloud Bigtable"]
        GCSTORE["Cloud Storage"]
    end
    KAFKA -->|"replaced by"| PUBSUB
    HADOOP -->|"replaced by"| DATAFLOW
    HIVE -->|"replaced by"| BIGQ
    CASS -->|"replaced by"| BIGTABLE
    style KAFKA fill:#1A1530,stroke:#C0392B,color:#E8E0F0
    style HADOOP fill:#1A1530,stroke:#C0392B,color:#E8E0F0
    style PUBSUB fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style DATAFLOW fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style BIGQ fill:#8B44AC,stroke:#8B44AC,color:#E8E0F0
    style BIGTABLE fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0

GCP Component Mapping

GCP Service | Replaces | Purpose
Cloud Pub/Sub | Apache Kafka | Event transport and queuing
Cloud Dataflow | Hadoop / Storm | Managed batch + streaming execution
BigQuery | Hive / HDFS | SQL analytics warehouse
Cloud Bigtable | Cassandra | High-speed key-value lookups
Cloud Storage | HDFS | Object / file storage
Cloud Spanner | PostgreSQL (some) | Transactional storage

Scio — Spotify's Scala API for Apache Beam

Scio provides a unified batch + streaming programming model with two core primitives: ParDo (parallel processing) and GroupByKey (shuffle). It has native connectors for all GCP services and powers most data pipelines at Spotify.
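Scio itself is Scala, but the semantics of the two primitives are easy to mimic in plain Python. This word count is the canonical Beam example, not Spotify code, and the helper names are invented here:

```python
from collections import defaultdict

def par_do(elements, fn):
    """ParDo: apply fn to every element; fn may emit zero or more outputs."""
    return [out for elem in elements for out in fn(elem)]

def group_by_key(pairs):
    """GroupByKey: shuffle (key, value) pairs into (key, [values])."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

# Word count built from the two primitives:
lines = ["scio runs on beam", "beam runs on dataflow"]
words = par_do(lines, lambda line: line.split())   # one line -> many words
pairs = par_do(words, lambda w: [(w, 1)])          # one word -> one pair
counts = {k: sum(v) for k, v in group_by_key(pairs).items()}
print(counts["runs"], counts["beam"])  # 2 2
```

In a real pipeline the runner distributes ParDo across workers and GroupByKey triggers a network shuffle; the shuffle is the expensive step the SMB technique below is designed to avoid.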

Wrapped Pipeline

Wrapped 2019 was the largest Dataflow job ever run — 5x larger than 2018 at 75% of the cost. Wrapped 2020 introduced Sort Merge Bucket (SMB) joins to eliminate expensive shuffles, replacing Bigtable as an intermediate layer.
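The core idea of an SMB join is that data already sorted and bucketed by key can be joined with a linear merge instead of a shuffle. A simplified single-bucket sketch with hypothetical data (duplicate keys are not handled here):

```python
def sort_merge_join(left, right):
    """Join two (key, value) lists that are already sorted by key,
    advancing two cursors in lockstep: no shuffle, no hash table."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            out.append((lk, lv, rv))  # keys match: emit the joined row
            i += 1
            j += 1
    return out

plays = [("u1", "trackA"), ("u2", "trackB")]     # sorted by user id
profiles = [("u1", "premium"), ("u3", "free")]   # sorted by user id
print(sort_merge_join(plays, profiles))  # [('u1', 'trackA', 'premium')]
```

The one-time cost of writing bucketed, sorted files is amortized across every later join that reads them, which is what made the Wrapped-scale savings possible.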

08

Backend Services & Technology Stack

Spotify's backend is polyglot — primarily Java and Python, with Scala for data pipelines and growing adoption of Kotlin and Go. All services run on Kubernetes on Google Cloud Platform.

Technology Stack
graph TD
    subgraph Languages["Languages & Frameworks"]
        JAVA["Java (Apollo, Spring)"]
        PY["Python (Luigi, ML)"]
        SCALA["Scala (Scio pipelines)"]
        KTGO["Kotlin / Go (newer services)"]
    end
    subgraph Databases["Databases"]
        CASS["Cassandra (playlists, user data)"]
        BT["Bigtable (recommendations)"]
        PG["PostgreSQL (payments)"]
        ES["Elasticsearch (search index)"]
    end
    subgraph Infra["Infrastructure"]
        K8S["Kubernetes (GCP)"]
        DOCKER["Docker"]
        GRPC["gRPC + Protobuf"]
        PUBS["Cloud Pub/Sub"]
    end
    subgraph DevTools["Developer Tools"]
        BSTG["Backstage"]
        APOLLO["Apollo Libraries"]
    end
    Languages --> Infra
    Languages --> Databases
    DevTools --> Languages
    style JAVA fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
    style PY fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style SCALA fill:#C0392B,stroke:#C0392B,color:#E8E0F0
    style K8S fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style CASS fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style ES fill:#F1C40F,stroke:#F1C40F,color:#0D0B14
    style BSTG fill:#F1C40F,stroke:#F1C40F,color:#0D0B14

Apollo Libraries

Spotify's open-source Java libraries for microservices: HTTP server, URI routing, middleware. Used in production for years.

Java Open source

Luigi

Python-based workflow orchestrator born at Spotify. Manages complex pipeline dependencies and job scheduling.

Python Open source

Nameless

Internal service discovery system. Services register on startup and become discoverable to receive traffic.

Infra Internal
09

Podcast Ingestion & Delivery

Spotify's podcast catalog grows by hundreds of thousands of episodes per day. Each episode passes through a DAG-driven pipeline of 6+ ML models for transcription, language detection, topic classification, and preview generation.

Podcast Processing Pipeline
graph LR
    subgraph Ingest["Ingestion"]
        API["Central API (new episodes)"]
        DAG["DAG Router"]
    end
    subgraph ML["ML Enrichment (6+ models)"]
        TRANS["Transcription"]
        LANG["Language Detection"]
        SOUND["Sound Event Detection"]
        TOPIC["Topic Classification"]
        PREV["Preview Generation"]
    end
    subgraph Infra["Infrastructure"]
        KLIO["Klio Framework (Apache Beam)"]
        GPU["NVIDIA T4 GPUs (16GB each)"]
        DF["Cloud Dataflow (autoscaling)"]
    end
    subgraph Delivery["Delivery"]
        CDN["Multi-CDN (Akamai, CloudFront)"]
        ELIDX["Elasticsearch (metadata index)"]
    end
    API --> DAG
    DAG --> ML
    ML --> KLIO
    KLIO --> GPU
    KLIO --> DF
    ML --> CDN
    ML --> ELIDX
    style API fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style DAG fill:#4B3F8C,stroke:#8B44AC,color:#E8E0F0
    style TRANS fill:#E67E22,stroke:#E67E22,color:#E8E0F0
    style KLIO fill:#1A9E6F,stroke:#1A9E6F,color:#E8E0F0
    style GPU fill:#C0392B,stroke:#C0392B,color:#E8E0F0
    style CDN fill:#1A4FA0,stroke:#1A4FA0,color:#E8E0F0
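Spotify has not open-sourced the DAG router itself. As an illustration, Python's stdlib graphlib can run a set of hypothetical stage dependencies in topological order (the stage names mirror the diagram; the dependency edges are assumptions):

```python
from graphlib import TopologicalSorter

# Hypothetical stage dependencies for enriching one episode.
dag = {
    "transcription": set(),
    "sound_event_detection": set(),
    "language_detection": {"transcription"},
    "topic_classification": {"transcription", "language_detection"},
    "preview_generation": {"sound_event_detection"},
}

def run_episode(episode_id):
    """Run enrichment stages in dependency order, recording each result."""
    results = {}
    for stage in TopologicalSorter(dag).static_order():
        results[stage] = f"{stage} done for {episode_id}"
    return results

order = list(TopologicalSorter(dag).static_order())
assert order.index("transcription") < order.index("topic_classification")
print(len(run_episode("ep-1")))  # 5
```

A production router would additionally run independent stages in parallel and retry failed ones, which `TopologicalSorter`'s incremental `get_ready`/`done` API also supports.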
Klio Framework

Spotify's open-source audio processing framework built on Apache Beam. Switching from batch to streaming deployment reduced median preview generation latency from 111.7 minutes to 3.7 minutes — a 30x improvement.

ML Processing Stack

Component | Technology | Purpose
ML Frameworks | TensorFlow, PyTorch, Scikit-learn, Gensim | Ensemble of 6+ models per episode
GPU Hardware | NVIDIA T4 (16GB) | Model inference with fusion breaks for model swapping
Orchestration | Cloud Dataflow | Dynamic autoscaling for batch + streaming
Packaging | Poetry + Docker | Dependency management and containerization
10

Acronym Reference

AAC: Advanced Audio Coding
CDN: Content Delivery Network
CNCF: Cloud Native Computing Foundation
CNN: Convolutional Neural Network
DAG: Directed Acyclic Graph
EDI: Event Delivery Infrastructure
ETL: Extract, Transform, Load
GBDT: Gradient Boosted Decision Trees
GCP: Google Cloud Platform
GCS: Google Cloud Storage
gRPC: gRPC Remote Procedure Calls (recursive acronym)
HDFS: Hadoop Distributed File System
K8s: Kubernetes
ML: Machine Learning
NLP: Natural Language Processing
SMB: Sort Merge Bucket
VCL: Varnish Configuration Language