Interactive architecture map of PostgreSQL's internal subsystems — process model, memory management, query pipeline, MVCC, WAL, replication, and extensibility.
PostgreSQL is a multi-process, object-relational database that uses MVCC for concurrency and Write-Ahead Logging for durability. The postmaster supervises all child processes, allocates shared memory at startup, and forks one backend per client connection.
graph TB
subgraph Clients["Client Applications"]
C1["psql / App 1"]
C2["App 2"]
C3["App N"]
end
PM["Postmaster
Supervisor Daemon"]
subgraph Backends["Backend Processes"]
B1["Backend 1"]
B2["Backend 2"]
B3["Backend N"]
end
subgraph BG["Background Processes"]
CK["Checkpointer"]
BW["BG Writer"]
WW["WAL Writer"]
AV["Autovacuum"]
SC["Stats Collector"]
end
subgraph SM["Shared Memory"]
SB["Shared Buffer Pool"]
WB["WAL Buffers"]
CL["CLOG"]
LK["Lock Tables"]
end
subgraph Storage["Disk Storage"]
DF["Data Files
(base/)"]
WF["WAL Files
(pg_wal/)"]
XA["CLOG
(pg_xact/)"]
end
C1 & C2 & C3 --> PM
PM --> B1 & B2 & B3
B1 & B2 & B3 --> SM
BG --> SM
SM --> Storage
style PM fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
style SB fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style WB fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style CL fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style LK fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style DF fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
style WF fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
style XA fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
style B1 fill:#252540,stroke:#5B8FB9,color:#E8E4F0
style B2 fill:#252540,stroke:#5B8FB9,color:#E8E4F0
style B3 fill:#252540,stroke:#5B8FB9,color:#E8E4F0
style CK fill:#252540,stroke:#9B6FBE,color:#E8E4F0
style BW fill:#252540,stroke:#9B6FBE,color:#E8E4F0
style WW fill:#252540,stroke:#9B6FBE,color:#E8E4F0
style AV fill:#252540,stroke:#9B6FBE,color:#E8E4F0
style SC fill:#252540,stroke:#9B6FBE,color:#E8E4F0
style C1 fill:#1C1C35,stroke:#B0A8C0,color:#E8E4F0
style C2 fill:#1C1C35,stroke:#B0A8C0,color:#E8E4F0
style C3 fill:#1C1C35,stroke:#B0A8C0,color:#E8E4F0
PostgreSQL uses a multi-process model (not threads). The postmaster listens on port 5432, forks one backend per client connection, and supervises background maintenance processes. When a child crashes, the postmaster terminates all children and initiates crash recovery.
graph TD
PM["Postmaster
Parent process · ServerLoop"]
subgraph Maintenance["Background Maintenance"]
CK["Checkpointer
Flushes dirty pages"]
BW["BG Writer
Proactive page flush"]
WW["WAL Writer
Flushes WAL buffers"]
AV["Autovacuum Launcher
Spawns workers"]
SC["Stats Collector
Activity statistics"]
AR["Archiver
WAL archiving"]
IO["IO Workers
Async page reads"]
end
subgraph Repl["Replication"]
WS["WAL Sender
Streams WAL to standby"]
WR["WAL Receiver
(on standby only)"]
SU["Startup Process
WAL replay"]
end
subgraph Workers["Backend Workers"]
B1["Client Backend"]
B2["Client Backend"]
PQ["Parallel Query
Worker"]
BW2["Background
Worker (ext)"]
end
PM --> Maintenance
PM --> Repl
PM --> Workers
style PM fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
style CK fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style BW fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style WW fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style AV fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style SC fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style AR fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style IO fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style WS fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style WR fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style SU fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style B1 fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style B2 fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style PQ fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style BW2 fill:#1C1C35,stroke:#D4839E,color:#E8E4F0
The postmaster supports three shutdown modes, each triggered by a different signal. Smart shutdown waits for clients to disconnect; fast shutdown terminates backends immediately and checkpoints; immediate shutdown exits ungracefully and requires crash recovery on restart.
| Mode | Signal | Behavior |
|---|---|---|
| Smart | SIGTERM | Wait for all clients to disconnect, then checkpoint and exit |
| Fast | SIGINT | Terminate backends immediately, perform final checkpoint |
| Immediate | SIGQUIT | Ungraceful exit; crash recovery required on restart |
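Assuming a Unix host, the mode-to-signal mapping can be expressed directly with Python's signal module; the send_shutdown helper is hypothetical, standing in for what `pg_ctl stop -m <mode>` does:

```python
import os
import signal

# Shutdown mode -> signal, per the table above.
SHUTDOWN_SIGNALS = {
    "smart": signal.SIGTERM,       # wait for clients, checkpoint, exit
    "fast": signal.SIGINT,         # kill backends, checkpoint, exit
    "immediate": signal.SIGQUIT,   # exit now; crash recovery on restart
}

def send_shutdown(postmaster_pid, mode="fast"):
    """Hypothetical pg_ctl-like wrapper: deliver the mode's signal."""
    os.kill(postmaster_pid, SHUTDOWN_SIGNALS[mode])
```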
Processes communicate through shared memory (buffer pool, lock tables, PGPROC array, CLOG) for data and Unix signals for control flow. The max_connections parameter (default 100) limits backend processes, and max_worker_processes (default 8) limits custom background workers.
At startup, the postmaster allocates a large shared memory segment containing the buffer pool, WAL buffers, CLOG cache, lock tables, and process tracking arrays. Each backend inherits access to this shared region.
graph LR
subgraph SharedMem["Shared Memory Region"]
SB["Shared Buffer Pool
shared_buffers (128 MB)"]
WB["WAL Buffers
wal_buffers (~3%)"]
CL["CLOG Buffers
Transaction states"]
PP["PGPROC Array
Per-backend tracking"]
LT["Lock Tables
Heavy/Light/Spin"]
end
subgraph Local["Per-Backend Memory"]
WM["work_mem (4 MB)
Sorts, hashes"]
MM["maintenance_work_mem
(64 MB) VACUUM, INDEX"]
TB["temp_buffers (8 MB)
Temp table cache"]
end
B["Backend Process"] --> SharedMem
B --> Local
style SB fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style WB fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style CL fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style PP fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style LT fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style WM fill:#252540,stroke:#D4839E,color:#E8E4F0
style MM fill:#252540,stroke:#D4839E,color:#E8E4F0
style TB fill:#252540,stroke:#D4839E,color:#E8E4F0
style B fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
All data is stored in 8 KB pages. The buffer pool maintains three parallel structures: buffer blocks (actual page data), buffer descriptors (64-byte metadata per slot tracking dirty/valid/pinned state), and a partitioned hash table mapping (tablespace, relation, fork, block) to slot index for O(1) lookup.
When a free buffer slot is needed, a clock hand sweeps through descriptors. If a page's usage count is greater than zero, it is decremented and skipped. If usage count is zero and the page is unpinned, it is selected as the eviction victim. Each page access increments usage count up to a cap of 5, approximating LRU without linked-list overhead.
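The sweep described above can be sketched in a few lines of Python — an illustrative model, not PostgreSQL's actual C implementation:

```python
class BufferDesc:
    def __init__(self):
        self.usage_count = 0
        self.pin_count = 0

def access(buf):
    buf.usage_count = min(buf.usage_count + 1, 5)  # cap at 5 approximates LRU

def clock_sweep(buffers, hand=0):
    """Advance the clock hand until an unpinned buffer with
    usage_count == 0 is found; return its slot index."""
    n = len(buffers)
    while True:
        buf = buffers[hand % n]
        if buf.pin_count == 0:
            if buf.usage_count == 0:
                return hand % n          # eviction victim
            buf.usage_count -= 1         # decrement and give it another lap
        hand += 1

pool = [BufferDesc() for _ in range(4)]
access(pool[0]); access(pool[0])   # slot 0 recently used twice
pool[1].pin_count = 1              # slot 1 pinned by a backend
victim = clock_sweep(pool)
print(victim)  # 2: first unpinned buffer whose usage count was already zero
```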
For large sequential scans, VACUUM, and bulk writes, PostgreSQL uses a small ring buffer (256 KB - 16 MB) instead of the main pool. This prevents large operations from evicting useful cached pages ("buffer pollution").
Every SQL statement flows through five stages: Parser, Analyzer, Rewriter, Planner, and Executor. PostgreSQL's source code describes this as "the most complicated subsystem."
graph LR
SQL["SQL Text"] --> P["Parser
Flex + Bison"]
P --> A["Analyzer
Semantic validation"]
A --> R["Rewriter
View expansion"]
R --> PL["Planner
Cost-based optimizer"]
PL --> E["Executor
Volcano / pull model"]
E --> RES["Result Tuples"]
CAT["System Catalogs
pg_class, pg_proc"]
STATS["Statistics
pg_statistic"]
CAT -.-> A
CAT -.-> R
STATS -.-> PL
style SQL fill:#252540,stroke:#B0A8C0,color:#E8E4F0
style P fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style A fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style R fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style PL fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style E fill:#1C1C35,stroke:#D4839E,color:#E8E4F0
style RES fill:#252540,stroke:#B0A8C0,color:#E8E4F0
style CAT fill:#2E2E4F,stroke:#5B8FB9,color:#E8E4F0
style STATS fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
**Parser:** Uses Flex for tokenization and Bison for grammar rules (~15,000 lines in gram.y). Produces a raw parse tree of Node structures. Only checks syntax; it does not verify that tables exist.
**Analyzer:** Performs semantic analysis: looks up tables, columns, and functions in the system catalogs. Resolves names to OIDs, infers data types, and inserts implicit casts. Produces a Query struct.
**Rewriter:** Applies rule-based transformations from pg_rewrite. Expands views into their underlying queries. Can produce multiple query trees from one input (INSTEAD rules).
**Planner:** Cost-based optimizer that evaluates scan types (SeqScan, IndexScan, BitmapScan) and join algorithms (Nested Loop, Hash, Merge), switching to GEQO at 12+ joins. Outputs a PlannedStmt.
**Executor:** Implements the pull-based Volcano model: each plan node has ExecInit, ExecProcNode (returns one tuple), and ExecEnd. The top node pulls tuples recursively from its children.
DDL statements (CREATE, ALTER, DROP) bypass the planner/executor entirely and go through ProcessUtility, a separate dispatch function for utility commands.
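The pull model is easy to sketch with Python generators; the node classes here are illustrative stand-ins for the C plan-node structs, not real PostgreSQL types:

```python
class SeqScan:
    """Leaf node: emits stored tuples one at a time."""
    def __init__(self, rows):
        self.rows = rows
    def __iter__(self):
        yield from self.rows

class Filter:
    """Pulls from its child and passes through matching tuples."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def __iter__(self):
        for t in self.child:
            if self.pred(t):
                yield t

class Limit:
    """Stops pulling after n tuples, so the child does no further work."""
    def __init__(self, child, n):
        self.child, self.n = child, n
    def __iter__(self):
        for i, t in enumerate(self.child):
            if i >= self.n:
                return
            yield t

# SELECT * FROM t WHERE x % 2 = 0 LIMIT 3, as a pull-based plan tree.
plan = Limit(Filter(SeqScan([(i,) for i in range(100)]),
                    lambda t: t[0] % 2 == 0), 3)
print(list(plan))   # [(0,), (2,), (4,)]
```

Because the top node pulls on demand, the Limit stops the whole pipeline early — the scan never touches rows beyond what the filter needed to satisfy three matches.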
PostgreSQL implements Snapshot Isolation via MVCC. Multiple physical versions of each row coexist. Readers never block writers. Each tuple carries header fields tracking which transaction created and deleted it.
graph LR
subgraph INSERT["INSERT (XID 100)"]
T1["Tuple v1
t_xmin=100
t_xmax=0"]
end
subgraph UPDATE["UPDATE (XID 200)"]
T2["Tuple v1
t_xmin=100
t_xmax=200"]
T3["Tuple v2
t_xmin=200
t_xmax=0"]
T2 -->|"t_ctid"| T3
end
subgraph DELETE["DELETE (XID 300)"]
T4["Tuple v2
t_xmin=200
t_xmax=300"]
end
T1 -->|"UPDATE"| T2
T3 -->|"DELETE"| T4
style T1 fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style T2 fill:#2E2E4F,stroke:#A63D40,color:#E8E4F0
style T3 fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style T4 fill:#2E2E4F,stroke:#A63D40,color:#E8E4F0
| Field | Purpose |
|---|---|
| t_xmin | Transaction ID that inserted this tuple version |
| t_xmax | Transaction ID that deleted/updated this version (0 if still live) |
| t_cid | Command ID within the transaction (distinguishes multiple operations) |
| t_ctid | Pointer to the next version (self-referencing if latest) |
**Read Committed:** The default level. Takes a new snapshot per statement, so non-repeatable reads are possible between statements in the same transaction.
**Repeatable Read:** Takes one snapshot at transaction start. Prevents dirty reads and non-repeatable reads; raises a serialization error when a concurrent transaction has already updated a row it tries to modify.
**Serializable:** True serializability via SSI (added in PG 9.1). Tracks rw-conflict dependencies and aborts transactions with dangerous orderings to prevent write skew.
A tuple is visible if its t_xmin committed before the snapshot was taken AND its t_xmax is 0, still in progress, aborted, or committed after the snapshot. The CLOG (pg_xact/) tracks transaction states with 2 bits per XID. Hint bits on tuple headers cache commit/abort status to avoid repeated CLOG lookups.
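A simplified model of that visibility rule, assuming a snapshot reduced to a single XID horizon and a set of known-committed XIDs (real snapshots also carry an in-progress XID array, and the server consults CLOG and hint bits):

```python
FROZEN_XID = 2   # FrozenTransactionId: always counts as committed in the past

def tuple_visible(t_xmin, t_xmax, snapshot_xmax, committed):
    # Inserter must have committed before the snapshot horizon.
    inserted_ok = (t_xmin == FROZEN_XID or
                   (t_xmin in committed and t_xmin < snapshot_xmax))
    if not inserted_ok:
        return False
    # Deleter must be absent, aborted/in-progress, or after the horizon.
    if t_xmax == 0:
        return True
    if t_xmax not in committed:        # aborted or still running
        return True
    return t_xmax >= snapshot_xmax     # committed after the snapshot

committed = {100, 200}
# Snapshot taken at XID 150: sees version 1 (inserted by 100), not the
# update by 200.
print(tuple_visible(100, 200, 150, committed))  # True: deleter after snapshot
print(tuple_visible(200, 0, 150, committed))    # False: inserter after snapshot
```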
The WAL guarantees durability: no dirty page is flushed to disk until the corresponding WAL record is on stable storage. WAL is stored in pg_wal/ as 16 MB segment files, each record identified by a 64-bit Log Sequence Number (LSN).
graph TD
TX["Transaction
INSERT / UPDATE / DELETE"]
XI["XLogInsert()
Append to WAL buffers"]
XF["XLogFlush()
Flush to pg_wal/"]
CM["COMMIT
Durable on disk"]
subgraph Recovery["Crash Recovery"]
SX["StartupXLOG()
Read pg_control"]
RP["Find Redo Point
(last checkpoint LSN)"]
RL["Replay WAL Records
(idempotent)"]
DN["Consistent State
Normal ops begin"]
end
TX --> XI
XI --> XF
XF --> CM
SX --> RP
RP --> RL
RL --> DN
style TX fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
style XI fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style XF fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style CM fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
style SX fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style RP fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style RL fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style DN fill:#1E7268,stroke:#2A9D8F,color:#E8E4F0
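LSNs display as two 32-bit hex halves joined by a slash (e.g. 0/16B3748). A sketch of parsing and byte arithmetic on them, assuming the default 16 MB segment size:

```python
SEG_SIZE = 16 * 1024 * 1024   # default WAL segment size

def parse_lsn(text):
    """Convert 'hi/lo' hex notation to a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def lsn_diff(a, b):
    """Byte distance between two LSNs, e.g. replication lag."""
    return parse_lsn(a) - parse_lsn(b)

def segment_no(text):
    """Which 16 MB segment a given LSN falls into."""
    return parse_lsn(text) // SEG_SIZE

print(parse_lsn("0/0"))            # 0
print(lsn_diff("1/0", "0/0"))      # 4294967296: one full high-word step
print(segment_no("0/2000000"))     # 2
```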
A checkpoint flushes all dirty buffers from shared memory to disk and writes a checkpoint record to WAL. The redo point is the LSN where crash recovery would begin. Controlled by checkpoint_timeout (default 5 min) and max_wal_size (default 1 GB). The checkpointer spreads I/O using checkpoint_completion_target (default 0.9).
After each checkpoint, the first modification to any page writes the entire 8 KB image to WAL (not just the delta). This protects against torn pages during OS crashes. Subsequent modifications before the next checkpoint only write deltas. Controlled by full_page_writes (default on -- should never be disabled).
The wal_level setting controls how much is logged: minimal covers only crash recovery; replica (the default since PG 10) adds what archiving and physical replication need; logical adds the row-level information logical decoding needs.
PostgreSQL supports six built-in index access methods, each with different data structures and query capabilities. The B-tree is the default and handles the vast majority of use cases.
graph TD
Q["What query pattern?"]
Q -->|"=, <, >, BETWEEN"| BT["B-tree
(Lehman-Yao)"]
Q -->|"= only"| HA["Hash Index"]
Q -->|"Geometric, KNN"| GI["GiST
(Generalized Search)"]
Q -->|"Contains, FTS"| GIN["GIN
(Inverted Index)"]
Q -->|"Quadtree, Trie"| SP["SP-GiST
(Space-Partitioned)"]
Q -->|"Time-series range"| BR["BRIN
(Block Range)"]
style Q fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
style BT fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style HA fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style GI fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style GIN fill:#1C1C35,stroke:#D4839E,color:#E8E4F0
style SP fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style BR fill:#1C1C35,stroke:#2D6A4F,color:#E8E4F0
**B-tree:** Lehman-Yao concurrent B-tree with right-link pointers. O(log n) lookups. Supports range queries, sorting, index-only scans, and deduplication (PG 13+).
**Hash:** Bucket-based index for equality-only comparisons. WAL-logged since PG 10. Faster than a B-tree for pure equality on high-cardinality columns, but cannot sort or serve range queries.
**GiST:** Balanced tree framework supporting arbitrary data types via operator classes. "Lossy" indexes with false positives filtered by the executor. Used for geometry, FTS, range types, and KNN.
**SP-GiST:** Framework for non-balanced, space-partitioning structures: quadtrees, k-d trees, radix tries. Best for data with natural clustering but uneven distribution (IPs, phone numbers).
**GIN:** Inverted index mapping element values to row sets. Entry tree (a B-tree of keys) plus posting lists/trees. Ideal for tsvector, arrays, JSONB containment, and trigram similarity.
**BRIN:** Stores min/max summaries for ranges of physical pages, often 1000x smaller than a B-tree. Only effective when values correlate with physical row order, as in append-only time-series tables.
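BRIN's pruning logic can be sketched as follows; the range size and data are illustrative, and real summaries live in index pages rather than a Python list:

```python
def build_brin(values, pages_per_range=2, rows_per_page=2):
    """One (min, max) summary per block range of physically ordered rows."""
    rows_per_range = pages_per_range * rows_per_page
    summaries = []
    for i in range(0, len(values), rows_per_range):
        chunk = values[i:i + rows_per_range]
        summaries.append((min(chunk), max(chunk)))
    return summaries

def ranges_to_scan(summaries, lo, hi):
    """Block-range indexes whose [min, max] overlaps the query [lo, hi];
    every other range is skipped without touching its heap pages."""
    return [i for i, (mn, mx) in enumerate(summaries)
            if mx >= lo and mn <= hi]

# Physically ordered (time-series-like) values: pruning is effective.
values = list(range(16))                  # 4 ranges of 4 rows each
idx = build_brin(values)
print(idx)                                # [(0, 3), (4, 7), (8, 11), (12, 15)]
print(ranges_to_scan(idx, 5, 6))          # [1]: only one range is scanned
```

If the same values were physically shuffled, every summary would span nearly the whole value domain and no range could be skipped — which is why BRIN depends on correlation with row order.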
MVCC never removes old tuple versions in-place. Dead tuples accumulate from DELETEs and UPDATEs and must be reclaimed by VACUUM. Without it, tables bloat and eventually suffer catastrophic XID wraparound.
graph LR
subgraph Phase1["Phase 1: Heap Scan"]
HS["Scan heap pages"]
VM["Check Visibility Map
Skip all-visible pages"]
DT["Collect dead TIDs
into array"]
end
subgraph Phase2["Phase 2: Index Vacuum"]
IS["Scan each index"]
RE["Remove entries
pointing to dead TIDs"]
end
subgraph Phase3["Phase 3: Heap Vacuum"]
FS["Mark dead slots
in Free Space Map"]
UV["Update Visibility Map
Set all-visible bits"]
end
Phase1 --> Phase2
Phase2 --> Phase3
style HS fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style VM fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style DT fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style IS fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style RE fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style FS fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style UV fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
The autovacuum launcher wakes every autovacuum_naptime (default 1 min) and checks pg_stat_all_tables for tables exceeding the dead tuple threshold: dead_tuples > threshold + scale_factor * live_tuples (defaults: 50 + 20%). Up to autovacuum_max_workers (default 3) workers run concurrently, throttled by cost-based delay to limit production impact.
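The trigger condition with the stock defaults (autovacuum_vacuum_threshold = 50, autovacuum_vacuum_scale_factor = 0.2):

```python
def needs_autovacuum(dead_tuples, live_tuples,
                     threshold=50, scale_factor=0.2):
    """True when the table crosses the dead-tuple trigger formula."""
    return dead_tuples > threshold + scale_factor * live_tuples

# For a table with 10,000 live tuples the limit is 50 + 2000 = 2050.
print(needs_autovacuum(2049, 10000))   # False: just under the limit
print(needs_autovacuum(2051, 10000))   # True: vacuum is scheduled
```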
XIDs are 32-bit (~4 billion). Without freezing, after ~2 billion transactions old XIDs would appear "in the future" and tuples would vanish. VACUUM freezes tuples by replacing t_xmin with FrozenTransactionId (XID 2). Key thresholds:
| Parameter | Default | Purpose |
|---|---|---|
| vacuum_freeze_min_age | 50M txns | Minimum tuple age before freezing |
| vacuum_freeze_table_age | 150M txns | Triggers aggressive full-table vacuum scan |
| autovacuum_freeze_max_age | 200M txns | Forces anti-wraparound vacuum regardless of dead tuples |
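The "in the future" effect comes from modular comparison: PostgreSQL orders two XIDs by the sign of their 32-bit difference, so roughly 2 billion XIDs lie in the past and 2 billion in the future of any given XID. A sketch of that comparison:

```python
def xid_precedes(a, b):
    """True if XID a is 'older than' b on the 32-bit circle
    (sign of the 32-bit signed difference, as in TransactionIdPrecedes)."""
    diff = (a - b) & 0xFFFFFFFF
    if diff >= 0x80000000:        # reinterpret as signed 32-bit
        diff -= 0x100000000
    return diff < 0

print(xid_precedes(100, 200))            # True: 100 is in the past of 200
print(xid_precedes(100, 100 + 2**31))    # True: just inside the horizon
# One step past the 2-billion horizon and the comparison flips: an
# unfrozen tuple would suddenly look like it was inserted "in the future".
print(xid_precedes(100, (100 + 2**31 + 1) % 2**32))  # False
```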
**VACUUM FULL:** Rewrites the entire table into a new file, compacting out all dead space. Requires an ACCESS EXCLUSIVE lock (blocks all reads and writes). Unlike regular VACUUM, which only marks space for internal reuse, it returns disk space to the OS. Use sparingly.
PostgreSQL supports two replication paradigms: physical (byte-level WAL streaming for exact cluster copies) and logical (row-level change streaming for selective table replication across versions).
graph LR
subgraph Primary["Primary Server"]
BE["Backend
Writes data"]
WAL1["pg_wal/
WAL segments"]
WS["WAL Sender"]
end
subgraph Standby["Standby Server"]
WR["WAL Receiver"]
WAL2["pg_wal/
WAL segments"]
SU["Startup Process
Replay WAL"]
RO["Read-Only
Queries (Hot Standby)"]
end
BE --> WAL1
WAL1 --> WS
WS -->|"XLogData messages"| WR
WR --> WAL2
WAL2 --> SU
SU --> RO
style BE fill:#7B4F9E,stroke:#9B6FBE,color:#F5F0E8
style WS fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style WR fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style WAL1 fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
style WAL2 fill:#2E2E4F,stroke:#E6A817,color:#E8E4F0
style SU fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style RO fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
graph LR
subgraph Pub["Publisher"]
WS2["WAL Sender"]
LD["Logical Decoder
pgoutput plugin"]
RB["Reorder Buffer
Reassemble TXs"]
end
subgraph Sub["Subscriber"]
AW["Apply Worker
Map + apply changes"]
TS["Table Sync Worker
Initial COPY"]
end
WS2 --> LD
LD --> RB
RB -->|"Streaming protocol"| AW
TS -->|"Initial sync"| AW
style WS2 fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style LD fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style RB fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style AW fill:#1C1C35,stroke:#5B8FB9,color:#E8E4F0
style TS fill:#1C1C35,stroke:#D4839E,color:#E8E4F0
| Aspect | Physical | Logical |
|---|---|---|
| Scope | Entire cluster (byte-for-byte copy) | Selected tables / columns |
| Cross-version | Same major version only | Cross-version supported |
| Subscriber writes | Read-only standby | Writable subscriber |
| WAL level | replica | logical |
| Setup | Simpler (pg_basebackup) | Publication + subscription |
Slots prevent WAL recycling for disconnected consumers. Physical slots track LSN position; logical slots also track the oldest XID needed for catalog lookups. Failover slots (PG 17+) can be synced to standbys for seamless promotion.
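A sketch of the retention rule, reusing the textual LSN form described earlier; oldest_segment_to_keep is a hypothetical helper, not a server function:

```python
SEG_SIZE = 16 * 1024 * 1024   # default WAL segment size

def parse_lsn(text):
    """Convert 'hi/lo' hex notation to a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def oldest_segment_to_keep(slot_restart_lsns):
    """Segments at or after this number must be retained: the server may
    only recycle WAL older than the minimum restart LSN across slots."""
    return min(parse_lsn(l) for l in slot_restart_lsns) // SEG_SIZE

slots = ["0/5000000", "0/3000000"]      # two consumers, one lagging
print(oldest_segment_to_keep(slots))    # 3: the lagging slot pins segment 3
```

This is also why an abandoned slot is dangerous: its restart LSN never advances, so pg_wal/ grows without bound until the slot is dropped.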
PostgreSQL's extension system allows adding new data types, functions, operators, index methods, and languages without modifying core code. The hook architecture provides ~30 interception points for monitoring, security, and query transformation.
graph TD
subgraph Hooks["Hook Categories"]
QH["Query Hooks
post_parse_analyze
planner_hook
join_search_hook"]
EH["Executor Hooks
ExecutorStart/Run/
Finish/End"]
SH["Security Hooks
ClientAuthentication
object_access_hook"]
IH["Init Hooks
shmem_startup
emit_log_hook"]
end
subgraph Stack["Hook Stacking"]
E1["Extension A
_PG_init()"]
E2["Extension B
_PG_init()"]
NULL["NULL (default)"]
end
E2 -->|"saved prev"| E1
E1 -->|"saved prev"| NULL
style QH fill:#1C1C35,stroke:#7B4F9E,color:#E8E4F0
style EH fill:#1C1C35,stroke:#E6A817,color:#E8E4F0
style SH fill:#1C1C35,stroke:#A63D40,color:#E8E4F0
style IH fill:#1C1C35,stroke:#2A9D8F,color:#E8E4F0
style E1 fill:#252540,stroke:#5B8FB9,color:#E8E4F0
style E2 fill:#252540,stroke:#D4839E,color:#E8E4F0
style NULL fill:#252540,stroke:#B0A8C0,color:#E8E4F0
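The save-previous-and-delegate pattern in the diagram, sketched in Python (the hook and function names are illustrative; real hooks are C function pointers assigned in each extension's _PG_init):

```python
calls = []
executor_run_hook = None          # NULL: the standard executor runs

def standard_executor_run(query):
    calls.append("standard")

def extension_a_init():
    """Loaded first: saves NULL, becomes the innermost wrapper."""
    global executor_run_hook
    prev = executor_run_hook

    def hook_a(query):
        calls.append("A")
        (prev or standard_executor_run)(query)   # delegate down the chain
    executor_run_hook = hook_a

def extension_b_init():
    """Loaded second: saves hook_a, wraps it."""
    global executor_run_hook
    prev = executor_run_hook

    def hook_b(query):
        calls.append("B")
        (prev or standard_executor_run)(query)
    executor_run_hook = hook_b

extension_a_init()
extension_b_init()
(executor_run_hook or standard_executor_run)("SELECT 1")
print(calls)                      # ['B', 'A', 'standard']
```

Load order therefore determines nesting: the most recently loaded extension sees the call first, and each wrapper must remember to invoke its saved predecessor or it silently disables everything beneath it.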
**pg_stat_statements:** Uses all four executor hooks to track query elapsed time, call counts, and I/O statistics across all backends.
**TimescaleDB:** Uses planner_hook for hypertable chunk rewriting, post_parse_analyze_hook for query transformation, and ProcessUtility_hook for DDL interception.
**pgAudit:** Uses executor and utility hooks for comprehensive audit logging of all database operations, including DDL and DML.
**auto_explain:** Uses executor hooks to automatically log execution plans for statements exceeding a configurable duration threshold.
Beyond hooks, PostgreSQL offers custom access methods (CREATE ACCESS METHOD), custom data types with operator classes, Foreign Data Wrappers for external data sources, custom background workers, custom WAL resource managers (PG 15+), and pluggable storage engines via the Table Access Method API (PG 12+).