PostgreSQL Internals Deep Dive

Posted by ikouchiha47 on February 19, 2026

A wizard-level exploration of the PostgreSQL codebase, from SQL string to disk blocks and back.

How to Read This Book

Each chapter follows a zoom-in / zoom-out pattern:

  1. Chapter index — bird’s-eye overview, key concepts, how the subsystem fits into PG as a whole
  2. Topic pages — deep dives into specific mechanisms, with source file references (file:line), struct layouts, and diagrams
  3. Connections section at the bottom of every page — links back out to related subsystems

You can read linearly or jump to any topic. The dependency arrows in each chapter index will guide you.

Prerequisites

  • Comfortable reading C code
  • Basic understanding of operating systems (processes, virtual memory, file I/O)
  • Familiarity with SQL and relational databases
  • A cloned PostgreSQL source tree (this book references src/ paths throughout)

Acknowledgments

Built by studying the PostgreSQL source code and the READMEs in src/backend/*/README.

Table of Contents

Architecture

  1. Memory Layout
  2. Process Model
  3. Query Lifecycle

Storage Engine

  1. Async I/O
  2. Buffer Manager
  3. Free Space Map
  4. Page Layout
  5. smgr and Forks
  6. Visibility Map

Access Methods

  1. BRIN Index
  2. B-tree Index
  3. GIN Index
  4. GiST Index
  5. Hash Index
  6. Heap Access Method
  7. SP-GiST Index
  8. Table AM API

Transactions & MVCC

  1. CLOG and Subtransactions
  2. Isolation Levels
  3. MVCC and Tuple Versioning
  4. Snapshots
  5. Serializable Snapshot Isolation
  6. Two-Phase Commit

Write-Ahead Logging

  1. Checkpoints
  2. Recovery
  3. WAL for Extensions
  4. WAL Internals

Locking

  1. Deadlock Detection
  2. Heavyweight Locks
  3. Lightweight Locks
  4. Predicate Locks
  5. Spinlocks

Parsing & Rewriting

  1. Lexer & Parser
  2. Rewrite Rules
  3. Semantic Analysis

Query Optimizer

  1. Cost Model
  2. GEQO
  3. Join Ordering
  4. Path Generation
  5. Plan Creation
  6. Preprocessing

Executor

  1. Aggregation
  2. Expression Evaluation
  3. Join Nodes
  4. Parallel Query
  5. Scan Nodes
  6. Sort and Materialize
  7. Volcano Iterator Model

Caches

  1. Catalog Cache
  2. Invalidation
  3. Plan Cache
  4. Relation Cache
  5. Type Cache

Memory Management

  1. Dynamic Shared Areas
  2. Memory Contexts
  3. Resource Owners

IPC

  1. Latches and Wait Events
  2. Message Queues
  3. ProcArray and PGPROC
  4. Shared Memory

Replication

  1. Conflict Resolution
  2. Logical Replication
  3. Streaming Replication
  4. Synchronous Replication

Statistics & Monitoring

  1. Activity Monitoring
  2. Extended Statistics
  3. pg_statistic and Single-Column Statistics

Platform Layer

  1. Atomic Operations and Memory Barriers
  2. I/O Backends
  3. Portability and OS Abstraction
  4. SIMD, CRC, and Hardware Acceleration

Extensions

  1. Background Workers
  2. Custom Access Methods
  3. Foreign Data Wrappers
  4. Hooks