The hash index provides O(1) equality lookups using linear hashing, a
technique that expands the hash table incrementally (one bucket at a time)
rather than doubling all at once. Since PostgreSQL 10, hash indexes are
WAL-logged and crash-safe. They excel at single-column equality predicates
(WHERE col = value) but do not support range scans, ordering, or
multicolumn keys. The relevant source files:
| File | Purpose |
|---|---|
| src/backend/access/hash/hash.c | Entry points: hashhandler(), hashbuild(), hashinsert(), hashgettuple() |
| src/backend/access/hash/hashsearch.c | _hash_first(), _hash_next() – scan within a bucket |
| src/backend/access/hash/hashinsert.c | _hash_doinsert() – insert into the correct bucket |
| src/backend/access/hash/hashpage.c | Bucket management, split operations, page allocation |
| src/backend/access/hash/hashovfl.c | Overflow page allocation and management |
| src/backend/access/hash/hashsort.c | Bulk loading support using tuplesort |
| src/backend/access/hash/hashfunc.c | Built-in hash functions for standard types |
| src/backend/access/hash/hashutil.c | Hash value computation, bucket mapping |
| src/backend/access/hash/hash_xlog.c | WAL redo routines |
| src/backend/access/hash/README | Design document |
| src/include/access/hash.h | HashPageOpaqueData, HashMetaPageData, macros |
Unlike traditional hash tables that double in size, linear hashing splits one bucket at a time in round-robin order:

- Each round starts with 2^N buckets; hashm_lowmask (2^N - 1, the "low mask") maps a hash value into that range, while hashm_highmask (2^(N+1) - 1) maps into the range the table is growing toward.
- hashm_maxbucket tracks how far the current round has progressed, i.e. which buckets have been split.
- When bucket K splits, its tuples are redistributed between K and a new bucket at K + 2^N.
- Once all 2^N buckets of the round have been split, a new round begins with 2^(N+1) buckets.

Bucket mapping for a hash value h:
```c
bucket = h & highmask;
if (bucket > max_bucket)
    bucket = h & lowmask;
```
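A standalone sketch of this mapping rule (plain C, not PostgreSQL code; map_to_bucket and the mask values are invented for illustration):

```c
#include <stdint.h>
#include <stdio.h>

/* Mirror of the mapping rule above: try the larger (current-round) mask
 * first, and fall back to the smaller mask for buckets not yet split. */
static uint32_t map_to_bucket(uint32_t h, uint32_t maxbucket,
                              uint32_t highmask, uint32_t lowmask)
{
    uint32_t bucket = h & highmask;
    if (bucket > maxbucket)
        bucket = h & lowmask;
    return bucket;
}

int main(void)
{
    /* Mid-round state: the round started at 4 buckets (lowmask = 3) and is
     * growing toward 8 (highmask = 7); buckets 0..5 exist so far. */
    uint32_t lowmask = 3, highmask = 7, maxbucket = 5;

    for (uint32_t h = 0; h < 8; h++)
        printf("hash %u -> bucket %u\n", (unsigned) h,
               (unsigned) map_to_bucket(h, maxbucket, highmask, lowmask));
    /* Hashes 6 and 7 land in buckets 2 and 3: their target buckets 6 and 7
     * have not been created yet in this round. */
    return 0;
}
```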
The on-disk layout:

```
Meta page (block 0)
+-------------------------------------------+
| HashMetaPageData                          |
| hashm_nmaps, hashm_ntuples, hashm_ffactor |
| hashm_maxbucket, hashm_highmask,          |
| hashm_lowmask, hashm_spares[]             |
+-------------------------------------------+

Bitmap pages (blocks 1..N)
+-------------------------------------------+
| Track which overflow pages are in use     |
+-------------------------------------------+

Bucket pages
+-------------------------------------------+
| Primary bucket page for bucket K          |
|   -> overflow page -> overflow page -> ...|
+-------------------------------------------+
```
Each bucket has a primary page and zero or more overflow pages linked
through hasho_nextblkno in HashPageOpaqueData.
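The bitmap pages above are simply bit arrays: one bit per overflow page, set while the page is in use. A minimal sketch of that bookkeeping (plain C; BitmapPage, BITS_PER_MAP, and the helpers are invented for the demo and are not the hashovfl.c code):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BITS_PER_MAP 1024                /* bits tracked by one bitmap page */

typedef struct { uint32_t words[BITS_PER_MAP / 32]; } BitmapPage;

/* Lowest clear bit = first recyclable overflow page (cf. hashm_firstfree). */
static int find_first_free(const BitmapPage *bm)
{
    for (int bit = 0; bit < BITS_PER_MAP; bit++)
        if (!(bm->words[bit / 32] & (1u << (bit % 32))))
            return bit;
    return -1;                           /* every tracked page is in use */
}

static void set_bit(BitmapPage *bm, int bit)   { bm->words[bit / 32] |=  1u << (bit % 32); }
static void clear_bit(BitmapPage *bm, int bit) { bm->words[bit / 32] &= ~(1u << (bit % 32)); }

int main(void)
{
    BitmapPage bm;
    memset(&bm, 0, sizeof(bm));

    int p0 = find_first_free(&bm); set_bit(&bm, p0);   /* allocate page 0 */
    int p1 = find_first_free(&bm); set_bit(&bm, p1);   /* allocate page 1 */
    clear_bit(&bm, p0);                                /* release page 0 */
    printf("allocated %d and %d; next free is %d\n",
           p0, p1, find_first_free(&bm));              /* slot 0 is reused */
    return 0;
}
```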
The insert path:

```
_hash_doinsert(rel, itup)
  -> _hash_get_indextuple_hashkey()      // hash value carried in the tuple
  -> _hash_getbucketbuf_from_hashkey()   // map hash to bucket, lock primary page
  -> _hash_pgaddtup()                    // add tuple to bucket page
       if no space:
         -> _hash_addovflpage()          // allocate and link overflow page
  -> if too many tuples per bucket:
       -> _hash_expandtable()            // split one bucket (linear hashing step)
```
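A condensed simulation of that flow (plain C; Page, insert_hash, PAGE_CAPACITY, and the fill-factor value are invented stand-ins for _hash_pgaddtup(), _hash_addovflpage(), and the hashm_ffactor check):

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define PAGE_CAPACITY 4                  /* tuples per page, demo value */

typedef struct Page {
    uint32_t hashes[PAGE_CAPACITY];
    int      ntup;
    struct Page *next;                   /* stand-in for hasho_nextblkno */
} Page;

/* Walk the bucket's chain looking for room; chain a fresh overflow page
 * when every existing page is full. */
static void insert_hash(Page *bucket, uint32_t h)
{
    Page *pg = bucket;
    while (pg->ntup == PAGE_CAPACITY && pg->next != NULL)
        pg = pg->next;
    if (pg->ntup == PAGE_CAPACITY) {
        pg->next = calloc(1, sizeof(Page));  /* _hash_addovflpage() analogue */
        pg = pg->next;
    }
    pg->hashes[pg->ntup++] = h;              /* _hash_pgaddtup() analogue */
}

int main(void)
{
    Page   bucket   = {0};
    double ntuples  = 0;
    double ffactor  = 6;                 /* target tuples per bucket */
    int    nbuckets = 1;

    for (uint32_t h = 0; h < 10; h++) {
        insert_hash(&bucket, h);
        if (++ntuples / nbuckets > ffactor)      /* overfull on average: */
            printf("tuple %u: would call _hash_expandtable()\n", (unsigned) h);
    }
    printf("primary page holds %d tuples; overflow chain in use: %s\n",
           bucket.ntup, bucket.next ? "yes" : "no");
    return 0;
}
```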
The scan path:

```
_hash_first(scan)
  -> compute hash value from scan key
  -> map to bucket number
  -> lock and read primary bucket page
  -> scan all tuples on page, check for match
  -> follow overflow chain via hasho_nextblkno
_hash_next() continues through the overflow pages
```
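A sketch of that loop (plain C; scan_bucket and the Page mock-up are invented). Note that the index stores hash codes rather than key values, so apparent matches are rechecked against the heap, since distinct keys can hash to the same code:

```c
#include <stdint.h>
#include <stdio.h>

typedef struct Page {
    uint32_t hashes[4];                  /* stored hash codes, demo size */
    int      ntup;
    const struct Page *next;             /* hasho_nextblkno analogue */
} Page;

/* Count entries across the primary page and its overflow chain whose
 * stored hash code matches the probe (_hash_first/_hash_next analogue). */
static int scan_bucket(const Page *pg, uint32_t probe)
{
    int matches = 0;
    for (; pg != NULL; pg = pg->next)
        for (int i = 0; i < pg->ntup; i++)
            if (pg->hashes[i] == probe)
                matches++;               /* real scans recheck vs. the heap */
    return matches;
}

int main(void)
{
    Page ovfl    = { {0x7A3F, 0x1111}, 2, NULL };
    Page primary = { {0x7A3F, 0x2222, 0x3333}, 3, &ovfl };

    printf("matches for 0x7A3F: %d\n", scan_bucket(&primary, 0x7A3F)); /* 2 */
    return 0;
}
```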
```c
// src/include/access/hash.h
typedef struct HashMetaPageData
{
    uint32 hashm_magic;
    uint32 hashm_version;
    double hashm_ntuples;      // number of tuples in index
    uint16 hashm_ffactor;      // fill factor (target tuples per bucket)
    uint16 hashm_bsize;        // index page size (bytes)
    uint16 hashm_bmsize;       // bitmap array size (bytes)
    uint16 hashm_bmshift;      // log2 of bitmap array size in bits
    uint32 hashm_maxbucket;    // highest bucket number allocated
    uint32 hashm_highmask;     // mask for current round
    uint32 hashm_lowmask;      // mask for previous round
    uint32 hashm_ovflpoint;    // current overflow split point
    uint32 hashm_firstfree;    // first free overflow page
    uint32 hashm_nmaps;        // number of bitmap pages
    RegProcedure hashm_procid; // hash function OID
    uint32 hashm_spares[HASH_MAX_SPLITPOINTS]; // spare pages per split point
    BlockNumber hashm_mapp[HASH_MAX_BITMAPS];  // bitmap page block numbers
} HashMetaPageData;
```
```c
// src/include/access/hash.h
typedef struct HashPageOpaqueData
{
    BlockNumber hasho_prevblkno; // previous page in bucket chain
    BlockNumber hasho_nextblkno; // next page in bucket chain
    Bucket hasho_bucket;         // bucket number this page belongs to
    uint16 hasho_flag;           // LH_OVERFLOW_PAGE, LH_BUCKET_PAGE, LH_BITMAP_PAGE, LH_META_PAGE
    uint16 hasho_page_id;        // for identification (HASHO_PAGE_ID)
} HashPageOpaqueData;
```
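A quick illustration of reading hasho_flag to classify a page (the LH_* names come from hash.h; the bit values and the page_kind helper here are demo assumptions):

```c
#include <stdint.h>
#include <stdio.h>

#define LH_OVERFLOW_PAGE (1 << 0)        /* demo values */
#define LH_BUCKET_PAGE   (1 << 1)
#define LH_BITMAP_PAGE   (1 << 2)
#define LH_META_PAGE     (1 << 3)

/* Map a page's flag bits to a human-readable page type. */
static const char *page_kind(uint16_t hasho_flag)
{
    if (hasho_flag & LH_BUCKET_PAGE)   return "primary bucket page";
    if (hasho_flag & LH_OVERFLOW_PAGE) return "overflow page";
    if (hasho_flag & LH_BITMAP_PAGE)   return "bitmap page";
    if (hasho_flag & LH_META_PAGE)     return "meta page";
    return "unused page";
}

int main(void)
{
    printf("%s\n", page_kind(LH_BUCKET_PAGE));   /* primary bucket page */
    return 0;
}
```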
Putting it together – an example lookup:

```
hash(key) = 0x7A3F
        |
        v
bucket = 0x7A3F & highmask = 5
        |
        v
+------------------+     +------------------+     +------------------+
| Bucket page 5    | --> | Overflow page    | --> | Overflow page    |
| (primary)        |     |                  |     |                  |
| hasho_bucket=5   |     | hasho_bucket=5   |     | hasho_bucket=5   |
| hasho_flag=      |     | hasho_flag=      |     | hasho_nextblkno= |
|  LH_BUCKET_PAGE  |     |  LH_OVERFLOW     |     |  InvalidBlock    |
| entries...       |     | entries...       |     | entries...       |
+------------------+     +------------------+     +------------------+
```
When _hash_expandtable() is called:

1. hashm_maxbucket is incremented, allocating the next bucket in round-robin order.
2. The bucket chosen for splitting is scanned, and each tuple is re-mapped using hash & new_highmask: tuples whose new high bit is set move to the new bucket, the rest stay (see the sketch below).
3. If the new bucket count reaches a power of two, the masks advance and a new round begins.

This is a heavyweight operation that holds locks on the splitting bucket for the duration, but it only affects one bucket – all other buckets remain readable and writable.
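A sketch of the redistribution step (plain C; the mask and hash values are invented, modeling a split of bucket 1 while the table grows from 4 toward 8 buckets):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Splitting bucket 1: lowmask = 3 (round of 4), new highmask = 7,
     * so the new bucket is 1 + 4 = 5. */
    uint32_t old_bucket = 1, lowmask = 3, highmask = 7;
    uint32_t hashes[] = { 0x01, 0x05, 0x09, 0x0D, 0x11, 0x15 };

    for (int i = 0; i < 6; i++) {
        uint32_t h = hashes[i];
        if ((h & lowmask) != old_bucket)
            continue;                    /* not in the bucket being split */
        uint32_t dest = h & highmask;    /* either old_bucket or old + 4 */
        printf("hash 0x%02X -> bucket %u%s\n", (unsigned) h, (unsigned) dest,
               dest == old_bucket ? " (stays)" : " (moves)");
    }
    return 0;
}
```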
Hash indexes support only the = operator: no range scans, no ORDER BY, no pattern matching. They also cannot enforce uniqueness (amcanunique = false). Hash indexes are beneficial when a column is queried exclusively with = lookups.

Since PostgreSQL 10, hash indexes are crash-safe (WAL-logged). Before that, they were not WAL-logged and were rarely recommended.
hash_xlog.c provides redo support for inserts, splits, overflow page operations, and bitmap changes. Hash indexes register themselves through hashhandler(), advertise only the equality strategy, and are therefore considered by the planner for equality predicates. The planner uses hashcostestimate() for costing.