PostgreSQL runs each backend as an independent process with private caches. When one backend modifies a system catalog (via DDL or catalog DML), every other backend must eventually learn that its cached data may be stale. The shared invalidation (sinval) system provides this cross-backend coherence through a shared-memory message queue and a per-backend dispatch layer.
The invalidation system has two layers:
- inval.c: the per-backend invalidation dispatcher. It accumulates invalidation events during a transaction, processes them locally at command boundaries, broadcasts them to other backends on commit, and dispatches incoming messages to registered callback functions.
- sinval.c / sinvaladt.c: the shared invalidation queue in shared memory, a circular buffer that all backends write to (on commit) and read from (opportunistically). Backends that fall too far behind are signaled to catch up.
Together, these ensure that every backend’s caches eventually converge to a consistent view, while allowing reads to proceed without locking.
| File | Role |
|---|---|
| src/backend/utils/cache/inval.c | Per-backend invalidation dispatcher, callback management, transactional queuing |
| src/include/utils/inval.h | Public API for invalidation: CacheInvalidateHeapTuple(), callback registration |
| src/backend/storage/ipc/sinval.c | Send/receive shared invalidation messages |
| src/backend/storage/ipc/sinvaladt.c | Shared memory ring buffer implementation |
| src/include/storage/sinval.h | SharedInvalidationMessage union, message type definitions |
When a catalog tuple is inserted, updated, or deleted via CatalogTupleInsert(), CatalogTupleUpdate(), or CatalogTupleDelete(), the catalog access code calls CacheInvalidateHeapTuple(). This function:
1. Calls PrepareToInvalidateCacheTuple() from catcache.c, which determines which catcaches could contain entries for this tuple and computes the hash value for each.
2. Queues a SharedInvalCatcacheMsg for each affected catcache.
3. If the tuple belongs to pg_class, pg_attribute, pg_index, or pg_constraint (for foreign keys), also queues a SharedInvalRelcacheMsg for the affected relation.
4. Where a catalog snapshot could be affected, queues a SharedInvalSnapshotMsg.

These messages are not sent immediately. They are stored in per-transaction arrays in TopTransactionContext.
At CommandCounterIncrement(), CommandEndInvalidationMessages() is called. It:
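- Processes the pending messages locally, so that the backend's own caches reflect the catalog changes made by the command.
- Writes the messages to WAL (only when wal_level = logical, for logical decoding of in-progress transactions).

The messages remain in the transaction’s pending lists for eventual broadcast.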
At transaction commit, AtEOXact_Inval(true) sends all accumulated messages to the shared invalidation queue via SendSharedInvalidMessages(). On abort, AtEOXact_Inval(false) processes the messages locally (to undo any catalog state the aborted transaction loaded) but does not broadcast them.
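The queue, apply-locally, and broadcast-on-commit lifecycle described above can be sketched with a toy model. This is a minimal illustration, not the code in inval.c: every name in it (ToyMsg, ToyXactInval, toy_queue, toy_command_end, toy_at_eoxact) is hypothetical, and the two counters stand in for real cache invalidation and for SendSharedInvalidMessages().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_PENDING 32          /* hypothetical fixed capacity for the sketch */

/* Toy stand-in for one queued invalidation message. */
typedef struct ToyMsg
{
    int         cache_id;
    unsigned    hash;
} ToyMsg;

/* Toy per-transaction state: pending messages plus two observable counters. */
typedef struct ToyXactInval
{
    ToyMsg      pending[TOY_MAX_PENDING];
    size_t      npending;       /* messages queued so far in this transaction */
    size_t      nlocal_done;    /* prefix already applied to our own caches */
    size_t      local_applied;  /* count of local invalidations performed */
    size_t      broadcast;      /* count of messages sent to the shared queue */
} ToyXactInval;

/* A catalog change queues a message; nothing is sent yet. */
void
toy_queue(ToyXactInval *x, int cache_id, unsigned hash)
{
    assert(x->npending < TOY_MAX_PENDING);
    x->pending[x->npending].cache_id = cache_id;
    x->pending[x->npending].hash = hash;
    x->npending++;
}

/* Command boundary: apply new messages locally, but keep them queued. */
void
toy_command_end(ToyXactInval *x)
{
    x->local_applied += x->npending - x->nlocal_done;
    x->nlocal_done = x->npending;
}

/* End of transaction: broadcast on commit, apply locally only on abort. */
void
toy_at_eoxact(ToyXactInval *x, int is_commit)
{
    if (is_commit)
        x->broadcast += x->npending;
    else
        x->local_applied += x->npending;
    x->npending = 0;
    x->nlocal_done = 0;
}
```

In this model an aborted transaction never increments the broadcast counter, which mirrors the rule that other backends only hear about committed catalog changes.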
Other backends call AcceptInvalidationMessages() at safe points – typically at the start of each transaction, after acquiring locks, and during ProcessCatchupInterrupt(). This function calls ReceiveSharedInvalidMessages(), which reads messages from the shared queue and dispatches each one to LocalExecuteInvalidationMessage().
LocalExecuteInvalidationMessage() examines the message type and calls the appropriate handler:
| Message Type ID | Struct | Handler |
|---|---|---|
| >= 0 (catcache ID) | SharedInvalCatcacheMsg | SysCacheInvalidate(id, hashValue) |
| -1 (SHAREDINVALCATALOG_ID) | SharedInvalCatalogMsg | CatalogCacheFlushCatalog(catId) |
| -2 (SHAREDINVALRELCACHE_ID) | SharedInvalRelcacheMsg | RelationCacheInvalidateEntry(relId) |
| -3 (SHAREDINVALSMGR_ID) | SharedInvalSmgrMsg | smgrcloserellocator(rlocator) |
| -4 (SHAREDINVALRELMAP_ID) | SharedInvalRelmapMsg | RelationMapInvalidate(dbId) |
| -5 (SHAREDINVALSNAPSHOT_ID) | SharedInvalSnapshotMsg | InvalidateCatalogSnapshot() |
| -6 (SHAREDINVALRELSYNC_ID) | SharedInvalRelSyncMsg | CallRelSyncCallbacks(relid) |
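As part of handling a message, the dispatcher also fires any callbacks registered for that kind of invalidation (for example, the syscache callbacks for the affected catcache ID).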
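The dispatch on the message type ID in the table above can be sketched as follows. This is an illustrative model only: ToyHandler, toy_dispatch, and toy_dispatch_all are hypothetical names, and the function returns a tag for the handler family instead of calling into any real cache code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags, one per handler family listed in the table. */
typedef enum ToyHandler
{
    TOY_CATCACHE,               /* id >= 0 */
    TOY_CATALOG,                /* id == -1 */
    TOY_RELCACHE,               /* id == -2 */
    TOY_SMGR,                   /* id == -3 */
    TOY_RELMAP,                 /* id == -4 */
    TOY_SNAPSHOT,               /* id == -5 */
    TOY_RELSYNC,                /* id == -6 */
    TOY_UNKNOWN                 /* anything else */
} ToyHandler;

/* Map a message type ID to the handler family that would process it. */
ToyHandler
toy_dispatch(int8_t id)
{
    if (id >= 0)
        return TOY_CATCACHE;    /* non-negative IDs are catcache IDs */

    switch (id)
    {
        case -1: return TOY_CATALOG;
        case -2: return TOY_RELCACHE;
        case -3: return TOY_SMGR;
        case -4: return TOY_RELMAP;
        case -5: return TOY_SNAPSHOT;
        case -6: return TOY_RELSYNC;
        default: return TOY_UNKNOWN;
    }
}

/* Tally how many messages in a batch go to each handler family. */
void
toy_dispatch_all(const int8_t *ids, int n, int counts[TOY_UNKNOWN + 1])
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        counts[toy_dispatch(ids[i])]++;
}
```

Because every non-negative ID is a catcache ID, adding a new catalog cache needs no new message type, while each negative ID selects one special-purpose handler.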
```mermaid
sequenceDiagram
    participant BackendA as Backend A
    participant InvalA as inval.c (A)
    participant SinvalQ as Sinval Queue<br/>(shared memory)
    participant InvalB as inval.c (B)
    participant BackendB as Backend B
    BackendA->>InvalA: ALTER TABLE foo ...
    InvalA->>InvalA: Queue catcache + relcache msgs
    BackendA->>InvalA: CommandCounterIncrement()
    InvalA->>InvalA: Process locally (own caches)
    BackendA->>InvalA: COMMIT
    InvalA->>SinvalQ: SendSharedInvalidMessages()
    Note over BackendB: Next transaction start
    BackendB->>InvalB: AcceptInvalidationMessages()
    InvalB->>SinvalQ: ReceiveSharedInvalidMessages()
    SinvalQ-->>InvalB: Messages for Backend B
    InvalB->>InvalB: LocalExecuteInvalidationMessage()
    InvalB->>InvalB: Fire syscache/relcache callbacks
    InvalB->>BackendB: Caches updated
```
The queue is a circular buffer in shared memory, sized at startup. Each backend maintains a read pointer (nextMsgNum) tracking how far it has read. Key properties:
- When a backend falls too far behind, sinvaladt.c sends a PROCSIG_CATCHUP_INTERRUPT signal to the lagging backend. The handler sets catchupInterruptPending = true, and at the next safe point the backend processes all pending messages.
- A backend whose unread messages can no longer be kept in the queue is reset instead: it runs the resetFunction callback, which calls InvalidateSystemCaches() to flush everything.

A backend stuck in a long-running query (or idle in transaction) prevents the queue from being cleaned up past its read pointer. This can fill the queue or, in extreme cases, trigger PROCSIG_CATCHUP_INTERRUPT processing that interrupts client I/O.
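The interaction between the circular buffer, the per-backend read pointers, and the reset path can be sketched with a small single-process model. It is a simplification under stated assumptions: ToyQueue, toy_send, and toy_receive are hypothetical names, there is no locking or signaling, and a lagging reader is marked for reset at the moment a writer overwrites its unread data, which is not necessarily when sinvaladt.c does it.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_QUEUE_SIZE  8       /* hypothetical size; fixed once, like the real queue */
#define TOY_MAX_READERS 4       /* hypothetical number of backends */

typedef struct ToyQueue
{
    int         slots[TOY_QUEUE_SIZE];
    unsigned    max_msg_num;                    /* next message number to write */
    unsigned    next_msg_num[TOY_MAX_READERS];  /* per-reader read pointer */
    bool        reset[TOY_MAX_READERS];         /* reader must flush all caches */
} ToyQueue;

/* Append one message; readers that lose unread messages are marked for reset. */
void
toy_send(ToyQueue *q, int payload)
{
    q->slots[q->max_msg_num % TOY_QUEUE_SIZE] = payload;
    q->max_msg_num++;

    for (int r = 0; r < TOY_MAX_READERS; r++)
    {
        if (q->max_msg_num - q->next_msg_num[r] > TOY_QUEUE_SIZE)
        {
            q->reset[r] = true;
            q->next_msg_num[r] = q->max_msg_num;
        }
    }
}

/*
 * Read up to "max" messages for reader r into "out".
 * Returns the number read, or -1 if the reader was reset and must
 * discard all of its caches instead of applying individual messages.
 */
int
toy_receive(ToyQueue *q, int r, int *out, int max)
{
    int         n = 0;

    assert(r >= 0 && r < TOY_MAX_READERS);

    if (q->reset[r])
    {
        q->reset[r] = false;
        return -1;
    }

    while (n < max && q->next_msg_num[r] < q->max_msg_num)
    {
        out[n++] = q->slots[q->next_msg_num[r] % TOY_QUEUE_SIZE];
        q->next_msg_num[r]++;
    }
    return n;
}
```

The -1 return plays the role of the resetFunction path: once messages are lost, the only safe recovery is to assume every cached entry is stale.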
Higher-level caches (plan cache, type cache, etc.) register callbacks to be notified of invalidation events:
```c
/* Register a per-catcache callback */
CacheRegisterSyscacheCallback(PROCOID, PlanCacheObjectCallback, (Datum) 0);

/* Register a relcache callback */
CacheRegisterRelcacheCallback(PlanCacheRelCallback, (Datum) 0);
```
When inval.c processes a catcache invalidation for PROCOID, it calls CallSyscacheCallbacks(PROCOID, hashvalue), which iterates over all registered callbacks for that cache ID.
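The callback tables are fixed-size arrays: inval.c defines MAX_SYSCACHE_CALLBACKS as 64 and MAX_RELCACHE_CALLBACKS as 10, and registering more callbacks than that is a fatal error. These limits suffice because callbacks are registered once per subsystem, not per cached entry.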
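A fixed-size callback table of this kind can be sketched as below. The names (ToyRegistry, toy_register, toy_call_callbacks, toy_count_callback) and the limit of 4 are hypothetical, and the sketch returns -1 when the table is full where the real code reports an error.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_CALLBACKS 4     /* hypothetical small limit for the sketch */

typedef void (*ToySyscacheCallback) (void *arg, int cache_id, unsigned hash);

typedef struct ToyCallbackEntry
{
    int                 cache_id;   /* which cache this callback watches */
    ToySyscacheCallback fn;
    void               *arg;        /* opaque argument passed back to fn */
} ToyCallbackEntry;

typedef struct ToyRegistry
{
    ToyCallbackEntry entries[TOY_MAX_CALLBACKS];
    int              count;
} ToyRegistry;

/* Register a callback; returns 0 on success, -1 if the table is full. */
int
toy_register(ToyRegistry *reg, int cache_id, ToySyscacheCallback fn, void *arg)
{
    assert(fn != NULL);
    if (reg->count >= TOY_MAX_CALLBACKS)
        return -1;
    reg->entries[reg->count].cache_id = cache_id;
    reg->entries[reg->count].fn = fn;
    reg->entries[reg->count].arg = arg;
    reg->count++;
    return 0;
}

/* Invoke every callback registered for cache_id; returns how many ran. */
int
toy_call_callbacks(const ToyRegistry *reg, int cache_id, unsigned hash)
{
    int         called = 0;

    for (int i = 0; i < reg->count; i++)
    {
        if (reg->entries[i].cache_id == cache_id)
        {
            reg->entries[i].fn(reg->entries[i].arg, cache_id, hash);
            called++;
        }
    }
    return called;
}

/* Example callback: counts notifications through its argument pointer. */
void
toy_count_callback(void *arg, int cache_id, unsigned hash)
{
    (void) cache_id;
    (void) hash;
    (*(int *) arg)++;
}
```

A higher-level cache would register one such callback at initialization and use the hash value to decide which of its own entries to discard.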
A union type that fits all message variants in a compact format:
```text
SharedInvalidationMessage (union)
 +-- id                 message type (first byte, shared by all variants)
 |
 +-- cc (SharedInvalCatcacheMsg)
 |     id >= 0          catcache ID
 |     dbId             database OID (0 for shared catalogs)
 |     hashValue        hash of the invalidated tuple's keys
 |
 +-- cat (SharedInvalCatalogMsg)
 |     id = -1          flush entire catalog
 |     dbId, catId      which catalog in which database
 |
 +-- rc (SharedInvalRelcacheMsg)
 |     id = -2          relcache invalidation
 |     dbId, relId      which relation (0 = all)
 |
 +-- sm (SharedInvalSmgrMsg)
 |     id = -3          storage manager file invalidation
 |     rlocator         RelFileLocator
 |
 +-- rm (SharedInvalRelmapMsg)
 |     id = -4          relation map invalidation
 |     dbId
 |
 +-- sn (SharedInvalSnapshotMsg)
 |     id = -5          snapshot invalidation
 |     dbId, relId
 |
 +-- rs (SharedInvalRelSyncMsg)
       id = -6          replication relation sync
       dbId, relid
```
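The "shared first byte" layout can be illustrated with a reduced union that carries only two of the variants. This is a sketch with hypothetical Toy* names and plain uint32_t fields standing in for OIDs. The actual definitions live in src/include/storage/sinval.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two simplified variants; both begin with the one-byte type ID. */
typedef struct ToyCatcacheMsg
{
    int8_t      id;             /* >= 0: catcache ID */
    uint32_t    dbId;           /* database, 0 for shared catalogs */
    uint32_t    hashValue;      /* hash of the invalidated tuple's keys */
} ToyCatcacheMsg;

typedef struct ToyRelcacheMsg
{
    int8_t      id;             /* -2: relcache invalidation */
    uint32_t    dbId;
    uint32_t    relId;          /* 0 means all relations */
} ToyRelcacheMsg;

typedef union ToyInvalMsg
{
    int8_t          id;         /* readable without knowing the variant */
    ToyCatcacheMsg  cc;
    ToyRelcacheMsg  rc;
} ToyInvalMsg;

/* Build a catcache message; the cache ID must fit in the signed byte. */
ToyInvalMsg
toy_make_catcache_msg(int cache_id, uint32_t dbId, uint32_t hashValue)
{
    ToyInvalMsg m;

    assert(cache_id >= 0 && cache_id <= INT8_MAX);
    memset(&m, 0, sizeof(m));   /* zero padding bytes too */
    m.cc.id = (int8_t) cache_id;
    m.cc.dbId = dbId;
    m.cc.hashValue = hashValue;
    return m;
}

/* Build a relcache message with the fixed type ID -2. */
ToyInvalMsg
toy_make_relcache_msg(uint32_t dbId, uint32_t relId)
{
    ToyInvalMsg m;

    memset(&m, 0, sizeof(m));
    m.rc.id = -2;
    m.rc.dbId = dbId;
    m.rc.relId = relId;
    return m;
}

/* The receiver inspects only the first byte to choose how to decode. */
int
toy_is_catcache(const ToyInvalMsg *m)
{
    return m->id >= 0;
}
```

Since every variant starts with the same one-byte field, a receiver can read id from the union first and then pick the matching member.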
Per-subtransaction control structure that tracks ranges of messages in the pending arrays:
```text
TransInvalidationInfo
 +-- parent               enclosing subtransaction's info
 +-- catcache messages    range [start..end) in the catcache msg array
 +-- relcache messages    range [start..end) in the relcache msg array
```
On subtransaction commit, the child’s ranges are absorbed into the parent. On abort, the child’s messages are processed locally and discarded.
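The range bookkeeping for subtransactions can be sketched as follows. This is a simplified model with hypothetical names (ToyRange, ToySubxactInfo, toy_subxact_start, toy_subxact_commit, toy_subxact_abort). It tracks one range per message array, whereas the real structure carries more state.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [start, end) into a shared message array. */
typedef struct ToyRange
{
    int         start;
    int         end;
} ToyRange;

typedef struct ToySubxactInfo
{
    struct ToySubxactInfo *parent;  /* enclosing (sub)transaction, or NULL */
    ToyRange    catcache;
    ToyRange    relcache;
} ToySubxactInfo;

/* Start a child: its ranges begin empty, right where the parent's end. */
void
toy_subxact_start(ToySubxactInfo *child, ToySubxactInfo *parent)
{
    child->parent = parent;
    child->catcache.start = child->catcache.end = parent->catcache.end;
    child->relcache.start = child->relcache.end = parent->relcache.end;
}

/* Subcommit: the parent's ranges grow to cover the child's messages. */
void
toy_subxact_commit(ToySubxactInfo *child)
{
    assert(child->parent != NULL);
    assert(child->catcache.start == child->parent->catcache.end);
    assert(child->relcache.start == child->parent->relcache.end);
    child->parent->catcache.end = child->catcache.end;
    child->parent->relcache.end = child->relcache.end;
}

/*
 * Subabort: the child's messages are dropped from the pending set.
 * Returns how many messages the caller should process locally first.
 */
int
toy_subxact_abort(ToySubxactInfo *child)
{
    int         n = (child->catcache.end - child->catcache.start) +
                    (child->relcache.end - child->relcache.start);

    child->catcache.end = child->catcache.start;
    child->relcache.end = child->relcache.start;
    return n;
}
```

Because a child's range always starts where its parent's range ends, absorbing the child on commit is only an update of the parent's end index, with no copying of messages.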
Most invalidation messages are transactional – queued during the transaction and only broadcast on commit. Two exceptions:
- Smgr invalidation (CacheInvalidateSmgr): Sent immediately when a physical file is created or removed. Other backends must stop caching file descriptors for the old file.
- Relmap invalidation (CacheInvalidateRelmap): Sent immediately when the pg_filenode.map is updated. Required because relmap changes happen outside normal catalog update paths.

These non-transactional messages use SendSharedInvalidMessages() directly, bypassing the per-transaction queuing.
Some catalog updates (e.g., updating pg_class.relfrozenxid during VACUUM) are done as inplace heap updates that do not go through normal transactional machinery. The invalidation system handles these via PreInplace_Inval(), AtInplace_Inval(), and ForgetInplace_Inval(), which manage a separate invalidation context that sends messages immediately within the inplace update’s critical section.
When compiled with DISCARD_CACHES_ENABLED (assert-enabled builds), the debug_discard_caches GUC causes all caches to be flushed at every possible invalidation point. This is useful for testing that code handles cache invalidation correctly, but it has an extreme performance impact. Supported values:
| Value | Behavior |
|---|---|
| 0 | Normal (default in production builds) |
| 1 | Discard caches at every AcceptInvalidationMessages() |
| 3 | Also discard caches recursively during cache rebuilds |
| 5 | Maximum aggressiveness |
- Catcache messages: CatCacheInvalidate() is called for each matching catcache.
- Relcache messages: handled by RelationCacheInvalidateEntry(). Init file deletion is coordinated here.
- The type cache registers callbacks for pg_type and pg_opclass changes.
- DDL relies on AccessExclusiveLock to ensure invalidation is processed before any concurrent backend can access the modified object.
- PROCSIG_CATCHUP_INTERRUPT uses the signal infrastructure.