PostgreSQL performance issues often surface long before the system fully breaks: queries slow down, updates begin timing out, and background processes lag under load. That’s exactly what we encountered with a transaction-processing table that had quietly grown into a 123GB monolith. Over time, this single table became the central bottleneck for updates, categorization workflows, and downstream services.
The problem wasn’t inefficient code or missing indexes; it was the mismatch between a massive, write-heavy table and an access pattern dominated by tenant-specific traffic and frequent point lookups on product_account_id and ref_id. Vertical scaling offered only temporary relief because the underlying structure remained unchanged.
To restore predictable performance, we turned to PostgreSQL’s native partitioning features, using LIST partitioning to isolate the high-volume tenant and HASH partitioning to distribute ref_id-based writes. This redesign aligned the physical table layout with real workload patterns, similar to how we approach high-performance backend systems at Procedure, all without modifying any application queries.
What followed was a staged redesign of the table - one that combined native PostgreSQL features with careful operational planning to deliver a safer, more scalable architecture.
When PostgreSQL Tables Become Bottlenecks
As the system scaled, we discovered that nearly 113GB of the 123GB table belonged to a single tenant - a 92% skew. With most activity concentrated in this segment, lock pressure increased, and previously fast queries began slowing.
This skew created cascading issues: ref_id-based updates queued behind locks, background processes lagged, and smaller tenants faced spillover latency. The issue wasn’t storage capacity at all; it was structural. The table’s physical layout no longer matched how the system actually accessed data, which is where partitioning became the clear next step.
The path forward was clear: the table needed physical separation that matched real access patterns, allowing locks, scans, and updates to operate on isolated surfaces rather than a single shared structure.
Choosing the Right Partitioning Strategy
The next step was choosing a partitioning strategy that aligned with real query behavior while keeping the application queries unchanged - PostgreSQL needed to handle all routing and pruning internally.
We evaluated several approaches. Time-based partitioning didn’t fit because access patterns relied on point lookups. App-level sharding required significant code changes. And keeping a single large table, even with improved indexing, would not resolve lock contention.
Query patterns made the direction clear: most operations filtered on product_account_id, and nearly all updates depended on ref_id. This naturally mapped to a two-level structure: LIST partitioning to isolate the high-volume tenant and HASH partitioning to distribute ref_id-driven writes within that tenant.
Range partitioning was evaluated as well, but it offered no practical benefit because the workload lacked any continuous dimension - queries were nearly always equality-based.
This design aligned the physical layout with how the system actually processed data, enabling aggressive pruning, reducing lock contention, and improving update and lookup performance without altering any upstream queries.
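To make the structure concrete, here is a minimal sketch of the two-level layout. The table name, column types, and tenant id are illustrative assumptions; only product_account_id, ref_id, and the eight hash buckets come from the actual design.

```sql
-- Illustrative two-level layout: LIST on tenant, HASH on ref_id.
-- Table name, column types, and the tenant id (1001) are hypothetical.
CREATE TABLE transactions (
    id                 bigint      NOT NULL,
    product_account_id bigint      NOT NULL,
    ref_id             text        NOT NULL,
    payload            jsonb,
    created_at         timestamptz NOT NULL DEFAULT now(),
    -- Unique constraints on partitioned tables must include the
    -- partition key columns at every level, hence the composite key.
    PRIMARY KEY (id, product_account_id, ref_id)
) PARTITION BY LIST (product_account_id);

-- Level 1: isolate the high-volume tenant in its own LIST partition,
-- which is itself sub-partitioned by HASH on ref_id.
CREATE TABLE transactions_tenant_big
    PARTITION OF transactions
    FOR VALUES IN (1001)
    PARTITION BY HASH (ref_id);

-- Level 2: one of the eight hash buckets that spread ref_id writes.
CREATE TABLE transactions_tenant_big_h0
    PARTITION OF transactions_tenant_big
    FOR VALUES WITH (MODULUS 8, REMAINDER 0);
-- ...buckets h1 through h7 follow the same pattern...

-- All remaining tenants land in a default partition.
CREATE TABLE transactions_default
    PARTITION OF transactions DEFAULT;
```

With this shape, a query that filters on both product_account_id and ref_id can be pruned all the way down to a single leaf table.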
Implementing Tenant Isolation in PostgreSQL Safely
Designing the partitioning approach was only half the solution; the real complexity lay in implementing it safely on a live, high-traffic system. Because the application layer couldn’t change, the migration had to rely entirely on PostgreSQL’s internal routing, pruning, and constraint guarantees. This meant building the new structure in parallel, validating it thoroughly, and performing a cutover with a reversible plan. The work broke down into four stages:
- Prepare the schema for partitioning
- Build the new hierarchy alongside the existing table
- Validate pruning and distribution with real data slices (see the EXPLAIN sketch after this list)
- Plan for a reversible, low-risk cutover
This staged, safety-first approach allowed us to adopt LIST + HASH partitioning without downtime, without query changes, and with full control over failure scenarios.
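For the validation stage, a pruning check looked something like the sketch below. The table, tenant id, ref_id value, and payload column continue the illustrative names used earlier, and the test update is rolled back so it only exercises routing and the planner.

```sql
-- Confirm pruning before cutover: the plan should touch exactly one
-- leaf partition (e.g. transactions_tenant_big_h3), not all of them.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
UPDATE transactions
   SET payload = payload || '{"status": "categorized"}'::jsonb
 WHERE product_account_id = 1001      -- hypothetical tenant id
   AND ref_id = 'txn-8842-ab';        -- hypothetical ref_id
ROLLBACK;  -- validation only; never keep the test write

-- Distribution check: rows should spread roughly evenly across buckets.
SELECT tableoid::regclass AS partition, count(*) AS row_count
  FROM transactions_tenant_big
 GROUP BY 1
 ORDER BY 1;
```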
Tangible Performance Gains After Partitioning
The new LIST + HASH structure produced immediate, measurable improvements by isolating the high-volume tenant and distributing ref_id operations across eight hash buckets. PostgreSQL could finally operate on smaller, independent segments instead of one overloaded structure. These improvements were measured under real production load, not synthetic tests, making the gains directly attributable to partition pruning and reduced contention.
1. Update operations dropped from 30-60 seconds to 5-7 seconds
Before partitioning, updates stalled behind lock queues and long index scans. With pruning, each update targeted only the correct tenant partition and hash bucket, dramatically reducing workload.
2. Point lookups became nearly 10× faster
Lookups on ref_id no longer scanned large index structures; each request touched only one of the eight sub-partitions, improving cache locality and reducing index depth.
3. Lock contention flattened across the system
Heavy activity from the largest tenant no longer cascaded across other workloads. LIST isolation and hash distribution spread writes naturally across independent partitions.
4. Smaller tenants regained predictable performance
Their queries no longer competed with the high-volume tenant, restoring stability during peak load.
5. No application changes were required
The application continued using the same queries, with PostgreSQL handling all routing and pruning internally.
The result was a shift from a single point of friction to a scalable, predictable structure ready for continued growth.
PostgreSQL Partitioning Best Practices and Lessons Learned
Treat partitioning as an architectural adjustment, not a quick optimization. The gains here came from aligning the table’s structure with actual access patterns and following a disciplined, metrics-driven implementation. These principles guided the success of the LIST + HASH design.
1. Partition based on the columns your queries actually filter on
Use the keys your workload consistently filters on; if queries rely on composite semantics, ensure the partition key reflects the primary filtering dimension rather than the entire index definition. Equality-based predicates enable pruning and prevent unnecessary scans.
2. Start with a practical number of hash buckets
Beginning with 4-8 hash partitions is enough for most write-heavy workloads. Eight provided meaningful distribution without adding operational overhead.
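As an illustration, the buckets from the earlier sketch can be generated in a loop so the modulus sits in one place; note that the bucket count is fixed at creation time, so changing it later means rebuilding the hash partitions and moving data.

```sql
-- Generate the eight hash buckets for the illustrative tenant partition.
-- The MODULUS is fixed once the buckets exist; changing it later means
-- creating a new set of partitions and repopulating them.
DO $$
BEGIN
    FOR i IN 0..7 LOOP
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS transactions_tenant_big_h%s
                 PARTITION OF transactions_tenant_big
                 FOR VALUES WITH (MODULUS 8, REMAINDER %s)',
            i, i);
    END LOOP;
END
$$;
```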
3. Guarantee consistent and predictable pruning
Pruning drives most of the performance wins. Ensure:
• Non-null partition keys
• Consistent query predicates
• Equality-based filters
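A quick way to see the difference, again using the illustrative names from earlier: the first query can be pruned to a single leaf, while wrapping the key in an expression hides it from the planner.

```sql
-- Prunes: equality on the raw partition keys narrows the plan to one leaf.
EXPLAIN SELECT * FROM transactions
 WHERE product_account_id = 1001 AND ref_id = 'txn-8842-ab';

-- Prunes only to the tenant: the expression on ref_id hides the hash key,
-- so all eight buckets under that tenant must be scanned.
EXPLAIN SELECT * FROM transactions
 WHERE product_account_id = 1001 AND upper(ref_id) = 'TXN-8842-AB';
```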
4. Promote tenants before they become bottlenecks
Move high-volume tenants into dedicated partitions once they cross thresholds for data size, throughput, or latency.
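A simplified promotion sketch, assuming the tenant (id 2042 here, hypothetical) currently lives in the default partition; on a live system each step would be staged and the row move batched rather than done in one transaction.

```sql
-- Promote a grown tenant out of the default partition into its own LIST
-- partition. Simplified: a production run would batch the copy/delete.
BEGIN;
ALTER TABLE transactions DETACH PARTITION transactions_default;

CREATE TABLE transactions_tenant_2042
    PARTITION OF transactions
    FOR VALUES IN (2042);

INSERT INTO transactions_tenant_2042
SELECT * FROM transactions_default
 WHERE product_account_id = 2042;

DELETE FROM transactions_default
 WHERE product_account_id = 2042;

ALTER TABLE transactions ATTACH PARTITION transactions_default DEFAULT;
COMMIT;
```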
5. Plan for the operational footprint of large-table migrations
Expect temporary storage requirements near 2× the table’s size to accommodate staging and per-partition indexes. Index overhead alone can contribute 30–40% of a partition’s footprint.
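A quick sizing query helps budget that footprint before starting; the table name is the illustrative one used throughout.

```sql
-- Heap vs. index footprint of the pre-migration table; the index share
-- is what typically drives the 30-40% overhead mentioned above.
SELECT pg_size_pretty(pg_table_size('transactions'))          AS heap_size,
       pg_size_pretty(pg_indexes_size('transactions'))        AS index_size,
       pg_size_pretty(pg_total_relation_size('transactions')) AS total_size;
```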
6. Monitor partitions as independent units
Each child table maintains its own statistics and growth patterns. Regular ANALYZE, VACUUM, and index checks keep performance predictable.
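A monitoring sketch along these lines, again with the illustrative partition names, keeps an eye on dead tuples and maintenance activity per bucket.

```sql
-- Per-partition health: dead tuples and last maintenance runs.
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
  FROM pg_stat_user_tables
 WHERE relname LIKE 'transactions_tenant_big_h%'
 ORDER BY n_dead_tup DESC;

-- Statistics are kept per child table, so target stragglers directly.
ANALYZE transactions_tenant_big_h0;
```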
7. Avoid unnecessary complexity early on
Use only the layers required by your workload. LIST + HASH provided the needed balance without adding long-term maintenance burden.
Partitioning is most effective when it amplifies the access patterns you already have, rather than forcing new ones.
Why PostgreSQL Partitioning Becomes Essential at Scale
This migration underscored a clear principle: predictable performance depends on a schema that reflects real workload patterns. Partitioning realigned the table with how the application actually operated, replacing a fragile bottleneck with a stable, scalable foundation.
By isolating the high-volume tenant and distributing ref_id activity across smaller partition units, PostgreSQL delivered consistent behavior under load without requiring any application changes.
For teams managing large or unevenly growing datasets, this approach provides a practical path forward. With the right partition keys, an appropriate number of buckets, and a safe migration plan, PostgreSQL partitioning becomes an architectural upgrade, not just a performance optimization.
As data volume and tenant diversity grow, these architectural decisions compound, turning partitioning from a tactical fix into a long-term performance advantage.
If you found this post valuable, I’d love to hear your thoughts. Let’s connect and continue the conversation on LinkedIn.
Sreeraj Rajan
SDE4
Sreeraj Rajan is a Software Engineer with a strong interest in SRE and DevOps, and over 6 years of experience building and operating production systems. He has built and deployed services in Python, Java, and TypeScript, and worked on setting up a HIPAA-compliant Kubernetes cluster using Terraform. Sreeraj enjoys working across full-stack and infrastructure domains, with a focus on reliability, scalability, and operational excellence.



