Software Engineering & Digital Products for Global Enterprises since 2006
CMMi Level 3SOC 2ISO 27001
Menu
View all services
Staff Augmentation
Embed senior engineers in your team within weeks.
Dedicated Teams
A ring-fenced squad with PM, leads, and engineers.
Build-Operate-Transfer
We hire, run, and transfer the team to you.
Contract-to-Hire
Try the talent. Convert when you're ready.
ForceHQ
Skill testing, interviews and ranking — powered by AI.
RoboRingo
Build, deploy and monitor voice agents without code.
MailGovern
Policy, retention and compliance for enterprise email.
Vishing
Test and train staff against AI-driven voice attacks.
CyberForceHQ
Continuous, adaptive security training for every team.
IDS Load Balancer
Built for Multi Instance InDesign Server, to distribute jobs.
AutoVAPT.ai
AI agent for continuous, automated vulnerability and penetration testing.
Salesforce + InDesign Connector
Bridge Salesforce data into InDesign to design print catalogues at scale.
View all solutions
Banking, Financial Services & Insurance
Cloud, digital and legacy modernisation across financial entities.
Healthcare
Clinical platforms, patient engagement, and connected medical devices.
Pharma & Life Sciences
Trial systems, regulatory data, and field-force enablement.
Professional Services & Education
Workflow automation, learning platforms, and consulting tooling.
Media & Entertainment
AI video processing, OTT platforms, and content workflows.
Technology & SaaS
Product engineering, integrations, and scale for tech companies.
Retail & eCommerce
Shopify, print catalogues, web-to-print, and order automation.
View all industries
Blog
Engineering notes, opinions, and field reports.
Case Studies
How clients shipped — outcomes, stack, lessons.
White Papers
Deep-dives on AI, talent models, and platforms.
Portfolio
Selected work across industries.
View all resources
About Us
Who we are, our story, and what drives us.
Co-Innovation
How we partner to build new products together.
Careers
Open roles and what it's like to work here.
News
Press, announcements, and industry updates.
Leadership
The people steering MetaDesign.
Locations
Gurugram, Brisbane, Detroit and beyond.
Contact Us
Talk to sales, hiring, or partnerships.
Request TalentStart a Project
Software Engineering

RDBMS Database — Performance Improvement of SQL Queries

AG
Amit Gupta
Founder & CEO
January 16, 2025
10 min read
RDBMS Database — Performance Improvement of SQL Queries — Software Engineering | MetaDesign Solutions

Introduction: Why SQL Query Performance Is an Architectural Concern

SQL query performance isn't just a DBA concern — it's an architectural responsibility that impacts application latency, infrastructure costs, user experience, and system scalability. A single poorly optimised query executing 10,000 times per day can consume more CPU than the rest of the application combined, creating cascading bottlenecks across connection pools, application threads, and downstream services.

Modern RDBMS engines (PostgreSQL 17, MySQL 9, SQL Server 2025) include sophisticated query planners, adaptive execution, and JIT compilation — but these optimisations only work when developers provide the right signals through proper indexing, schema design, and query structure. This guide covers the complete SQL performance toolkit — from EXPLAIN ANALYZE interpretation through advanced indexing strategies, join optimisation, partitioning, caching, and production monitoring.

EXPLAIN ANALYZE: Reading Query Execution Plans

Every performance investigation begins with understanding what the query planner decided:

  • EXPLAIN vs EXPLAIN ANALYZE: EXPLAIN shows the planner's estimated execution plan without running the query. EXPLAIN ANALYZE actually executes the query and shows real timing, row counts, and loop iterations — essential for identifying estimation errors where the planner's predicted rows differ significantly from actual rows.
  • Sequential Scan vs Index Scan: Sequential scans read every row in the table — acceptable for small tables (<10K rows) or queries returning >15% of rows. Index scans use B-tree traversal for targeted row access. When EXPLAIN shows a Seq Scan on a large table with a selective WHERE clause, a missing index is likely the cause.
  • Nested Loop vs Hash Join vs Merge Join: The planner selects join algorithms based on table sizes, available indexes, and memory. Nested loops suit small-to-large table joins with indexes; hash joins suit unsorted equi-joins on larger tables; merge joins suit pre-sorted data. Unexpected join type selection often indicates stale statistics.
  • Sort and Materialize: External sorts (Sort Method: external merge) indicate work_mem is insufficient — the query spills to disk. Increase work_mem for the session or globally. Materialize nodes indicate subquery results are being cached — sometimes beneficial, sometimes a sign that the query should be restructured.
  • Buffers and I/O: Use EXPLAIN (ANALYZE, BUFFERS) to see shared buffer hits vs reads — a low hit ratio (<95%) indicates the buffer cache is undersized or the query accesses too many pages. Track temp read/written for disk-spilling operations.

Advanced Indexing: B-tree, GIN, BRIN, and Partial Indexes

Indexes are the single most impactful performance tool — but index selection requires strategy:

  • B-tree Indexes: The default index type for equality and range queries on ordered data. Create composite indexes following the leftmost prefix ruleCREATE INDEX ON orders (customer_id, created_at) supports queries filtering on customer_id alone or customer_id + created_at, but not created_at alone. Column order matters: place equality columns first, then range columns.
  • GIN Indexes (Generalised Inverted): Optimal for JSONB columns, full-text search (tsvector), and array containment queries. CREATE INDEX ON products USING gin (attributes jsonb_path_ops) enables fast @> containment queries on JSONB documents.
  • BRIN Indexes (Block Range): Extremely compact indexes for naturally ordered data (timestamps, auto-increment IDs). A BRIN index on a 100M-row time-series table uses ~100KB vs ~2GB for an equivalent B-tree — 20,000x smaller with only marginally slower lookups.
  • Partial Indexes: Index only rows matching a condition — CREATE INDEX ON orders (created_at) WHERE status = 'pending'. If only 5% of orders are pending, the partial index is 20x smaller and faster than indexing all orders.
  • Covering Indexes (INCLUDE): Add non-key columns to indexes to enable index-only scans — CREATE INDEX ON orders (customer_id) INCLUDE (total, status). The query reads data directly from the index without touching the heap table, eliminating random I/O for covered queries.

Query Rewriting: Eliminating Common Anti-Patterns

Rewrite queries to align with how the query planner optimises execution:

  • Avoid SELECT *: Fetching all columns prevents covering index usage, increases I/O, and transfers unnecessary data over the network. Specify only needed columns — SELECT id, name, email FROM users WHERE active = true.
  • Replace Correlated Subqueries: Correlated subqueries execute once per outer row — SELECT * FROM orders WHERE total > (SELECT AVG(total) FROM orders WHERE customer_id = orders.customer_id) is O(n²). Rewrite with window functions: SELECT * FROM (SELECT *, AVG(total) OVER (PARTITION BY customer_id) AS avg_total FROM orders) t WHERE total > avg_total.
  • Use EXISTS Instead of IN: For subquery filtering, EXISTS short-circuits on the first match while IN materialises the entire subquery result. WHERE EXISTS (SELECT 1 FROM orders WHERE orders.user_id = users.id) outperforms WHERE id IN (SELECT user_id FROM orders) for large datasets.
  • N+1 Query Problem: ORMs frequently generate N+1 patterns — one query for the parent list, then N queries for related data. Use JOIN or batch WHERE id IN (...) to fetch related data in a single query. Monitor query counts per request to detect N+1 patterns.
  • Pagination with Keyset: OFFSET-based pagination degrades linearly — OFFSET 1000000 scans and discards 1M rows. Use keyset pagination: WHERE id > :last_id ORDER BY id LIMIT 50 — constant performance regardless of page depth.

Join Optimisation and Window Functions

Joins and aggregations are where most query performance is won or lost:

  • Join Order Matters: The planner evaluates join orders, but for queries with 10+ tables, it may not explore optimal orderings (PostgreSQL join_collapse_limit defaults to 8). Hint the planner by restructuring CTEs or subqueries to guide join order for complex queries.
  • Use INNER JOIN Over LEFT JOIN: When the relationship guarantees matching rows, INNER JOIN gives the planner more optimisation freedom — it can reorder joins and apply filters earlier. LEFT JOIN preserves the left table's row order, constraining the planner.
  • Window Functions Over Self-Joins: Replace self-joins for running totals, rankings, and comparisons — ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) eliminates the need to join the table against itself for top-N-per-group queries.
  • Materialised CTEs: PostgreSQL materialises CTEs by default (barrier optimisation). Use WITH cte AS NOT MATERIALIZED (...) to allow the planner to inline the CTE and push down predicates. Materialised CTEs prevent filter pushdown, sometimes causing full scans.
  • Aggregate Pushdown: When aggregating joined tables, push aggregation before the join where possible — SELECT d.name, s.total FROM departments d JOIN (SELECT dept_id, SUM(amount) AS total FROM sales GROUP BY dept_id) s ON d.id = s.dept_id aggregates before the join, reducing the join's row count.

Transform Your Publishing Workflow

Our experts can help you build scalable, API-driven publishing systems tailored to your business.

Book a free consultation

Table Partitioning and Horizontal Sharding

Scale beyond single-table limits with data partitioning strategies:

  • Range Partitioning: Partition by time range — CREATE TABLE orders_2025_q1 PARTITION OF orders FOR VALUES FROM ('2025-01-01') TO ('2025-04-01'). Queries filtering by date automatically scan only relevant partitions (partition pruning), reducing I/O by orders of magnitude for time-series data.
  • Hash Partitioning: Distribute rows across N partitions using a hash function — PARTITION BY HASH (user_id). Provides even data distribution for workloads without natural range boundaries. Ideal for high-cardinality keys (user_id, order_id).
  • List Partitioning: Partition by discrete values — PARTITION BY LIST (country) creates per-country partitions for multi-tenant data isolation and region-specific query performance.
  • Partition Maintenance: Implement automated partition management — create future partitions before they're needed (monthly cron jobs), detach and archive old partitions, and maintain partition-level indexes. Tools like pg_partman (PostgreSQL) automate this lifecycle.
  • Horizontal Sharding: When single-server capacity is exhausted, shard across multiple database servers using application-level routing (shard key → server mapping), Citus for PostgreSQL distributed tables, or Vitess for MySQL sharding. Sharding introduces cross-shard query complexity — design shard keys to minimise cross-shard joins.

Connection Pooling, Caching, and Configuration Tuning

Optimise the infrastructure layer around SQL queries:

  • Connection Pooling: Use PgBouncer (PostgreSQL), ProxySQL (MySQL), or HikariCP (Java applications) — connection creation costs 5-50ms per connection. Pool sizes should match: connections = (core_count * 2) + effective_spindle_count. Over-pooling causes lock contention; under-pooling causes connection wait timeouts.
  • Application-Level Caching: Cache frequently read, rarely changed data in Redis or Memcached — user profiles, configuration, product catalogues. Use cache-aside pattern: check cache → miss → query database → populate cache with TTL. Invalidate on writes to prevent stale data.
  • Materialised Views: Pre-compute expensive aggregations — CREATE MATERIALIZED VIEW monthly_revenue AS SELECT .... Refresh periodically (REFRESH MATERIALIZED VIEW CONCURRENTLY) for dashboards and reports that tolerate minutes-old data. Index materialised views for fast access.
  • Memory Configuration: PostgreSQL: set shared_buffers to 25% of RAM, effective_cache_size to 75% of RAM, and work_mem to 256MB-1GB for analytical queries. MySQL: set innodb_buffer_pool_size to 70-80% of available RAM. Monitor buffer hit ratios to validate settings.
  • WAL and Checkpoint Tuning: For write-heavy workloads, increase checkpoint_completion_target to 0.9, raise wal_buffers to 64MB, and configure max_wal_size to reduce checkpoint frequency — preventing I/O spikes during checkpoint flushes.

Production Monitoring, Maintenance, and MDS Database Services

Sustain query performance with continuous monitoring and proactive maintenance:

  • Slow Query Logging: Enable PostgreSQL log_min_duration_statement = 100ms or MySQL slow_query_log with long_query_time = 0.1 — capture queries exceeding latency thresholds for analysis. Use pgBadger or pt-query-digest to aggregate and rank slow queries by frequency and total time.
  • Statistics Maintenance: Run ANALYZE (PostgreSQL) or ANALYZE TABLE (MySQL) regularly — the query planner depends on accurate table statistics (row counts, value distributions, null ratios) for optimal plan selection. Stale statistics cause the planner to choose wrong join methods and index usage.
  • Index Health: Monitor index bloat with pgstattuple (PostgreSQL) — bloated indexes waste memory and slow scans. REINDEX CONCURRENTLY rebuilds indexes without locking the table. Drop unused indexes identified by pg_stat_user_indexes where idx_scan = 0.
  • Vacuum and Dead Tuples: PostgreSQL's MVCC creates dead tuples on updates/deletes — autovacuum reclaims space but may fall behind on high-write tables. Monitor n_dead_tup in pg_stat_user_tables and tune autovacuum_vacuum_scale_factor for large tables.
  • Query Performance Dashboards: Build Grafana dashboards with pg_stat_statements (PostgreSQL) or performance_schema (MySQL) — track query latency percentiles (p50/p95/p99), throughput, cache hit ratios, lock wait times, and connection pool utilisation. Set alerts for SLA breaches.

MetaDesign Solutions provides end-to-end database performance engineering — from query optimisation and index strategy through partitioning design, caching architecture, and production monitoring setup for enterprise PostgreSQL, MySQL, and SQL Server deployments.

FAQ

Frequently Asked Questions

Common questions about this topic, answered by our engineering team.

Start with EXPLAIN ANALYZE to identify the bottleneck — sequential scans on large tables usually indicate missing indexes. Create targeted B-tree indexes on columns used in WHERE, JOIN, and ORDER BY clauses. Use covering indexes (INCLUDE) to enable index-only scans. For JSONB or full-text search, use GIN indexes. For time-series data, BRIN indexes provide 20,000x smaller indexes than B-trees.

Enable slow query logging (PostgreSQL log_min_duration_statement or MySQL slow_query_log), aggregate with pgBadger or pt-query-digest, then run EXPLAIN (ANALYZE, BUFFERS) on top offenders. Look for sequential scans on large tables, disk-spilling sorts (external merge), low buffer hit ratios, and estimation errors where planned vs actual row counts diverge significantly.

Denormalize when read performance is critical and write frequency is low — reporting databases, analytics dashboards, and read-heavy APIs. Use materialised views for pre-computed aggregations rather than structural denormalization. For OLTP systems, maintain normalised schemas and add caching (Redis) or read replicas instead of denormalising.

Partitioning enables partition pruning — queries with WHERE clauses on the partition key scan only relevant partitions instead of the entire table. A 1-billion-row time-series table partitioned monthly means date-range queries scan only 1-2 partitions (~83M rows) instead of the full billion. Partitions also enable parallel index maintenance and efficient data archival.

Use PgBouncer (PostgreSQL) or ProxySQL (MySQL) with pool sizes following the formula: connections = (core_count × 2) + effective_spindle_count. For a 4-core server, start with 10 connections. Over-pooling causes lock contention; under-pooling causes timeouts. Monitor connection wait times and adjust based on actual workload patterns.

Discussion

Join the Conversation

Ready when you are

Let's build something great together.

A 30-minute call with a principal engineer. We'll listen, sketch, and tell you whether we're the right partner — even if the answer is no.

Talk to a strategist
Need help with your project? Let's talk.
Book a call