CodeWithSQL.com
Advanced Query Optimization

Master advanced SQL optimization with execution plan analysis, effective index usage, table partitioning strategies, and JOIN operation optimization.

4 Topics Chapter 8 of 12 Advanced

1 Execution Plans in Detail

Building on execution plan basics, let's analyze complex plans with multiple tables, subqueries, and derived tables.

MySQL
    -- EXPLAIN ANALYZE gives actual execution times (MySQL 8.0.18+)
    EXPLAIN ANALYZE
    SELECT c.name, SUM(o.amount) AS total
    FROM customers c
    JOIN orders o ON c.id = o.customer_id
    WHERE o.order_date >= '2025-01-01'
    GROUP BY c.name
    HAVING SUM(o.amount) > 1000;

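EXPLAIN ANALYZE actually runs the query and reports measured timings and row counts next to the optimizer's estimates. Because the query really executes, it takes at least as long as the query itself. Use it to compare before/after when testing optimizations.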
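
One way to run such a before/after comparison is sketched below. The index name idx_orders_date_customer and its column order are assumptions made for illustration; whether this index actually helps depends on the data.

```sql
-- 1. Baseline: note the actual time and rows reported for each plan step.
EXPLAIN ANALYZE
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.name
HAVING SUM(o.amount) > 1000;

-- 2. Candidate change (hypothetical index): lets MySQL read the date filter
--    and the join column from a single index on orders.
CREATE INDEX idx_orders_date_customer ON orders (order_date, customer_id);

-- 3. Re-run the identical statement and compare it with the baseline.
EXPLAIN ANALYZE
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.name
HAVING SUM(o.amount) > 1000;

-- 4. If the plan did not improve, drop the index so it does not slow down writes.
DROP INDEX idx_orders_date_customer ON orders;
```

Run each measurement a few times: the first execution often includes the cost of loading pages from disk, which can make a later run look faster than it really is.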

Reading Multi-Table Plans

In the traditional (tabular) EXPLAIN output for a multi-table query, each row represents one table; EXPLAIN ANALYZE prints the plan as a tree instead. The optimizer decides which table to scan first (the "driving table") based on estimated cost. Key things to check:

  • Which table is scanned first? Ideally the smallest filtered result set.
  • Are all JOINs using indexes? Check the key column for each row.
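  • Are there Using temporary or Using filesort in Extra? These mean MySQL builds an internal temporary table or performs an extra sort pass instead of reading rows in index order. Both can spill to disk on large result sets.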
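
The checks above map onto columns of the tabular EXPLAIN output. A minimal sketch, reusing the query from the start of this section:

```sql
-- Plain EXPLAIN (without ANALYZE) does not execute the query.
-- It prints one row per table, listed in the order MySQL plans to read them.
EXPLAIN
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.name
HAVING SUM(o.amount) > 1000;

-- What to read in the result:
--   first row : the driving table
--   type      : ALL means a full table scan
--   key       : index chosen for that table (NULL means no index is used)
--   rows      : estimated number of rows examined for that table
--   Extra     : look for "Using temporary" and "Using filesort"

-- The same plan with cost details, if the table view is not enough:
EXPLAIN FORMAT=JSON
SELECT c.name, SUM(o.amount) AS total
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.name
HAVING SUM(o.amount) > 1000;
```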

Key Takeaways

  • EXPLAIN ANALYZE shows actual execution times, not just estimates
  • In multi-table plans, check every row for index usage
  • The driving table should be the one with the smallest filtered result set

2 Using Indexes Effectively

Beyond creating indexes, knowing which queries benefit from which index structure is critical.

Covering Indexes

A covering index contains all the columns the query needs, so the database never reads the actual table row:

SQL
    -- This query needs: customer_id (WHERE), order_date (WHERE), amount (SELECT)
    SELECT amount
    FROM orders
    WHERE customer_id = 42 AND order_date >= '2025-01-01';

    -- Covering index: includes all 3 columns
    CREATE INDEX idx_covering ON orders (customer_id, order_date, amount);

EXPLAIN will show Using index in the Extra column when a covering index is used. The query is then answered from the index alone, which is typically the cheapest access path because no table rows are read.
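
To confirm the effect, the same query can be checked with and without full coverage. This is a sketch that assumes idx_covering from above exists; the status column is hypothetical and stands in for any column that is not part of the index.

```sql
-- Covered: every referenced column is in idx_covering,
-- so Extra should include "Using index".
EXPLAIN
SELECT amount
FROM orders
WHERE customer_id = 42 AND order_date >= '2025-01-01';

-- Not covered: status (hypothetical column) is not in the index, so MySQL must
-- read the table rows and "Using index" should no longer appear.
-- Note that "Using index condition" is a different optimization and does not
-- mean the index is covering.
EXPLAIN
SELECT amount, status
FROM orders
WHERE customer_id = 42 AND order_date >= '2025-01-01';
```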

Index Selectivity

An index on a column with many unique values (high selectivity) is far more useful than one on a column with few distinct values:

Column | Distinct Values | Selectivity | Index Worth It?
email | 1,000,000 (unique) | 100% | Excellent
city | 500 | Medium | Sometimes
gender | 2-3 | Very low | Rarely helpful
is_active | 2 (TRUE/FALSE) | Very low | Only in composite
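
Selectivity can be measured rather than guessed. The sketch below assumes the columns from the table live in a customers table; that table name is an assumption for illustration.

```sql
-- Selectivity = distinct values / total rows (1.0 means every value is unique).
SELECT
    COUNT(DISTINCT email)     / COUNT(*) AS email_selectivity,
    COUNT(DISTINCT city)      / COUNT(*) AS city_selectivity,
    COUNT(DISTINCT is_active) / COUNT(*) AS is_active_selectivity
FROM customers;

-- For indexes that already exist, the Cardinality column shows MySQL's own
-- estimate of the number of distinct values in each index.
SHOW INDEX FROM customers;
```

The first statement scans the whole table, so on very large tables it is better run against a replica or a sample.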

Key Takeaways

  • Covering indexes include all queried columns, so the query never has to read the table rows
  • High-selectivity columns make better index candidates than low-selectivity ones
  • Composite index column order should match your most common query patterns

3 Partitioning & Performance

Partitioning splits a large table into smaller physical pieces while appearing as one logical table. Queries that filter on the partition key only scan relevant partitions.

MySQL
    -- Partition orders by year
    CREATE TABLE orders (
        id INT,
        customer_id INT,
        amount DECIMAL(10,2),
        order_date DATE
    )
    PARTITION BY RANGE (YEAR(order_date)) (
        PARTITION p2023 VALUES LESS THAN (2024),
        PARTITION p2024 VALUES LESS THAN (2025),
        PARTITION p2025 VALUES LESS THAN (2026),
        PARTITION pmax VALUES LESS THAN MAXVALUE
    );

    -- This query skips p2023 and p2024. It reads p2025 and pmax,
    -- because the open-ended range also covers later years:
    SELECT * FROM orders WHERE order_date >= '2025-01-01';
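
Pruning is easy to verify, because EXPLAIN lists the partitions a query will read in its partitions column. A short sketch against the orders table defined above:

```sql
-- Open-ended range: expect pmax to be listed along with p2025,
-- since pmax holds every later year.
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2025-01-01';

-- Bounding the range on both sides should keep pmax out of the scan.
EXPLAIN
SELECT * FROM orders
WHERE order_date >= '2025-01-01' AND order_date < '2026-01-01';

-- Filtering on a column other than the partition key gives the optimizer
-- nothing to prune with, so every partition is listed.
EXPLAIN
SELECT * FROM orders
WHERE customer_id = 42;

-- Row counts per partition (these are estimates for InnoDB tables).
SELECT PARTITION_NAME, TABLE_ROWS
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'orders';
```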

Partition Types

Type | How It Splits | Best For
RANGE | By value ranges (dates, IDs) | Time-series data (logs, orders)
LIST | By specific values | Region-based (country, status)
HASH | By hash of a column | Even distribution across partitions
KEY | By MySQL's internal hash | Similar to HASH, but MySQL supplies the hashing function and the key columns need not be integers

Partitioning isn't always the answer. It helps when: (1) tables hold 50M+ rows (a rough rule of thumb), (2) queries always filter on the partition key, (3) you need to quickly drop old data (ALTER TABLE orders DROP PARTITION p2020). It doesn't help if your queries span all partitions.
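
For the data-retention case, routine maintenance on the year-partitioned orders table defined earlier might look like the sketch below. The partition names follow that definition; p2026 is a new partition introduced here for illustration.

```sql
-- Remove a whole year of data in one metadata-level operation
-- instead of a large DELETE.
ALTER TABLE orders DROP PARTITION p2023;

-- Make room for the next year by splitting the catch-all partition.
-- Rows already stored in pmax are redistributed, so this is cheapest
-- while pmax is still empty.
ALTER TABLE orders REORGANIZE PARTITION pmax INTO (
    PARTITION p2026 VALUES LESS THAN (2027),
    PARTITION pmax  VALUES LESS THAN MAXVALUE
);
```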

Key Takeaways

  • Partitioning splits large tables into smaller physical segments
  • Queries filtering on the partition key scan only relevant partitions (pruning)
  • RANGE partitioning by date is the most common pattern
  • Only useful for very large tables where queries target specific ranges

4 Optimizing JOIN Operations

JOINs are a common source of slow queries. Here's how to make them fast.

JOIN Optimization Rules

Rule | Why | How
Index both sides | The DB needs fast lookup on the joined column | Index the ON columns: orders.customer_id and customers.id
Filter early | Fewer rows to join = faster | Put WHERE conditions that reduce rows before the JOIN
Use INNER JOIN over LEFT | The optimizer can freely reorder INNER JOINs; it has far less freedom with LEFT JOINs | Only use LEFT JOIN when you need unmatched rows
Avoid JOINing on expressions | Functions prevent index use | ON a.id = b.id, not ON CAST(a.id AS CHAR) = b.id
Limit result set first | Don't JOIN 10M rows if you only need 10 | Use CTEs to filter first, then JOIN

SQL
    -- Potentially slow: JOINs first, filters last
    SELECT c.name, o.amount
    FROM customers c
    JOIN orders o ON c.id = o.customer_id
    WHERE o.order_date >= '2025-01-01'
      AND c.country = 'UAE';

    -- Pre-filtered: each CTE reduces its table, then the small results are joined
    WITH uae_customers AS (
        SELECT id, name FROM customers WHERE country = 'UAE'
    ),
    recent_orders AS (
        SELECT customer_id, amount FROM orders WHERE order_date >= '2025-01-01'
    )
    SELECT c.name, o.amount
    FROM uae_customers c
    JOIN recent_orders o ON c.id = o.customer_id;

Note: Modern query optimizers often rewrite your query internally. MySQL 8.0, for example, can merge a CTE into the outer query, in which case both versions above run with the same plan. Proper indexes and explicit filtering still help the optimizer, especially on complex queries, but confirm any rewrite with EXPLAIN ANALYZE instead of assuming it is faster.
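
A sketch of how the first two rules and the comparison could be applied to the example above. Both index names are hypothetical, and customers.id is assumed to be the primary key (and therefore already indexed).

```sql
-- Hypothetical supporting indexes: one for the country filter,
-- one for the join column plus the date filter on orders.
CREATE INDEX idx_customers_country ON customers (country);
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date);

-- Plan for the plain JOIN version.
EXPLAIN
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON c.id = o.customer_id
WHERE o.order_date >= '2025-01-01'
  AND c.country = 'UAE';

-- Plan for the CTE version. If the two plans match, the optimizer has merged
-- the CTEs and the rewrite changes readability only, not performance.
EXPLAIN
WITH uae_customers AS (
    SELECT id, name FROM customers WHERE country = 'UAE'
),
recent_orders AS (
    SELECT customer_id, amount FROM orders WHERE order_date >= '2025-01-01'
)
SELECT c.name, o.amount
FROM uae_customers c
JOIN recent_orders o ON c.id = o.customer_id;
```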

Key Takeaways

  • Index both sides of every JOIN condition
  • Filter rows before joining to reduce the working set
  • Prefer INNER JOIN over LEFT JOIN when you don't need unmatched rows
  • Pre-filter large tables before joining them (for example with CTEs), and confirm the gain with EXPLAIN

What's next? Advanced Stored Procedures — error handling and performance considerations for database logic.