CodeWithSQL.com
HomeSQL Intermediate CourseAdvanced SQL Queries

Advanced SQL Queries

Master advanced SQL queries including nested and correlated subqueries, set operators (UNION, INTERSECT, EXCEPT), advanced JOINs and Common Table Expressions.

4 Topics Chapter 5 of 6 Intermediate

1 Subqueries (Nested & Correlated)

Top

A subquery is a query inside another query. It lets you use the result of one query as input to another — like nesting functions in a spreadsheet formula.

Nested Subquery (Runs Once)

The inner query runs first, returns a value, then the outer query uses it:

SQL-- Find products priced above the average SELECT name, price FROM products WHERE price > ( SELECT AVG(price) FROM products );
Result (avg price = 43.59)
name           | price
---------------|------
SQL Textbook   | 45.00
Mechanical KB  | 89.99
Webcam HD      | 59.99

Subquery in WHERE with IN

SQL-- Customers who have placed at least one order SELECT name, city FROM customers WHERE id IN ( SELECT DISTINCT customer_id FROM orders );

Correlated Subquery (Runs Per Row)

A correlated subquery references the outer query's row, so it runs once for each row in the outer query. It's slower but more powerful:

SQL-- For each customer, find their most expensive order SELECT c.name, o.product, o.amount FROM orders o JOIN customers c ON c.id = o.customer_id WHERE o.amount = ( SELECT MAX(o2.amount) FROM orders o2 WHERE o2.customer_id = o.customer_id -- references outer query! );
Result
name          | product       | amount
--------------|---------------|------
Sara Ahmed    | USB-C Hub     | 39.99
James Wilson  | SQL Textbook  | 45.00
Priya Sharma  | Desk Lamp     | 34.99

EXISTS: Check If Rows Exist

SQL-- Customers who have NOT placed any orders SELECT name FROM customers c WHERE NOT EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.id );

EXISTS vs IN: EXISTS stops as soon as it finds one matching row (efficient for large tables). IN builds the full list first. For large subqueries, EXISTS is usually faster. For small lists, IN is fine.

Key Takeaways

  • Nested subqueries run once and return a single value or list
  • Correlated subqueries run per row — they reference the outer query
  • IN (subquery) checks membership; EXISTS checks existence (and is faster for large data)
  • Subqueries can appear in WHERE, SELECT, FROM, and HAVING clauses

2 UNION, INTERSECT & EXCEPT

Top

Set operators combine the results of two or more SELECT queries into one result set. The queries must have the same number of columns with compatible data types.

UNION: Combine and Deduplicate

SQL-- All people from both customers and suppliers tables SELECT name, city FROM customers UNION SELECT contact_name, city FROM suppliers;

UNION removes duplicates. Use UNION ALL to keep them (faster, no dedup overhead):

SQL-- Keep duplicates (faster) SELECT name, city FROM customers UNION ALL SELECT contact_name, city FROM suppliers;

INTERSECT: Only Rows in Both

SQL-- Cities that have BOTH customers and suppliers SELECT city FROM customers INTERSECT SELECT city FROM suppliers;

EXCEPT: Rows in First But Not Second

SQL-- Cities that have customers but NO suppliers SELECT city FROM customers EXCEPT SELECT city FROM suppliers;
OperatorReturnsDuplicatesMySQL Support
UNIONAll rows from both, deduplicatedRemovedYes
UNION ALLAll rows from both, with duplicatesKeptYes
INTERSECTOnly rows present in bothRemovedMySQL 8.0.31+
EXCEPTRows in first but not secondRemovedMySQL 8.0.31+

Rule: Both queries must have the same number of columns with compatible types. Column names come from the first query. ORDER BY goes at the very end, after the last query.

Key Takeaways

  • UNION combines results and removes duplicates; UNION ALL keeps them
  • INTERSECT returns only rows present in both result sets
  • EXCEPT returns rows in the first set but not the second
  • All set operators require matching column counts and compatible types

3 Advanced JOINs & Performance

Top

Building on the JOIN fundamentals from the Beginner course, let's explore advanced patterns and their performance implications.

Multi-Table JOINs

Real queries often join 3, 4, or more tables:

SQL-- Order details with customer name, product info, and shipping SELECT c.name AS customer, p.name AS product, p.category, o.quantity, o.quantity * p.price AS total, s.carrier, s.tracking_no FROM orders o JOIN customers c ON o.customer_id = c.id JOIN products p ON o.product_id = p.id LEFT JOIN shipping s ON o.id = s.order_id ORDER BY total DESC;

JOIN with Aggregation

SQL-- Total spending per customer per category SELECT c.name, p.category, SUM(o.quantity * p.price) AS category_total FROM orders o JOIN customers c ON o.customer_id = c.id JOIN products p ON o.product_id = p.id GROUP BY c.name, p.category ORDER BY c.name, category_total DESC;

Non-Equi JOINs

Not all JOINs use =. You can join on ranges, inequalities, or expressions:

SQL-- Find which discount tier each order falls into SELECT o.id, o.amount, d.tier, d.discount_pct FROM orders o JOIN discount_tiers d ON o.amount BETWEEN d.min_amount AND d.max_amount;

JOIN Performance Tips

TipWhy It Matters
Always JOIN on indexed columnsWithout indexes, the DB scans every row — O(n×m) vs O(n×log m)
Filter with WHERE before JOINingReduces the number of rows the JOIN has to process
Avoid SELECT * in JOINsFetching all columns from 4 tables wastes memory and bandwidth
Use INNER JOIN unless you need NULLsLEFT JOIN is slower — don't use it if you don't need unmatched rows
Check the execution planEXPLAIN SELECT ... shows you how the DB processes your query

Quick test: Run EXPLAIN before your JOIN query to see the execution plan. If you see "Full Table Scan" on a large table, add an index on the JOIN column. This alone can speed up queries 100x.

Key Takeaways

  • Multi-table JOINs chain naturally: FROM a JOIN b ON ... JOIN c ON ...
  • Combine JOINs with GROUP BY for cross-table analytics
  • Non-equi JOINs use BETWEEN, <, > instead of =
  • Always index JOIN columns and use EXPLAIN to check performance

4 Common Table Expressions (CTEs)

Top

A CTE (Common Table Expression) is a temporary, named result set that exists for the duration of a single query. It's defined with the WITH keyword and makes complex queries much more readable than deeply nested subqueries.

Basic CTE Syntax

SQLWITH high_spenders AS ( SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id HAVING SUM(amount) > 100 ) SELECT c.name, c.city, hs.total FROM customers c JOIN high_spenders hs ON c.id = hs.customer_id ORDER BY hs.total DESC;
Result
name          | city   | total
--------------|--------|------
Sara Ahmed    | Dubai  | 132.48
James Wilson  | London | 134.99

Compare this to the subquery version — the CTE is much more readable:

Subquery version (harder to read)SELECT c.name, c.city, hs.total FROM customers c JOIN ( SELECT customer_id, SUM(amount) AS total FROM orders GROUP BY customer_id HAVING SUM(amount) > 100 ) hs ON c.id = hs.customer_id;

Multiple CTEs

Chain multiple CTEs with commas — each can reference the ones before it:

SQLWITH customer_totals AS ( SELECT customer_id, SUM(amount) AS total, COUNT(*) AS order_count FROM orders GROUP BY customer_id ), customer_tiers AS ( SELECT *, CASE WHEN total >= 200 THEN 'Gold' WHEN total >= 100 THEN 'Silver' ELSE 'Bronze' END AS tier FROM customer_totals ) SELECT c.name, ct.total, ct.order_count, ct.tier FROM customer_tiers ct JOIN customers c ON c.id = ct.customer_id ORDER BY ct.total DESC;

Recursive CTEs

Recursive CTEs call themselves — perfect for hierarchies like employee/manager trees or category trees:

SQL-- Build an org chart from the employees table WITH RECURSIVE org_chart AS ( -- Base case: top-level (no manager) SELECT id, name, manager_id, 0 AS level FROM employees WHERE manager_id IS NULL UNION ALL -- Recursive case: employees who report to someone in org_chart SELECT e.id, e.name, e.manager_id, oc.level + 1 FROM employees e JOIN org_chart oc ON e.manager_id = oc.id ) SELECT CONCAT(REPEAT(' ', level), name) AS hierarchy FROM org_chart ORDER BY level, name;
Result
hierarchy
------------------
Alice CEO
  Bob VP
    Carol Manager
      Dave Developer
      Eve Designer

CTE vs Subquery vs Temp Table: CTEs exist only for one query (no cleanup needed). Subqueries do the same but are harder to read. Temp tables persist across multiple queries in a session. Use CTEs for readability; temp tables for reuse across multiple queries.

Key Takeaways

  • WITH name AS (SELECT ...) defines a CTE — a named temporary result set
  • CTEs make complex queries far more readable than nested subqueries
  • Chain multiple CTEs with commas; each can reference previous ones
  • Recursive CTEs (WITH RECURSIVE) handle hierarchies and tree-structured data
  • CTEs exist only for one query — no temp table cleanup needed

What's next? You're now writing sophisticated queries. The final chapter of the Intermediate course covers SQL Indexes — the key to making all these queries run fast on large datasets.