CodeWithSQL.com

Advanced Normalization & Denormalization

Learn when and how to denormalize your database. Understand the trade-offs between normalization and denormalization for performance optimization.

Chapter 4 of 12 · Advanced · 1 Topic

1 Denormalization: When, Why, and How

Normalization eliminates redundancy. Denormalization intentionally adds it back — storing duplicate or pre-calculated data to speed up read queries at the cost of more complex writes.

Why Denormalize?

A fully normalized database might require a 5-table JOIN just to display a product page. On a high-traffic site, that JOIN runs millions of times per day. Denormalization pre-computes the result so the query becomes a simple, fast lookup.

Scenario | Normalized | Denormalized
Show order with customer name | JOIN orders + customers every time | Store customer_name in orders table
Display product rating | AVG(rating) FROM reviews per request | Store avg_rating column on products
Count orders per customer | COUNT(*) GROUP BY every time | Store order_count on customers, update via trigger
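The avg_rating row above can be maintained the same way as the counts. A minimal MySQL-style sketch, assuming products and reviews tables; the review_count column (needed to update the average incrementally) is an assumption, not part of the original schema:

```sql
-- Assumed columns: products.avg_rating plus a helper review_count
ALTER TABLE products
  ADD COLUMN review_count INT DEFAULT 0,
  ADD COLUMN avg_rating DECIMAL(3,2) DEFAULT 0;

DELIMITER //
CREATE TRIGGER trg_update_rating
AFTER INSERT ON reviews
FOR EACH ROW
BEGIN
  -- Incrementally fold the new rating into the stored average
  UPDATE products
  SET avg_rating   = (avg_rating * review_count + NEW.rating) / (review_count + 1),
      review_count = review_count + 1
  WHERE id = NEW.product_id;
END //
DELIMITER ;
```

Displaying the rating is then a single-row lookup (SELECT avg_rating FROM products WHERE id = ?) instead of an aggregate over the reviews table.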

Practical Example

Normalized (slow on large data):

    SELECT c.name,
           COUNT(o.id)  AS total_orders,
           SUM(o.amount) AS total_spent
    FROM customers c
    LEFT JOIN orders o ON c.id = o.customer_id
    GROUP BY c.id, c.name;

Denormalized (fast):

    -- Add summary columns + keep them updated with a trigger
    ALTER TABLE customers
      ADD COLUMN total_orders INT DEFAULT 0,
      ADD COLUMN total_spent DECIMAL(12,2) DEFAULT 0;

    DELIMITER //
    CREATE TRIGGER trg_update_stats
    AFTER INSERT ON orders
    FOR EACH ROW
    BEGIN
      UPDATE customers
      SET total_orders = total_orders + 1,
          total_spent  = total_spent + NEW.amount
      WHERE id = NEW.customer_id;
    END //
    DELIMITER ;

    -- Dashboard query is now instant:
    SELECT name, total_orders, total_spent FROM customers;
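The insert trigger alone leaves the totals stale when an order is deleted or its amount changes. A hedged sketch of the companion AFTER DELETE trigger, using the same assumed schema:

```sql
DELIMITER //
CREATE TRIGGER trg_update_stats_del
AFTER DELETE ON orders
FOR EACH ROW
BEGIN
  -- Reverse the deleted order's contribution to the summary columns
  UPDATE customers
  SET total_orders = total_orders - 1,
      total_spent  = total_spent - OLD.amount
  WHERE id = OLD.customer_id;
END //
DELIMITER ;
```

An AFTER UPDATE trigger would follow the same shape, adjusting total_spent by NEW.amount - OLD.amount. This is the write-complexity cost the trade-off table below describes: every path that modifies orders must keep the summary in sync.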

When to Denormalize vs Stay Normalized

Denormalize When | Stay Normalized When
Read-heavy workloads (dashboards, search) | Write-heavy workloads (frequent inserts)
Complex JOINs cause performance issues | Data consistency is critical (finance)
The same aggregate is recalculated repeatedly | Data changes frequently and sync is complex

Common Patterns

  • Pre-calculated columns — total_orders, avg_rating maintained by triggers
  • Materialized views (PostgreSQL) — a view that stores its results; refreshed on demand with REFRESH MATERIALIZED VIEW, typically from a scheduled job
  • Summary tables — daily/monthly aggregates rebuilt by a scheduled procedure
  • Embedding data — storing customer_name in orders to avoid JOINs
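The materialized-view pattern makes the dashboard query from earlier a stored result. An illustrative PostgreSQL sketch — the view name and schema are assumptions:

```sql
-- The expensive JOIN + GROUP BY runs once; its result is stored on disk
CREATE MATERIALIZED VIEW customer_stats AS
SELECT c.id, c.name,
       COUNT(o.id)                AS total_orders,
       COALESCE(SUM(o.amount), 0) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

-- Reads are now simple lookups against the stored result:
SELECT name, total_orders, total_spent FROM customer_stats WHERE id = 42;

-- Re-run the underlying query on a schedule (e.g. via cron or pg_cron):
REFRESH MATERIALIZED VIEW customer_stats;
```

Compared with triggers, this trades freshness for simplicity: the stored data is only as current as the last REFRESH, but no write path has to maintain it row by row.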

Golden rule: Normalize first, denormalize second. Start with a properly normalized schema. Only denormalize when you have measured performance problems. Premature denormalization creates data inconsistency bugs that are very hard to track down.

Key Takeaways

  • Denormalization trades write complexity for read speed
  • Common patterns: pre-calculated columns, summary tables, embedded data
  • Use triggers or scheduled jobs to keep denormalized data in sync
  • Normalize first, then denormalize only what's proven slow

What's next? Transactions, Locks & Deadlocks — understand lock types and prevent deadlocks.