Denormalization: When, Why, and How
Normalization eliminates redundancy. Denormalization intentionally adds it back — storing duplicate or pre-calculated data to speed up read queries at the cost of more complex writes.
Why Denormalize?
A fully normalized database might require a 5-table JOIN just to display a product page. On a high-traffic site, that JOIN runs millions of times per day. Denormalization pre-computes the result so the query becomes a simple, fast lookup.
| Scenario | Normalized | Denormalized |
|---|---|---|
| Show order with customer name | JOIN orders + customers every time | Store customer_name in orders table |
| Display product rating | AVG(rating) FROM reviews per request | Store avg_rating column on products |
| Count orders per customer | COUNT(*) GROUP BY every time | Store order_count on customers, update via trigger |
Practical Example

```sql
-- Before: the dashboard aggregate runs a multi-table JOIN on every request
SELECT c.name, COUNT(o.id) AS total_orders, SUM(o.amount) AS total_spent
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;

-- Add summary columns + keep them updated with a trigger
ALTER TABLE customers
  ADD COLUMN total_orders INT DEFAULT 0,
  ADD COLUMN total_spent DECIMAL(12,2) DEFAULT 0;

DELIMITER //
CREATE TRIGGER trg_update_stats AFTER INSERT ON orders
FOR EACH ROW
BEGIN
  UPDATE customers
  SET total_orders = total_orders + 1,
      total_spent  = total_spent + NEW.amount
  WHERE id = NEW.customer_id;
END //
DELIMITER ;
```
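The insert trigger alone is not enough if orders can ever be deleted: the counters would drift upward. A sketch of a companion delete trigger, assuming the same schema (the trigger name trg_update_stats_del is illustrative):

```sql
DELIMITER //
CREATE TRIGGER trg_update_stats_del AFTER DELETE ON orders
FOR EACH ROW
BEGIN
  -- OLD holds the deleted row; subtract its contribution
  UPDATE customers
  SET total_orders = total_orders - 1,
      total_spent  = total_spent - OLD.amount
  WHERE id = OLD.customer_id;
END //
DELIMITER ;
```

An UPDATE trigger would be needed as well if order amounts can change after insert.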
```sql
-- Dashboard query is now instant:
SELECT name, total_orders, total_spent FROM customers;
```

When to Denormalize vs Stay Normalized
| Denormalize When | Stay Normalized When |
|---|---|
| Read-heavy workloads (dashboards, search) | Write-heavy workloads (frequent inserts) |
| Complex JOINs cause performance issues | Data consistency is critical (finance) |
| The same aggregate is recalculated repeatedly | Data changes frequently and sync is complex |
Common Patterns

- Pre-calculated columns — total_orders, avg_rating maintained by triggers
- Materialized views (PostgreSQL) — a view that stores its results, refreshed on a schedule
- Summary tables — daily/monthly aggregates rebuilt by a scheduled procedure
- Embedding data — storing customer_name in orders to avoid JOINs
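The materialized-view pattern above can be sketched in PostgreSQL for the product-rating scenario (the view and column names are illustrative):

```sql
-- PostgreSQL: store the aggregate once instead of recomputing per request
CREATE MATERIALIZED VIEW product_ratings AS
SELECT product_id,
       AVG(rating) AS avg_rating,
       COUNT(*)    AS review_count
FROM reviews
GROUP BY product_id;

-- Refresh on a schedule (e.g. via cron or pg_cron).
-- CONCURRENTLY lets readers keep querying during the refresh,
-- but requires a unique index on the view.
CREATE UNIQUE INDEX ON product_ratings (product_id);
REFRESH MATERIALIZED VIEW CONCURRENTLY product_ratings;
```

The trade-off is staleness: the view reflects the data as of the last refresh, not the live tables.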
Golden rule: Normalize first, denormalize second. Start with a properly normalized schema. Only denormalize when you have measured performance problems. Premature denormalization creates data inconsistency bugs that are very hard to track down.
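Because those inconsistency bugs are hard to spot by eye, it helps to check for drift explicitly. One approach, assuming the summary columns from the example above, is a periodic reconciliation query:

```sql
-- Find customers whose stored counter disagrees with the source of truth
SELECT c.id, c.total_orders, COUNT(o.id) AS actual_orders
FROM customers c
LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.total_orders
HAVING c.total_orders <> COUNT(o.id);
```

An empty result means the denormalized counts are in sync; any rows returned point at customers whose counters need repair.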
Key Takeaways
- Denormalization trades write complexity for read speed
- Common patterns: pre-calculated columns, summary tables, embedded data
- Use triggers or scheduled jobs to keep denormalized data in sync
- Normalize first, then denormalize only what's proven slow