Schema and sample data
| order_id | customer_id | order_date |
|---|---|---|
| 1 | 101 | 2024-01-05 |
| 2 | 102 | 2024-01-10 |
| 3 | 101 | 2024-01-20 |
| 4 | 103 | 2024-02-15 |
| 5 | 101 | 2024-02-10 |
| 6 | 102 | 2024-03-01 |
Show setup SQL (copy to run)
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE NOT NULL
);
INSERT INTO Orders (order_id, customer_id, order_date) VALUES
(1, 101, '2024-01-05'),
(2, 102, '2024-01-10'),
(3, 101, '2024-01-20'),
(4, 103, '2024-02-15'),
(5, 101, '2024-02-10'),
(6, 102, '2024-03-01');
The Orders table records every order, with more than one order per customer. For each order, return the customer_id, the order_date, the date of that customer's previous order as prev_order_date, and the number of days between the two orders as days_since_prev.
Each customer's orders should be compared in date order, and the comparison must never cross into another customer. A customer's very first order has no earlier order, so both prev_order_date and days_since_prev must be NULL for that row. Sort the result by customer_id, then by order_date.
Expected result
| customer_id | order_date | prev_order_date | days_since_prev |
|---|---|---|---|
| 101 | 2024-01-05 | NULL | NULL |
| 101 | 2024-01-20 | 2024-01-05 | 15 |
| 101 | 2024-02-10 | 2024-01-20 | 21 |
| 102 | 2024-01-10 | NULL | NULL |
| 102 | 2024-03-01 | 2024-01-10 | 51 |
| 103 | 2024-02-15 | NULL | NULL |
Show hint
You need to look back one row inside each customer's history. LAG(order_date) with a window of PARTITION BY customer_id ORDER BY order_date pulls the previous order date without a self join. Feed that lagged value and the current order_date into DATEDIFF(day, ...) to get the gap. The first order in each partition returns NULL from LAG, which makes the gap NULL automatically.
Try to write the query yourself before you open this.
Show solution and explanation
SELECT
customer_id,
order_date,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date,
DATEDIFF(day,
LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date),
order_date) AS days_since_prev
FROM Orders
ORDER BY customer_id, order_date;
LAG returns a value from an earlier row in the same window. The window PARTITION BY customer_id ORDER BY order_date restarts the numbering for every customer and walks their orders from oldest to newest, so LAG(order_date) is the date of the immediately preceding order for that same customer. Because each partition is independent, the previous value never leaks in from a different customer.
For the earliest order in each partition there is no preceding row, so LAG returns NULL. That is exactly what we want for a customer's first order. Feeding that NULL into DATEDIFF as the start date makes days_since_prev come out NULL too, so the first-order rows fall out correctly without a special CASE. For customer 101 the gaps are 15 days (Jan 05 to Jan 20) and 21 days (Jan 20 to Feb 10); for customer 102 the single gap is 51 days (Jan 10 to Mar 01); customer 103 has only one order, so it is all NULL.
A common mistake is to write a self join such as DATEDIFF(day, prev.order_date, cur.order_date) matched on prev.order_date < cur.order_date. That pairs each order with every earlier order, not just the immediately previous one, so you then have to filter down to the max earlier date, which is slower and easy to get wrong. LAG expresses "the row just before this one" directly. Also note the arguments order: DATEDIFF(day, start, end) counts start to end, so passing the previous date first and the current date second yields a positive number of days.