SQL Practice Problem

Days between each customer order

Hard Window Functions

Schema and sample data

Orders
order_idcustomer_idorder_date
11012024-01-05
21022024-01-10
31012024-01-20
41032024-02-15
51012024-02-10
61022024-03-01
Show setup SQL (copy to run)
CREATE TABLE Orders (
    order_id    INT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date  DATE NOT NULL
);

INSERT INTO Orders (order_id, customer_id, order_date) VALUES
(1, 101, '2024-01-05'),
(2, 102, '2024-01-10'),
(3, 101, '2024-01-20'),
(4, 103, '2024-02-15'),
(5, 101, '2024-02-10'),
(6, 102, '2024-03-01');
Your task

The Orders table records every order, with more than one order per customer. For each order, return the customer_id, the order_date, the date of that customer's previous order as prev_order_date, and the number of days between the two orders as days_since_prev.

Each customer's orders should be compared in date order, and the comparison must never cross into another customer. A customer's very first order has no earlier order, so both prev_order_date and days_since_prev must be NULL for that row. Sort the result by customer_id, then by order_date.

Expected result

Your query should return
customer_idorder_dateprev_order_datedays_since_prev
1012024-01-05NULLNULL
1012024-01-202024-01-0515
1012024-02-102024-01-2021
1022024-01-10NULLNULL
1022024-03-012024-01-1051
1032024-02-15NULLNULL
Show hint

You need to look back one row inside each customer's history. LAG(order_date) with a window of PARTITION BY customer_id ORDER BY order_date pulls the previous order date without a self join. Feed that lagged value and the current order_date into DATEDIFF(day, ...) to get the gap. The first order in each partition returns NULL from LAG, which makes the gap NULL automatically.

Try to write the query yourself before you open this.

Show solution and explanation
SELECT
    customer_id,
    order_date,
    LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date) AS prev_order_date,
    DATEDIFF(day,
             LAG(order_date) OVER (PARTITION BY customer_id ORDER BY order_date),
             order_date) AS days_since_prev
FROM Orders
ORDER BY customer_id, order_date;

LAG returns a value from an earlier row in the same window. The window PARTITION BY customer_id ORDER BY order_date restarts the numbering for every customer and walks their orders from oldest to newest, so LAG(order_date) is the date of the immediately preceding order for that same customer. Because each partition is independent, the previous value never leaks in from a different customer.

For the earliest order in each partition there is no preceding row, so LAG returns NULL. That is exactly what we want for a customer's first order. Feeding that NULL into DATEDIFF as the start date makes days_since_prev come out NULL too, so the first-order rows fall out correctly without a special CASE. For customer 101 the gaps are 15 days (Jan 05 to Jan 20) and 21 days (Jan 20 to Feb 10); for customer 102 the single gap is 51 days (Jan 10 to Mar 01); customer 103 has only one order, so it is all NULL.

A common mistake is to write a self join such as DATEDIFF(day, prev.order_date, cur.order_date) matched on prev.order_date < cur.order_date. That pairs each order with every earlier order, not just the immediately previous one, so you then have to filter down to the max earlier date, which is slower and easy to get wrong. LAG expresses "the row just before this one" directly. Also note the arguments order: DATEDIFF(day, start, end) counts start to end, so passing the previous date first and the current date second yields a positive number of days.

Keep practising

Work through more Hard exercises, or test yourself with the SQL interview questions.