Delete duplicate rows keeping one

Question

The Contacts table has accumulated duplicate rows: the same name and email pair was inserted several times, each time getting a new id. Write a statement that deletes the duplicates so that exactly one row survives for each distinct (name, email) pair. Keep the copy with the smallest id and remove the rest.The result shown below is the contents of the table after the delete, ordered by id.

Accepted Answer

BEGIN TRANSACTION;

WITH ranked AS (
    SELECT
        id,
        ROW_NUMBER() OVER (
            PARTITION BY name, email
            ORDER BY id
        ) AS rn
    FROM Contacts
)
DELETE FROM ranked
WHERE rn > 1;

COMMIT TRANSACTION;

-- Inspect the survivors
SELECT id, name, email
FROM Contacts
ORDER BY id;

The core idea is to give every row a position within its own duplicate group. PARTITION BY name, email starts the numbering over for each distinct pair, and ORDER BY id decides which copy is number 1. Because ids are ascending, the smallest id in each group always receives rn = 1 and every later copy gets 2, 3, and so on. See ROW_NUMBER() for how the ranking is built. In SQL Server you can DELETE directly from the common table expression: DELETE FROM ranked WHERE rn > 1 removes exactly the rows the CTE tagged as extras, and the change flows back to the base Contacts table. With the sample data, ids 3 and 6 (Alice), 5 (Bob), and 7 (Cara) are deleted, leaving ids 1, 2, and 4. Choosing which copy to keep is a real decision, not just a detail. Here we keep the smallest id (the oldest record); flip to ORDER BY id DESC to keep the newest, or order by any other column if that is your tie breaker. Whatever you pick, wrap the operation in a transaction as shown so you can run the SELECT first, confirm the row count, and ROLLBACK instead of COMMIT if the numbers look wrong. A delete without a defined tie breaker or a safety net is the classic way to lose the wrong rows. Before deleting, it is worth confirming the duplicates exist the way you expect - see the scenario walkthrough for the matching "find the duplicates" query. Once the table is clean, a UNIQUE constraint on (name, email) stops the problem from coming back.

id	name	email
1	Alice Wong	alice@example.com
2	Bob Diaz	bob@example.com
3	Alice Wong	alice@example.com
4	Cara Lee	cara@example.com
5	Bob Diaz	bob@example.com
6	Alice Wong	alice@example.com
7	Cara Lee	cara@example.com

id	name	email
1	Alice Wong	alice@example.com
2	Bob Diaz	bob@example.com
4	Cara Lee	cara@example.com

Delete duplicate rows keeping one

Schema and sample data

Expected result

Keep practising