SQL advanced -> Identify and Remove Duplicate records

In the following example we will identify duplicate records in the Clients table and remove them. I'm taking for granted that Customer record has it's native Id (CustomerID) and our task is purely technical.

The source dataset looks like this:
CustomerIDCustomerNamephone
3Barbara Spears64-396583
4Pestiana Oliviera000-523147
4Pestiana Oliviera000-523147
7Eva Longoria507-509875
10Erika Nass354-257413
11Somerset Dogan64-359734
14Antonella Musentah507-501478
14Antonella Musentah507-501478
.........


CustomerID that appear more than once designates the duplicate record:
SELECT CustomerID, count(*) AS cn
FROM Clients
GROUP BY CustomerID HAVING count(*) > 1;

CustomerIDcn
42
142

Now we know that records with CustomerID equal to 4 or 14 are duplicate. In the next step we have to chose only one record representing CustomerID, that means make it unique. To do this we will numerate entire table with aditional Autoincrement Key. And after this will chose minimal numerator valuie per each CustomerID as a candidate to remain in dataset. Others (not equal to Minimum) will be deleted.

ALTER TABLE Clients ADD idno Autoincrement;

CREATE UNIQUE INDEX idx1_clients_idno ON Clients (idno);

Following query creates list of minimal id numbers among duplicates and saves it as a table tmp_Min_Idno
SELECT MIN(a.idno) AS min_id, a.CustomerID
  INTO tmp_Min_Idno
FROM

  (SELECT idno, CustomerID FROM Clients) AS a
    INNER JOIN
  (SELECT CustomerID, count(*) AS cn
   FROM Clients GROUP BY CustomerID
   HAVING count(*) > 1) AS b

ON a.CustomerID=b.CustomerID

GROUP BY a.CustomerID;

Delete command can look like this:
DELETE a.*
FROM Clients AS a INNER JOIN tmp_Min_Idno AS b ON a.CustomerID = b.CustomerID
WHERE a.idno <> b.min_id;

But cause not every version of Desktop DBMS supports Delete Joins, we can achive the same goal by making some additional work:
ALTER TABLE Clients ADD remove_this INT;

UPDATE Clients SET remove_this = 0;

UPDATE Clients AS a INNER JOIN tmp_Min_Idno AS b
ON a.CustomerID=b.CustomerID
SET a.remove_this = 1
WHERE a.idno <> b.min_id;

DELETE * FROM Clients WHERE remove_this = 1;

Finally drop temporary fields:
ALTER TABLE Clients DROP COLUMN remove_this;

ALTER TABLE Clients DROP COLUMN idno;



sqlexamples.info