MySQL: Delete All Duplicate Rows Except the Earliest One in One SQL Statement
You want to add a unique index to a table, but unfortunately it already contains many duplicate rows. Finding and deleting these rows by hand is time-consuming and error-prone. So why not write a single SQL statement and resolve it quickly?
On my first try, I wrote the following statement, which does not work:
DELETE FROM PromotionSkus A
WHERE
A.SkuId IN (SELECT SkuId FROM PromotionSkus B GROUP BY B.SkuId HAVING COUNT(B.SkuId) > 1)
AND
A.Id NOT IN (SELECT MIN(Id) FROM PromotionSkus C GROUP BY C.SkuId HAVING COUNT(C.SkuId) > 1);
And this one works:
DELETE FROM PromotionSkus A
WHERE
A.Id NOT IN (SELECT Id FROM (SELECT MIN(Id) AS Id, COUNT(SkuId) AS Total FROM PromotionSkus GROUP BY SkuId HAVING Total > 1) AS B)
AND
A.SkuId IN (SELECT SkuId FROM (SELECT SkuId FROM PromotionSkus GROUP BY SkuId HAVING COUNT(SkuId) > 1) AS C);
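The reason is well explained in this brilliant article. In short, MySQL does not allow a DELETE to select from its own target table in a subquery and rejects the first statement with error 1093 ("You can't specify target table ... for update in FROM clause"). Wrapping each subquery in a derived table (the extra SELECT ... FROM (...) AS B layer) makes MySQL materialize the result into a temporary table first, so the DELETE no longer reads directly from the table it is modifying.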
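If the nested derived tables feel heavy, a self-join with MySQL's multi-table DELETE syntax is another common way to get the same result. The following is only a minimal sketch, not the approach from the article above. It assumes Id is a unique, non-NULL key, that the table uses a transactional engine such as InnoDB, and that the index name ux_PromotionSkus_SkuId is a placeholder I made up for illustration.

```sql
-- Sketch under the assumptions stated above; try it on a copy of the table first.

-- 1. Preview the rows that would be removed: every row that has a
--    sibling with the same SkuId and a smaller Id.
SELECT DISTINCT A.*
FROM PromotionSkus A
JOIN PromotionSkus B
  ON A.SkuId = B.SkuId AND A.Id > B.Id;

-- 2. Delete those rows. Only the row with the smallest Id per SkuId survives.
START TRANSACTION;

DELETE A
FROM PromotionSkus A
JOIN PromotionSkus B
  ON A.SkuId = B.SkuId AND A.Id > B.Id;

-- 3. Sanity check: this should return no rows before you commit.
SELECT SkuId, COUNT(SkuId) AS Total
FROM PromotionSkus
GROUP BY SkuId
HAVING COUNT(SkuId) > 1;

COMMIT;

-- 4. Now the unique index can be added (index name is hypothetical).
ALTER TABLE PromotionSkus ADD UNIQUE INDEX ux_PromotionSkus_SkuId (SkuId);
```

As with the working statement above, rows whose SkuId is NULL are left alone, because NULL never compares equal in the join condition. If the check in step 3 returns anything unexpected, issue ROLLBACK instead of COMMIT.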
Another MySQL tip: use mysqldump to export a table with one row per line, that is, one INSERT statement per row.
mysqldump --databases YourDataBaseName --tables YourTableName --skip-extended-insert
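Why do we need that? By default mysqldump packs many rows into a single long INSERT statement. With one row per line, two dumps are much easier to compare using an ordinary diff tool.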