Example tutorial of SQL deduplication

2022-07-05 17:14:00 1024 questions

Contents

1 SQL deduplication

2 distinct

3 group by

1. Query deduplicated rows by name (keep the row with the largest id for each name)

2. Delete rows with duplicate names (keep the row with the largest id for each name)

4 Summary

1 SQL deduplication

To remove fully identical rows in SQL, use the distinct keyword. To deduplicate on an arbitrary field, use group by. The examples below use a table named stu that has at least an id column and a name column.

2 distinct

When two records are identical, the distinct keyword removes the duplicate.

Applied to a single field, distinct deduplicates on exactly that field;

Applied to multiple fields, rows are deduplicated only when all of those fields are identical;

The distinct keyword only takes effect when it is placed first in the select list, before all of the selected fields.

It is commonly used to return the number of non-duplicate records (with the duplicate test rows removed, 6 rows remain).
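The three behaviours of distinct described above can be sketched as follows. This is a minimal illustration, not the article's original data: the sample rows are hypothetical, and an in-memory SQLite database (Python's standard sqlite3 module) stands in for MySQL so that the snippet runs without a server.

```python
import sqlite3

# Hypothetical sample rows: the ids and names below are illustrative only.
# SQLite (in-memory) stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "alice"), (4, "bob"), (5, "bob")],
)

# DISTINCT on a single field: each name appears once.
distinct_names = [
    row[0] for row in conn.execute("SELECT DISTINCT name FROM stu ORDER BY name")
]

# DISTINCT on several fields: a row is dropped only if ALL listed fields match.
# Every id is unique here, so no row is removed.
distinct_pairs = conn.execute("SELECT DISTINCT id, name FROM stu").fetchall()

# Counting the non-duplicate values of a field.
distinct_count = conn.execute("SELECT COUNT(DISTINCT name) FROM stu").fetchone()[0]

print(distinct_names)       # ['alice', 'bob', 'test']
print(len(distinct_pairs))  # 5
print(distinct_count)       # 3
```

Because id is unique in this sample, adding it to the distinct list disables the deduplication entirely, which is why group by is needed when whole rows must be kept.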

3 group by

1. Query deduplicated rows by name (keep the row with the largest id for each name)

SELECT * FROM stu WHERE id IN (SELECT MAX(id) FROM stu GROUP BY `name`)
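A small runnable sketch of the MAX(id) query may help show what it returns. The sample rows and the helper names make_stu and latest_row_per_name are hypothetical, and SQLite is used in place of MySQL purely for convenience.

```python
import sqlite3

def make_stu():
    """Build an in-memory stu table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO stu (id, name) VALUES (?, ?)",
        [(1, "test"), (2, "test"), (3, "alice"), (4, "bob"), (5, "bob")],
    )
    return conn

def latest_row_per_name(conn):
    """Return one row per name: the row holding the largest id."""
    sql = (
        "SELECT id, name FROM stu "
        "WHERE id IN (SELECT MAX(id) FROM stu GROUP BY name) "
        "ORDER BY id"
    )
    return conn.execute(sql).fetchall()

rows = latest_row_per_name(make_stu())
print(rows)  # [(2, 'test'), (3, 'alice'), (5, 'bob')]
```

The inner query yields one id per name, and the outer query uses those ids to fetch complete rows, so no non-aggregated column is selected alongside group by.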

2. Delete rows with duplicate names (keep the row with the largest id for each name)

Use group by + count + max to remove duplicate data

1) SELECT * FROM stu

2) Adding group by on name collapses rows with the same name into a single group, which removes the duplicates from the result

3) Find the duplicates: names whose row count is greater than 1

SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1

# Rows whose name occurs more than once
SELECT * FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)

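4) View one id (here the largest) and the row count for each duplicated name. Selecting a bare id together with group by `name` is rejected when ONLY_FULL_GROUP_BY is enabled, and the GROUP BY ... DESC syntax is no longer accepted by MySQL 8.0, so the query aggregates id and sorts with order by instead.

SELECT MAX(id) AS id, COUNT(*) AS cnt FROM stu GROUP BY `name` HAVING COUNT(*) > 1 ORDER BY `name` DESC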

5) Query all duplicate data

SELECT * FROM stu WHERE NAME IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
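The duplicate-finding queries from these steps can be tried on a throwaway table. The sketch below assumes hypothetical sample rows and uses SQLite through Python's sqlite3 module instead of MySQL.

```python
import sqlite3

# Hypothetical sample rows; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "alice"), (4, "bob"), (5, "bob")],
)

# Names that occur more than once.
dup_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1 ORDER BY name"
    )
]

# How many rows each duplicated name has.
dup_counts = conn.execute(
    "SELECT name, COUNT(*) FROM stu GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

# Every row that belongs to a duplicated name.
dup_rows = conn.execute(
    "SELECT id, name FROM stu WHERE name IN "
    "(SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1) "
    "ORDER BY id"
).fetchall()

print(dup_names)   # ['bob', 'test']
print(dup_counts)  # [('bob', 2), ('test', 2)]
print(dup_rows)    # [(1, 'test'), (2, 'test'), (4, 'bob'), (5, 'bob')]
```

Running the select form first is a cheap way to preview exactly which rows a later delete with the same condition would touch.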

6) Deduplicate

distinct can be used for deduplication (it returns the non-duplicate names)

Delete the redundant records with duplicate names, keeping only the record with the largest id.

DELETE FROM stu WHERE id NOT IN ( SELECT a.id FROM ( SELECT MAX( id ) AS id FROM stu GROUP BY `name` )a )

Or

# The duplicate query lists the earliest rows of each name first, so there is no need to query for the minimum id
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) e) AND id NOT IN (SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) t)
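To check that the two delete statements above leave the same rows behind, they can be run against identical copies of a small table. This is a sketch under assumptions: the sample rows and the helper remaining_after are hypothetical, and SQLite replaces MySQL. SQLite does not need the derived-table wrapper; it is kept here only to mirror the MySQL form.

```python
import sqlite3

SAMPLE = [(1, "test"), (2, "test"), (3, "alice"), (4, "bob"), (5, "bob")]

def remaining_after(delete_sql):
    """Run delete_sql on a fresh copy of the hypothetical table; return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO stu (id, name) VALUES (?, ?)", SAMPLE)
    conn.execute(delete_sql)
    return conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()

# Short form: delete every row that is not the largest id of its name.
short_form = remaining_after(
    "DELETE FROM stu WHERE id NOT IN "
    "(SELECT a.id FROM (SELECT MAX(id) AS id FROM stu GROUP BY name) a)"
)

# Long form: only touch duplicated names, and spare the largest id of each.
long_form = remaining_after(
    "DELETE FROM stu WHERE name IN "
    "(SELECT name FROM (SELECT name FROM stu GROUP BY name "
    "HAVING COUNT(name) > 1) e) "
    "AND id NOT IN "
    "(SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY name "
    "HAVING COUNT(name) > 1) t)"
)

print(short_form)  # [(2, 'test'), (3, 'alice'), (5, 'bob')]
print(long_form)   # [(2, 'test'), (3, 'alice'), (5, 'bob')]
```

The short form is easier to read; the long form states explicitly that rows with unique names are never candidates for deletion.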

Incorrect deletion

DELETE FROM stu WHERE name IN (SELECT name FROM stu GROUP BY name HAVING COUNT(name)>1)
AND id NOT IN (SELECT MAX(id) FROM stu GROUP BY name HAVING COUNT(name)>1)

Reason: MySQL does not allow the result of a query on a table to be used directly as the condition for deleting from that same table. Wrap the query result in a derived (temporary) table first, then use that derived table as the delete condition.

4 Summary

1) Names after deduplication

SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 0

2) All names that occur more than once

SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1

3) Delete all duplicate records

DELETE FROM stu WHERE name IN
(SELECT name FROM stu GROUP BY name HAVING COUNT(*)>1)

This statement fails in MySQL, which does not allow a table to be queried in a subquery while rows are being deleted from it (error 1093: You can't specify target table 'stu' for update in FROM clause). The problem appears in MySQL; Oracle does not have it. The fix is to wrap the query result in an intermediate derived table, so that the executor treats the data being checked as not coming from the table being deleted.

DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a)

This deletes every row whose name is duplicated, including all copies of it; in the example data only one row is left.

4) Deleting all duplicate data now works. The next step is to keep the row with the smallest id for each duplicated name: when deleting, additionally require that the row's id is not the smallest id among the rows sharing its name.

DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a) AND id NOT IN (SELECT b.id FROM (SELECT MIN(id) id FROM stu GROUP BY `name` HAVING COUNT(*)>1) b);

A simpler approach: compute the ids to keep after deduplication (the minimum id for each name), then delete every row whose id is not in that set.

DELETE FROM stu WHERE id NOT IN (SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY `name`)t)
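The difference between deleting all duplicated rows and keeping the smallest id per name is easy to see on a small table. The sketch below is illustrative only: the sample rows and the helper remaining_after are hypothetical, and SQLite stands in for MySQL (it accepts the derived-table form but does not require it).

```python
import sqlite3

SAMPLE = [(1, "test"), (2, "test"), (3, "alice"), (4, "bob"), (5, "bob")]

def remaining_after(delete_sql):
    """Run delete_sql on a fresh copy of the hypothetical table; return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO stu (id, name) VALUES (?, ?)", SAMPLE)
    conn.execute(delete_sql)
    return conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()

# Step 3: delete every row of every duplicated name (no copy survives).
all_gone = remaining_after(
    "DELETE FROM stu WHERE name IN "
    "(SELECT a.name FROM (SELECT name FROM stu GROUP BY name "
    "HAVING COUNT(*) > 1) a)"
)

# Step 4: same, but spare the smallest id of each duplicated name.
keep_min_long = remaining_after(
    "DELETE FROM stu WHERE name IN "
    "(SELECT a.name FROM (SELECT name FROM stu GROUP BY name "
    "HAVING COUNT(*) > 1) a) "
    "AND id NOT IN "
    "(SELECT b.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name "
    "HAVING COUNT(*) > 1) b)"
)

# Simpler form: keep MIN(id) per name, delete everything else.
keep_min_short = remaining_after(
    "DELETE FROM stu WHERE id NOT IN "
    "(SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name) t)"
)

print(all_gone)        # [(3, 'alice')]
print(keep_min_long)   # [(1, 'test'), (3, 'alice'), (4, 'bob')]
print(keep_min_short)  # [(1, 'test'), (3, 'alice'), (4, 'bob')]
```

On a real table it is safer to run such a delete inside a transaction and inspect the remaining rows before committing.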

Original site

Copyright notice
This article was written by [1024 questions]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/186/202207051641540816.html