Example tutorial of SQL deduplication
2022-07-05 17:14:00 【1024 questions】
1 SQL deduplication
2 distinct
3 group by
1. Query the deduplicated data by name (for each name, keep the row with the largest id)
2. Delete rows with duplicate names (for each name, keep the row with the largest id)
4 summary
1 SQL deduplication
To remove completely identical rows in SQL, use the DISTINCT keyword. To deduplicate on an arbitrary field, use GROUP BY. The examples below use a sample table named stu with id and name columns.
2 distinct
When two records are identical, the DISTINCT keyword removes the duplicate.
When applied to a single field, it deduplicates exactly on that field;
When applied to multiple fields, rows are deduplicated only if all of those fields are identical;
The DISTINCT keyword takes effect only when it comes first in the select list, immediately after SELECT.
It is typically used to return the number of non-duplicate records (after removing the duplicated test entries, 6 rows remain).
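The rules above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, and the rows and the age column are hypothetical, since the article's original table is not reproduced here.

```python
import sqlite3

# Hypothetical sample data (the age column is an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO stu (id, name, age) VALUES (?, ?, ?)",
    [(1, "test", 20), (2, "test", 20), (3, "test", 21), (4, "amy", 20)],
)

# Single field: one row per distinct name.
distinct_names = conn.execute(
    "SELECT DISTINCT name FROM stu ORDER BY name"
).fetchall()

# Multiple fields: rows collapse only when name AND age are both identical.
distinct_pairs = conn.execute(
    "SELECT DISTINCT name, age FROM stu ORDER BY name, age"
).fetchall()

# Counting the number of non-duplicate values.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT name) FROM stu"
).fetchone()[0]

print(distinct_names)
print(distinct_pairs)
print(distinct_count)
```

With this data, the single-field query collapses the three test rows into one, while the two-field query keeps (test, 20) and (test, 21) as separate rows.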
3 group by
1. Query the deduplicated data by name (for each name, keep the row with the largest id)
SELECT * FROM stu WHERE id IN (SELECT MAX(id) FROM stu GROUP BY `name`)
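A minimal sketch of this query, assuming Python's built-in sqlite3 module in place of MySQL and a handful of hypothetical rows:

```python
import sqlite3

# Hypothetical sample data: "test" and "bob" each appear twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")],
)

# The inner query yields the largest id per name; the outer query
# returns the full rows for exactly those ids.
latest_rows = conn.execute(
    """
    SELECT * FROM stu
    WHERE id IN (SELECT MAX(id) FROM stu GROUP BY name)
    ORDER BY id
    """
).fetchall()

print(latest_rows)
```

Each name appears once in the result, represented by its row with the largest id. The table itself is not modified.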
2. Delete rows with duplicate names (for each name, keep the row with the largest id)
Use group by + count + max to remove the duplicate data.
1) SELECT * FROM stu
2) Adding GROUP BY collapses duplicate values into one row per group.
3) Find the values of the condition column (name) that occur more than once, i.e. the duplicated data
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1
# Rows whose name occurs more than once
SELECT * FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
4) View the ids of a field's duplicated data
SELECT MAX(id) AS id, COUNT(*) FROM stu GROUP BY `name` HAVING COUNT(*) > 1 ORDER BY `name` DESC
5) Query all duplicate data
SELECT * FROM stu WHERE NAME IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
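The two lookups above (duplicated names, then every row carrying one of those names) can be sketched as follows. This assumes Python's built-in sqlite3 module instead of MySQL, with hypothetical rows.

```python
import sqlite3

# Hypothetical sample data: "test" and "bob" are duplicated, "amy" is not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")],
)

# Names that occur more than once.
dup_names = conn.execute(
    "SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1 ORDER BY name"
).fetchall()

# Every row whose name is one of the duplicated names.
dup_rows = conn.execute(
    """
    SELECT * FROM stu
    WHERE name IN (SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1)
    ORDER BY id
    """
).fetchall()

print(dup_names)
print(dup_rows)
```

Running a read-only check like this before any DELETE is a cheap way to confirm which rows are about to be affected.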
6) Deduplicate
You can use DISTINCT to deduplicate (it returns the non-duplicate user names).
Delete the redundant duplicate records (by name), keeping only the record with the largest id.
DELETE FROM stu WHERE id NOT IN ( SELECT a.id FROM ( SELECT MAX( id ) AS id FROM stu GROUP BY `name` )a )
or
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) e) AND id NOT IN (SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) t)
# The queries above list the duplicated rows first, so there is no need to check for the minimum value
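As a rough illustration of the keep-the-largest-id delete, the sketch below runs the derived-table form against hypothetical rows. It assumes Python's built-in sqlite3 module; SQLite does not need the extra wrapper that MySQL requires, but it accepts the same statement.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")],
)

# The inner SELECT is wrapped in a derived table (alias a), mirroring
# the MySQL-safe form; rows whose id is not a per-name maximum are deleted.
cursor = conn.execute(
    """
    DELETE FROM stu
    WHERE id NOT IN (
        SELECT a.id FROM (SELECT MAX(id) AS id FROM stu GROUP BY name) a
    )
    """
)
deleted_count = cursor.rowcount

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()

print(deleted_count)
print(remaining)
```

One row per name survives, and rows with unique names are untouched because their only id is also their maximum.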
Incorrect deletion
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) AND id NOT IN (SELECT MAX(id) FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
Reason: data queried directly from a table cannot be used as the condition for deleting from that same table. First put the query result into a new temporary table, then use that temporary table as the delete condition.
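The text can be read as describing either a derived table inside the statement or an explicit temporary table. The sketch below takes the second reading, as an assumption: it stores the ids to keep in a temporary table (the name keep_ids is hypothetical) and deletes against it. It uses Python's built-in sqlite3 module with hypothetical rows.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")],
)

# Step 1: materialize the ids to keep (largest id per name) in a temp table.
conn.execute(
    "CREATE TEMPORARY TABLE keep_ids AS "
    "SELECT MAX(id) AS id FROM stu GROUP BY name"
)

# Step 2: the DELETE now reads only from keep_ids, not from stu itself.
conn.execute("DELETE FROM stu WHERE id NOT IN (SELECT id FROM keep_ids)")

# Step 3: the temp table is no longer needed.
conn.execute("DROP TABLE keep_ids")

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
print(remaining)
```

Because the delete condition no longer references stu, this form sidesteps the restriction regardless of how the database treats subqueries on the target table.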
4 summary
1) Name records after deduplication
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 0
2) All names that have duplicate records
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1
3) Delete all duplicate records
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1)
MySQL does not allow a table to be queried in a subquery while rows are being deleted from it (error 1093: You can't specify target table 'stu' for update in FROM clause). This problem appears only in MySQL; Oracle does not have it. The fix is to wrap the query result in an intermediate table, so that the executor treats the data being checked as not coming from the table being deleted.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a)
All rows with duplicated names are deleted; only one row is left.
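To see what "delete all duplicate records" does, here is a small sketch assuming Python's built-in sqlite3 module and hypothetical rows. SQLite does not raise MySQL's error for the unwrapped form, so only the wrapped statement is shown.

```python
import sqlite3

# Hypothetical sample data: only "amy" has a unique name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")],
)

# Every row whose name occurs more than once is removed,
# including the copy one would normally want to keep.
conn.execute(
    """
    DELETE FROM stu
    WHERE name IN (
        SELECT a.name
        FROM (SELECT name FROM stu GROUP BY name HAVING COUNT(*) > 1) a
    )
    """
)

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
print(remaining)
```

Note that this removes every copy of a duplicated name, which is rarely the desired outcome and is why the next step adds a condition to keep one row per name.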
4) Deleting all duplicated data now works. Next, consider how to keep the row with the smallest id in each group of duplicates: when deleting, only delete the records whose id is not the smallest id among the duplicates.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a) AND id NOT IN (SELECT b.id FROM (SELECT MIN(id) id FROM stu GROUP BY `name` HAVING COUNT(*)>1) b);
There is a simpler way: compute the ids of all rows that remain after deduplication (keeping the minimum id for each name), then delete the rows whose id is not in that set.
DELETE FROM stu WHERE id NOT IN (SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY `name`)t)
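The longer two-condition statement and the shorter single-condition one should leave the same rows behind. The sketch below checks that on hypothetical data, assuming Python's built-in sqlite3 module in place of MySQL; run_delete is a helper introduced only for this illustration.

```python
import sqlite3

# Hypothetical sample data.
ROWS = [(1, "test"), (2, "test"), (3, "amy"), (4, "bob"), (5, "bob")]

# Two conditions: name is duplicated AND id is not the smallest duplicate id.
LONG_FORM = """
DELETE FROM stu
WHERE name IN (
    SELECT a.name
    FROM (SELECT name FROM stu GROUP BY name HAVING COUNT(*) > 1) a
)
AND id NOT IN (
    SELECT b.id
    FROM (SELECT MIN(id) AS id FROM stu GROUP BY name HAVING COUNT(*) > 1) b
)
"""

# One condition: id is not the smallest id of its name.
SHORT_FORM = """
DELETE FROM stu
WHERE id NOT IN (
    SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name) t
)
"""


def run_delete(statement):
    """Build a fresh table, run one DELETE, and return the surviving rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO stu (id, name) VALUES (?, ?)", ROWS)
    conn.execute(statement)
    remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
    conn.close()
    return remaining


long_result = run_delete(LONG_FORM)
short_result = run_delete(SHORT_FORM)

print(long_result)
print(short_result)
```

The short form works because a name that occurs only once has its single id as its minimum, so the extra "name is duplicated" filter is not needed.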