Example tutorial of SQL deduplication
2022-07-05 17:14:00 【1024 questions】
1 SQL duplicate removal
2 distinct
3 group by
1. Query the deduplicated data by name (for duplicate names, keep the row with the largest id)
2. Delete data with duplicate names (for duplicate names, keep the row with the largest id)
4 summary
1 SQL duplicate removal
To remove fully identical rows in SQL, use the DISTINCT keyword. To deduplicate on any chosen field, use GROUP BY. The following data table is used as an example.
2 distinct
When the table contains two identical records, the DISTINCT keyword removes the duplicate.
Applied to a single field, DISTINCT deduplicates exactly on that field;
Applied to multiple fields, rows count as duplicates only when all of those fields are identical;
DISTINCT takes effect only when it comes first in the statement, directly after SELECT.



DISTINCT is generally used to return the number of non-duplicate records (after removing the duplicate test entries, 6 records remain).
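The DISTINCT rules above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the rows and the `age` column are hypothetical (the article's original table is not reproduced here), and SQLite stands in for MySQL.

```python
import sqlite3

# Hypothetical sample data; the `age` column exists only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO stu (id, name, age) VALUES (?, ?, ?)",
    [(1, "test", 20), (2, "test", 20), (3, "test", 21), (4, "amy", 20)],
)

# DISTINCT on a single column: one row per distinct name.
names = conn.execute("SELECT DISTINCT name FROM stu ORDER BY name").fetchall()

# DISTINCT on several columns: rows collapse only when every listed column matches.
pairs = conn.execute(
    "SELECT DISTINCT name, age FROM stu ORDER BY name, age"
).fetchall()

# COUNT(DISTINCT ...) returns the number of non-duplicate values.
distinct_count = conn.execute("SELECT COUNT(DISTINCT name) FROM stu").fetchone()[0]

print(names)           # [('amy',), ('test',)]
print(pairs)           # [('amy', 20), ('test', 20), ('test', 21)]
print(distinct_count)  # 2
```

Note that `('test', 20)` appears once even though two rows carry it, while `('test', 21)` survives because one of its fields differs.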

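3 group by
Query the deduplicated data by name (for duplicate names, keep the row with the largest id):
SELECT * FROM stu WHERE id IN (SELECT MAX(id) FROM stu GROUP BY `name`)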
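The query above can be tried with a minimal sketch like the following. The five rows are hypothetical sample data, and SQLite (through Python's sqlite3 module) is assumed in place of MySQL.

```python
import sqlite3

# Hypothetical sample data: "test" and "amy" each appear twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy")],
)

# For each name, keep only the row whose id is the largest in its group.
rows = conn.execute(
    "SELECT * FROM stu "
    "WHERE id IN (SELECT MAX(id) FROM stu GROUP BY name) "
    "ORDER BY id"
).fetchall()

print(rows)  # [(3, 'test'), (4, 'bob'), (5, 'amy')]
```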
Removing duplicate data with GROUP BY + COUNT + MAX
1) SELECT * FROM stu

2) After adding GROUP BY, duplicate data is removed: rows with the same value are collapsed into one row per group

3) Find duplicate data: names (name) whose count is greater than 1
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1
# All rows whose name occurs more than once
SELECT * FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
4) View the ids of the duplicate data for a field
SELECT MAX(id) AS id, COUNT(*) FROM stu GROUP BY `name` HAVING COUNT(*) > 1 ORDER BY `name` DESC
5) Query all duplicate data
SELECT * FROM stu WHERE NAME IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
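The duplicate-finding queries in steps 3) and 5) can be sketched as follows, again assuming hypothetical sample rows and SQLite via Python's sqlite3 module rather than MySQL.

```python
import sqlite3

# Hypothetical sample data: "test" and "amy" are duplicated, "bob" is not.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy")],
)

# Names that occur more than once.
dup_names = conn.execute(
    "SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1 ORDER BY name"
).fetchall()

# Every row that carries one of those names.
dup_rows = conn.execute(
    "SELECT * FROM stu WHERE name IN "
    "(SELECT name FROM stu GROUP BY name HAVING COUNT(name) > 1) "
    "ORDER BY id"
).fetchall()

print(dup_names)  # [('amy',), ('test',)]
print(dup_rows)   # [(1, 'test'), (2, 'amy'), (3, 'test'), (5, 'amy')]
```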
6) Duplicate removal
DISTINCT can be used for deduplication (it returns the non-duplicate user names).
Delete the redundant records with duplicate names (name), keeping only the record with the largest id.
DELETE FROM stu WHERE id NOT IN (SELECT a.id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name`) a)
or
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) e) AND id NOT IN (SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) t)
# The queries for duplicate data show the first few rows, so there is no need to check for the minimum value
Incorrect deletion:
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
AND id NOT IN (SELECT MAX(id) FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
Reason: data queried from the table cannot be used directly as the condition for deleting from that same table. First wrap the query result in a new temporary (derived) table, then use that temporary table as the delete condition.
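A minimal sketch of the derived-table delete that keeps the largest id, assuming hypothetical sample rows. SQLite, used here through Python's sqlite3 module, does not enforce MySQL's restriction, so the sketch only checks what the statement deletes, not the MySQL error itself.

```python
import sqlite3

# Hypothetical sample data: "test" and "amy" each appear twice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy")],
)

# The inner SELECT is wrapped in a derived table (alias a), which is the form
# MySQL needs when the subquery reads the table being deleted from.
cur = conn.execute(
    "DELETE FROM stu WHERE id NOT IN "
    "(SELECT a.id FROM (SELECT MAX(id) AS id FROM stu GROUP BY name) a)"
)
deleted = cur.rowcount  # number of rows removed by the DELETE

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()

print(deleted)    # 2
print(remaining)  # [(3, 'test'), (4, 'bob'), (5, 'amy')]
```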
4 summary
1) Name records after deduplication
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 0
2) All records with duplicate names
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1
3) Delete all duplicate records
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1)

MySQL does not allow a subquery to read from the same table that is being deleted from (error 1093: You can't specify target table 'stu' for update in FROM clause). This problem appears only in MySQL; Oracle does not have it. How can it be solved? Wrap the query result in an intermediate (derived) table, so that the executor considers the data being queried not to come from the table being deleted.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1) a)
All duplicate data is deleted, and only one record is left.
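The effect of this statement can be checked with a small sketch. The sample rows are hypothetical and chosen so that exactly one name is unique; SQLite via Python's sqlite3 module is assumed in place of MySQL.

```python
import sqlite3

# Hypothetical sample data: only "bob" has no duplicate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy")],
)

# Delete every row whose name is duplicated; no copy of those names is kept.
conn.execute(
    "DELETE FROM stu WHERE name IN "
    "(SELECT a.name FROM "
    "(SELECT name FROM stu GROUP BY name HAVING COUNT(*) > 1) a)"
)

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()

print(remaining)  # [(4, 'bob')]
```

This form removes all copies of a duplicated name, which is why the later variants add an `id NOT IN (...)` condition to keep one row per name.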

4) Deleting all duplicate data now works. Next, consider how to keep the record with the smallest id among each group of duplicates. When deleting, it is enough to require that the record's id is not the smallest id in its group of duplicates.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a) AND id NOT IN (SELECT b.id FROM (SELECT MIN(id) id FROM stu GROUP BY `name` HAVING COUNT(*)>1) b);
There is a simpler way: compute the ids of all rows that remain after deduplication (keeping the minimum id), then delete the rows whose id is not in that set.
DELETE FROM stu WHERE id NOT IN (SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY `name`) t)
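As a rough check, the two keep-the-smallest-id statements can be run against identical copies of a table and compared. The sample rows and the helper `remaining_after` are hypothetical, and SQLite via Python's sqlite3 module is assumed in place of MySQL.

```python
import sqlite3

# Hypothetical sample data: "test" and "amy" each appear twice.
ROWS = [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy")]


def remaining_after(delete_sql):
    """Run delete_sql against a fresh copy of the sample table; return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO stu (id, name) VALUES (?, ?)", ROWS)
    conn.execute(delete_sql)
    rows = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
    conn.close()
    return rows


# Longer form: restrict to duplicated names, then spare the smallest id of each.
long_form = (
    "DELETE FROM stu WHERE name IN "
    "(SELECT a.name FROM (SELECT name FROM stu GROUP BY name HAVING COUNT(*) > 1) a) "
    "AND id NOT IN "
    "(SELECT b.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name HAVING COUNT(*) > 1) b)"
)

# Shorter form: delete every row whose id is not the smallest id for its name.
short_form = (
    "DELETE FROM stu WHERE id NOT IN "
    "(SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name) t)"
)

kept_long = remaining_after(long_form)
kept_short = remaining_after(short_form)

print(kept_long)   # [(1, 'test'), (2, 'amy'), (4, 'bob')]
print(kept_short)  # [(1, 'test'), (2, 'amy'), (4, 'bob')]
```

On this data both statements leave the same rows, because a name that occurs only once is its own minimum id and is therefore never deleted by the shorter form.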