当前位置:网站首页>Example tutorial of SQL deduplication
Example tutorial of SQL deduplication
2022-07-05 17:14:00 【1024 questions】
1 SQL duplicate removal
2 distinct
3 group by
1. Query the data after de duplication according to the name ( With the same name id Worth a lot of money )
2. Delete data with the same name ( Keep the same name id Worth a lot of money )
4 summary
1 SQL duplicate removalSQL To remove identical data, you can use distinct keyword , Any field can be de duplicated with group by, Take the following data table as an example .
2 distinctThere are two identical records , With keywords distinct You can get rid of
De duplicate according to a single field , It can accurately remove the weight ;
When acting on multiple fields , Only if these fields are exactly the same , To go heavy ;
keyword distinct Only on the SQL The first one in the sentence , It works



It is generally used to return the number of non duplicate records , Returns the number of non duplicate entries ( Get rid of test Repetitive , That's all 6 strip )

SELECT * FROM stu WHERE id IN (SELECT MAX(id) FROM stu GROUP BY `name`)
group by + count + max Remove duplicate data
1)SELECT * FROM stu

2) add group by after , Will remove duplicate data

3) Conditions ( name ) Is the quantity greater than 1 Duplicate data
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1# If the quantity is greater than 1 Duplicate data SELECT * FROM stu WHERE `name` IN(SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`)>1 )
4) View the duplicate data of a field id
SELECT id, COUNT(*) FROM stu GROUP BY NAME DESC HAVING(COUNT(*) > 0)
5) Query all duplicate data
SELECT * FROM stu WHERE NAME IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
5) duplicate removal
have access to distinct duplicate removal ( Returns a non duplicate user name )
Delete redundant duplicate records (name), Only keep id The biggest record .
DELETE FROM stu WHERE id NOT IN ( SELECT a.id FROM ( SELECT MAX( id ) AS id FROM stu GROUP BY `name` )a )perhaps
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) e) AND id NOT IN (SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name` HAVING COUNT(`name`)>1) t) # Queries show duplicate data in the first few lines , Therefore, there is no need to query for a minimum Error deletion
DELETE FROM stu WHERE name IN (SELECT name FROM stu GROUP BY name HAVING COUNT(name)>1)
AND id NOT IN (SELECT MAX(id) FROM stu GROUP BY stu HAVING COUNT(name)>1)
as a result of : Do not directly investigate the data as a condition of data deletion , We should start by creating a new temporary table with the data we find , The temporary table is then dropped as a condition
4 summaryName record after de duplication
SELECT `name` FROM stu GROUP BY NAME HAVING(COUNT(*) > 0)2)
All records with duplicate names
SELECT `name` FROM stu GROUP BY NAME HAVING COUNT(*) > 13) Delete all duplicate records
DELETE FROM stu WHERE
nameIN
(SELECTnameFROM stu GROUP BYnameHAVING COUNT(*)>1)

You cannot query this table at the same time when deleting , This is the only problem MySQL It appears that ,oracle No, . How to solve ? We just need to add an intermediate table after finding out the results . Let the actuator think that the data we want to check is not from the table being deleted .
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a)All duplicate data is deleted , There is only one piece of data left

4) Now it's done to delete all duplicate data , Consider how to keep duplicate data id The smallest . You only need to delete the record when deleting id No duplicate data id The smallest one is OK .
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a) AND id NOT IN (SELECT b.id FROM (SELECT MIN(id) id FROM stu GROUP BY `name` HAVING COUNT(*)>1) b);
There are simple ways Calculate all the data after re calculation ( Keep the minimum ID), Then delete id Not in the array
DELETE FROM stu WHERE id NOT IN (SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY `name`)t)This is about SQL That's all for the article on deleting duplicate data , More about SQL To delete duplicate data content, please search the previous articles of software development network or continue to browse the relevant articles below. I hope you will support software development network more in the future !
边栏推荐
- Learnopongl notes (II) - Lighting
- 【性能测试】jmeter+Grafana+influxdb部署实战
- How can C TCP set heartbeat packets to be elegant?
- C# TCP如何设置心跳数据包,才显得优雅呢?
- Embedded-c Language-2
- 【剑指 Offer】63. 股票的最大利润
- Use byte stream to read Chinese from file to console display
- dried food! Semi supervised pre training dialogue model space
- 7.Scala类
- The second day of learning C language for Asian people
猜你喜欢

Embedded-c Language-1

Browser rendering principle and rearrangement and redrawing

【jmeter】jmeter脚本高级写法:接口自动化脚本内全部为变量,参数(参数可jenkins配置),函数等实现完整业务流测试

dried food! Semi supervised pre training dialogue model space

thinkphp3.2.3

npm安装

URP下Alpha从Gamma空间到Linner空间转换(二)——多Alpha贴图叠加

美国芯片傲不起来了,中国芯片成功在新兴领域夺得第一名

国内首家 EMQ 加入亚马逊云科技「初创加速-全球合作伙伴网络计划」

The two ways of domestic chip industry chain go hand in hand. ASML really panicked and increased cooperation on a large scale
随机推荐
兰空图床苹果快捷指令
[Jianzhi offer] 61 Shunzi in playing cards
Excuse me, is the redis syntax used in DMS based on the commands of the redis community version of the cloud database
网站页面禁止复制内容 JS代码
Embedded UC (UNIX System Advanced Programming) -1
ThoughtWorks global CTO: build the architecture according to needs, and excessive engineering will only "waste people and money"
基于Redis实现延时队列的优化方案小结
云安全日报220705:红帽PHP解释器发现执行任意代码漏洞,需要尽快升级
微信公众号网页授权登录实现起来如此简单
composer安装报错:No composer.lock file present.
[Jianzhi offer] 62 The last remaining number in the circle
Q2 encryption market investment and financing report in 2022: gamefi becomes an investment keyword
SQL删除重复数据的实例教程
Writing method of twig array merging
Read the basic grammar of C language in one article
Twig数组合并的写法
通过proc接口调试内核代码
C# TCP如何限制单个客户端的访问流量
Detailed explanation of printf() and scanf() functions of C language
激动人心!2022开放原子全球开源峰会报名火热开启!