Example tutorial of SQL deduplication
2022-07-05 17:14:00 【1024 questions】
1 SQL deduplication
2 distinct
3 group by
1. Query the deduplicated data by name (for the same name, keep the row with the largest id)
2. Delete data with the same name (for the same name, keep the row with the largest id)
4 summary
1 SQL deduplication
To remove completely identical rows in SQL you can use the distinct keyword; to deduplicate on an arbitrary field you can use group by. Take the following data table as an example.
2 distinct
When there are two identical records, the distinct keyword removes the duplicate.
When applied to a single field, it deduplicates exactly on that field;
When applied to multiple fields, rows are deduplicated only if all of these fields are identical;
The distinct keyword only works when it is placed before the first field in the SQL statement;
It is generally used to return the number of non-duplicate records (after removing the repeated test rows, 6 rows remain).
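The distinct rules above can be sketched as follows. This is only an illustration: it runs the SQL through Python's built-in sqlite3 module rather than MySQL, and the stu columns (id, name, age) and all rows are made-up sample data, not the article's original table.

```python
import sqlite3

# In-memory SQLite database; the stu table and its rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO stu (id, name, age) VALUES (?, ?, ?)",
    [(1, "test", 20), (2, "test", 20), (3, "test", 21), (4, "amy", 22), (5, "bob", 22)],
)

# DISTINCT on a single field: one row per distinct name.
names = [row[0] for row in conn.execute("SELECT DISTINCT name FROM stu ORDER BY name")]

# DISTINCT on several fields: rows collapse only when name AND age are both equal.
pairs = conn.execute("SELECT DISTINCT name, age FROM stu ORDER BY name, age").fetchall()

# Counting the non-duplicate values of a field.
distinct_count = conn.execute("SELECT COUNT(DISTINCT name) FROM stu").fetchone()[0]

print(names)
print(pairs)
print(distinct_count)
```

Note that (test, 20) and (test, 21) both survive the two-field query, because distinct compares the whole selected row.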
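3 group by
1. Query the deduplicated data by name (for the same name, keep the row with the largest id)
SELECT * FROM stu WHERE id IN (SELECT MAX(id) FROM stu GROUP BY `name`)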
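As a rough check of what this query returns, the sketch below runs it against a small hypothetical stu table. It uses Python's sqlite3 module instead of MySQL, and the rows are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: "test" appears three times, "amy" twice, "bob" once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy"), (6, "test")],
)

# The inner query yields the largest id of every name group;
# the outer query fetches the full rows that carry those ids.
rows = conn.execute(
    "SELECT * FROM stu "
    "WHERE id IN (SELECT MAX(id) FROM stu GROUP BY name) "
    "ORDER BY id"
).fetchall()

print(rows)
```

Each name appears exactly once in the result, represented by its row with the largest id.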
Removing duplicate data with group by + count + max
1) SELECT * FROM stu
2) Adding group by afterwards removes the duplicate data
3) Find the values of the condition field (name) that occur more than once, i.e. the duplicate data
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1
# Rows whose name occurs more than once
SELECT * FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
4) View the ids of the duplicate data for a field
SELECT `name`, GROUP_CONCAT(id) AS ids, COUNT(*) AS cnt FROM stu GROUP BY `name` HAVING COUNT(*) > 1 ORDER BY `name` DESC
5) Query all duplicate data
SELECT * FROM stu WHERE NAME IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
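The duplicate-finding queries in the steps above can be tried out with a sketch like the following. It assumes a hypothetical stu table with invented rows and uses Python's sqlite3 module in place of MySQL.

```python
import sqlite3

# Hypothetical sample data: "test" appears three times, "amy" twice, "bob" once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy"), (6, "test")],
)

# Names that occur more than once, together with how often they occur.
dup_names = conn.execute(
    "SELECT name, COUNT(*) FROM stu GROUP BY name HAVING COUNT(*) > 1 ORDER BY name"
).fetchall()

# Every full row whose name is one of the duplicated names.
dup_rows = conn.execute(
    "SELECT * FROM stu "
    "WHERE name IN (SELECT name FROM stu GROUP BY name HAVING COUNT(*) > 1) "
    "ORDER BY id"
).fetchall()

print(dup_names)
print(dup_rows)
```

The row for "bob" is absent from both results because its name occurs only once.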
6) Deduplicate
You can use distinct to deduplicate (it returns the non-duplicate user names)
2. Delete data with the same name (for the same name, keep the row with the largest id)
Delete the redundant duplicate records (by name), keeping only the record with the largest id.
DELETE FROM stu WHERE id NOT IN ( SELECT a.id FROM ( SELECT MAX( id ) AS id FROM stu GROUP BY `name` )a )
or
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) e) AND id NOT IN (SELECT id FROM (SELECT MAX(id) AS id FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) t)
# The query shows the duplicate data in the first few rows, so there is no need to query for a minimum
Incorrect deletion
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1) AND id NOT IN (SELECT MAX(id) FROM stu GROUP BY `name` HAVING COUNT(`name`) > 1)
Reason: the result of a query on a table cannot be used directly as the condition for deleting from that same table. First put the queried data into a new temporary table, then use that temporary table as the condition for the delete.
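One way to read the temporary-table advice is sketched below: the ids to keep are first stored in a separate table, and the delete then refers only to that table. The table name keep_ids and the sample rows are hypothetical, and the sketch runs on SQLite through Python's sqlite3 module, which does not have MySQL's restriction, so it demonstrates only the pattern and not the error itself.

```python
import sqlite3

# Hypothetical sample data: "test" appears three times, "amy" twice, "bob" once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy"), (6, "test")],
)

# Step 1: store the largest id of each name in a temporary table (keep_ids is a made-up name).
conn.execute("CREATE TEMP TABLE keep_ids AS SELECT MAX(id) AS id FROM stu GROUP BY name")

# Step 2: delete every row whose id is not in the temporary table.
# The subquery reads keep_ids only, never stu itself.
conn.execute("DELETE FROM stu WHERE id NOT IN (SELECT id FROM keep_ids)")

# Step 3: the temporary table is no longer needed.
conn.execute("DROP TABLE keep_ids")

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
print(remaining)
```

The derived-table form used elsewhere in the article (wrapping the subquery in an aliased inner SELECT) achieves the same effect in a single statement.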
4 summary
1) Names after deduplication
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 0
2) All names that have duplicate records
SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1
3) Delete all duplicate records
DELETE FROM stu WHERE `name` IN (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*) > 1)
You cannot query a table while deleting from that same table. This problem appears only in MySQL; Oracle does not have it. How do we solve it? We only need to wrap the query result in an intermediate table, so that the executor treats the data we query as not coming from the table being deleted.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a)
All the duplicated data is deleted, leaving only one row.
4) Now that all duplicate data can be deleted, consider how to keep the row with the smallest id among the duplicates. When deleting, you only need to make sure that the id of the deleted record is not the smallest id among its duplicates.
DELETE FROM stu WHERE `name` IN (SELECT a.name FROM (SELECT `name` FROM stu GROUP BY `name` HAVING COUNT(*)>1) a) AND id NOT IN (SELECT b.id FROM (SELECT MIN(id) id FROM stu GROUP BY `name` HAVING COUNT(*)>1) b);
There is a simpler way: compute the ids of all the data that remains after deduplication (keeping the smallest id), then delete the rows whose id is not in that set.
DELETE FROM stu WHERE id NOT IN (SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY `name`)t)
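The effect of this simpler statement can be checked with a short sketch. The stu rows below are invented, and the statement is run on SQLite through Python's sqlite3 module rather than on MySQL; the derived table alias t is kept as in the statement above.

```python
import sqlite3

# Hypothetical sample data: "test" appears three times, "amy" twice, "bob" once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stu (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO stu (id, name) VALUES (?, ?)",
    [(1, "test"), (2, "amy"), (3, "test"), (4, "bob"), (5, "amy"), (6, "test")],
)

# The inner SELECT computes the smallest id per name; wrapping it in the
# derived table t is what MySQL needs to accept a subquery on the target table.
cur = conn.execute(
    "DELETE FROM stu WHERE id NOT IN "
    "(SELECT t.id FROM (SELECT MIN(id) AS id FROM stu GROUP BY name) t)"
)
deleted = cur.rowcount  # number of rows removed by the DELETE

remaining = conn.execute("SELECT id, name FROM stu ORDER BY id").fetchall()
print(deleted)
print(remaining)
```

Names that were never duplicated are untouched, because their only row is also the one with the smallest id.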