The magic of the SQL MERGE statement (detailed instructions)
2022-07-31 13:59:00 【DebugUsery】
The SQL MERGE statement is a device whose mystery is exceeded only by its power. A simple example shows its full power according to standard SQL. Imagine you have a production table for product prices and a staging table from which you want to load the latest prices. For once, I'm using the Db2 LuW MERGE syntax, because it is the most standards-compliant syntax (among the dialects we support in jOOQ):
DROP TABLE IF EXISTS prices;
DROP TABLE IF EXISTS staging;
CREATE TABLE prices (
product_id BIGINT NOT NULL PRIMARY KEY,
price DECIMAL(10, 2) NOT NULL,
price_date DATE NOT NULL,
update_count BIGINT NOT NULL
);
CREATE TABLE staging (
product_id BIGINT NOT NULL PRIMARY KEY,
price DECIMAL(10, 2) NOT NULL
);
DELETE FROM prices;
DELETE FROM staging;
INSERT INTO staging
VALUES (1, 100.00),
(2, 125.00),
(3, 150.00);
So, we have loaded a few records into our staging table, which we now want to merge into the prices table. We could just insert them, which is easy, but we will stage more prices later on, for example these:
DELETE FROM staging;
INSERT INTO staging
VALUES (1, 100.00),
(2, 99.00),
(4, 300.00);
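The results shown later in this article (old prices of 100.00, 125.00, and 150.00) suggest that the first staging set was copied into the prices table before the staging table was refilled. A minimal sketch of that initial load, assuming the prices table is still empty at that point:

```sql
-- Initial load, assuming prices is still empty and staging holds the first
-- three rows. Run this before the DELETE FROM staging shown above.
INSERT INTO prices (product_id, price, price_date, update_count)
SELECT product_id, price, CURRENT_DATE, 0
FROM staging;
```

A plain INSERT like this fails on the second load, because product IDs 1 and 2 already exist in the prices table. That is the situation MERGE is meant to handle.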
We want our logic to be this:
- All new IDs in the staging table should simply be inserted into the prices table.
- Existing IDs should be updated if and only if the price has changed. In that case, update_count should be incremented.
- Prices that no longer appear in the staging table should be deleted from the prices table. For the sake of the example, this implements a full sync rather than a delta sync. We could also add a "command" column that says whether a row should be updated or deleted, to implement a delta sync.
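The delta sync mentioned in the last bullet could be sketched as follows. The staging_delta table, its command column, and the values 'U' (insert or update) and 'D' (delete) are assumptions made for this illustration; they do not appear elsewhere in this article.

```sql
-- Hypothetical staging table for a delta sync. The price is nullable,
-- because a delete command does not need to carry a price.
CREATE TABLE staging_delta (
  product_id BIGINT         NOT NULL PRIMARY KEY,
  price      DECIMAL(10, 2),
  command    CHAR(1)        NOT NULL
);

MERGE INTO prices AS p
USING staging_delta AS s
ON (p.product_id = s.product_id)
-- Explicit delete command for an existing product
WHEN MATCHED AND s.command = 'D' THEN DELETE
-- Upsert command for an existing product whose price has changed
WHEN MATCHED AND s.command = 'U' AND p.price != s.price THEN UPDATE SET
  price = s.price,
  price_date = CURRENT_DATE,
  update_count = update_count + 1
-- Upsert command for a product that is not in the target table yet
WHEN NOT MATCHED AND s.command = 'U' THEN INSERT
  (product_id, price, price_date, update_count)
VALUES
  (s.product_id, s.price, CURRENT_DATE, 0);
```

With a delta sync, products that are absent from the staging table are left untouched. The rest of this article sticks to the full sync variant.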
So, for the full sync, this is the Db2 (and standards-compliant) MERGE statement that we are going to use for the job:
MERGE INTO prices AS p
USING (
SELECT COALESCE(p.product_id, s.product_id) AS product_id, s.price
FROM prices AS p
FULL JOIN staging AS s ON p.product_id = s.product_id
) AS s
ON (p.product_id = s.product_id)
WHEN MATCHED AND s.price IS NULL THEN DELETE
WHEN MATCHED AND p.price != s.price THEN UPDATE SET
price = s.price,
price_date = CURRENT_DATE,
update_count = update_count + 1
WHEN NOT MATCHED THEN INSERT
(product_id, price, price_date, update_count)
VALUES
(s.product_id, s.price, CURRENT_DATE, 0);
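Easy as that, right? If you haven't written many MERGE statements, this may not look so easy. If so, fear not. As with most of SQL, the scary part is the syntax (keywords, UPPER CASE, etc.). The underlying concepts are simpler than they seem at first. Let's go through it step by step. It has 4 parts:
1. The target table
Just like with INSERT statements, we define where we want to MERGE the data INTO. This is the easy part: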
MERGE INTO prices AS p
-- ...
2. The source table
The USING keyword wraps the source table that we want to merge. We could have placed the staging table here directly, but I wanted to enrich the source data with some additional data first. I'm using a FULL JOIN to produce the matching between old data (prices) and new data (staging). Suppose that, after populating the staging table for the second time but before running the MERGE statement, we run the USING clause on its own (with some minor modifications for illustration):
SELECT
COALESCE(p.product_id, s.product_id) AS product_id,
p.price AS old_price,
s.price AS new_price
FROM prices AS p
FULL JOIN staging AS s ON p.product_id = s.product_id
ORDER BY product_id
Then we would get this result:
PRODUCT_ID|OLD_PRICE|NEW_PRICE|
----------|---------|---------|
         1|   100.00|   100.00| <-- same price
         2|   125.00|    99.00| <-- updated price
         3|   150.00|         | <-- deleted price
         4|         |   300.00| <-- added price
Neat!
3. The ON clause
Next, we RIGHT JOIN the target and source tables using an ON clause, just like in an ordinary JOIN:
ON (p.product_id = s.product_id)
MERGE always uses RIGHT JOIN semantics, which is why I placed a FULL JOIN in the source table, i.e. in the USING clause. It is possible to write this differently and avoid accessing the prices table twice, but I wanted to show the full power of this statement. Note that SQL Server joins the source and target tables using a FULL JOIN, as I'll explain further down. The reason for the RIGHT JOIN is explained right away.
4. The WHEN clauses
Now comes the interesting part! Either there is a match between the two tables (target and source), as in the result of an INNER JOIN, or there is no such match, because the source table contains a row that is not matched by the target table (the RIGHT JOIN semantics). In our example, PRODUCT_ID IN (1, 2, 3) produces a match (contained in both source and target tables), whereas PRODUCT_ID = 4 does not produce a match (not contained in the target table yet). Marking the rows of our source data set accordingly:
PRODUCT_ID|OLD_PRICE|NEW_PRICE|
----------|---------|---------|
         1|   100.00|   100.00| <-- matched
         2|   125.00|    99.00| <-- matched
         3|   150.00|         | <-- matched
         4|         |   300.00| <-- not matched
The following is a list of matching instructions, which are executed in order of appearance for each row resulting from the previous RIGHT JOIN:
-- With my FULL JOIN, I've produced NULL price values
-- whenever a PRODUCT_ID is in the target table, but not
-- in the source table. These rows, we want to DELETE
WHEN MATCHED AND s.price IS NULL THEN DELETE
-- When there is a price change (and only then), we
-- want to update the price information in the target table.
WHEN MATCHED AND p.price != s.price THEN UPDATE SET
price = s.price,
price_date = CURRENT_DATE,
update_count = update_count + 1
-- Finally, when we don't have a match, i.e. a row is
-- in the source table, but not in the target table, then
-- we simply insert it into the target table.
WHEN NOT MATCHED THEN INSERT
(product_id, price, price_date, update_count)
VALUES
(s.product_id, s.price, CURRENT_DATE, 0);
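To preview what the statement will do, the same join can be run as a plain query. This is only a sketch: the merge_action column and its labels are made up for illustration, and the query assumes that the prices table still holds the first load while the staging table holds the second set of rows.

```sql
SELECT
  s.product_id,
  CASE
    -- No row in the target table: WHEN NOT MATCHED applies
    WHEN p.product_id IS NULL THEN 'INSERT'
    -- Matched, but the FULL JOIN produced no staging price
    WHEN s.price IS NULL THEN 'DELETE'
    -- Matched with a different price
    WHEN p.price != s.price THEN 'UPDATE'
    -- Matched, but no WHEN clause applies
    ELSE 'NO ACTION'
  END AS merge_action
FROM prices AS p
RIGHT JOIN (
  SELECT COALESCE(p.product_id, s.product_id) AS product_id, s.price
  FROM prices AS p
  FULL JOIN staging AS s ON p.product_id = s.product_id
) AS s ON p.product_id = s.product_id
ORDER BY s.product_id;
```

Under those assumptions, it should report NO ACTION for product 1, UPDATE for product 2, DELETE for product 3, and INSERT for product 4. The branches of the CASE expression are checked in order, just like the WHEN clauses of the MERGE statement.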
That wasn't too complicated, just a lot of keywords and syntax. After running this MERGE on the second set of data in the staging table, we get this result in the prices table:
PRODUCT_ID|PRICE |PRICE_DATE|UPDATE_COUNT|
----------|------|----------|------------|
         1|100.00|2020-04-09|           0|
         2| 99.00|2020-04-09|           1|
         4|300.00|2020-04-09|           0|
The way I have written this MERGE statement, it is idempotent: I can run it again on the same staging table contents and it will not modify any data in the prices table, because none of the WHEN clauses apply. Idempotence is not a property of MERGE itself; I just wrote my statement this way.
Dialect specifics
A few dialects support MERGE. Among the dialects supported by jOOQ 3.13, there are at least:
- Db2
- Derby
- Firebird
- H2
- HSQLDB
- Oracle
- SQL Server
- Sybase SQL Anywhere
- Teradata
- Vertica
Regrettably, for once, this list does not include PostgreSQL. But even the dialects in this list do not all agree on what MERGE really is. The SQL standard specifies 3 features, each of them optional:
- F312 MERGE statement
- F313 Enhanced MERGE statement
- F314 MERGE statement with DELETE branch
Rather than looking at the standard and what it requires, let's look at what the dialects offer, and how a feature can be emulated where it is missing.
AND clauses
You may have noticed that this article uses the syntax:
WHEN MATCHED AND <some predicate> THEN
It is also possible to specify:
WHEN NOT MATCHED AND <some predicate> THEN
Most dialects support these AND clauses, except Teradata (Oracle has a special syntax using WHERE for this, which I'll get to later).
The point of these clauses is to be able to have several WHEN MATCHED or WHEN NOT MATCHED clauses, in fact an arbitrary number of them. Unfortunately, not all dialects support this. Some dialects support only one clause of each type (INSERT, UPDATE, DELETE). Supporting several clauses is not strictly necessary, but it is much more convenient, as we'll see below.
These dialects do not support multiple WHEN MATCHED or WHEN NOT MATCHED clauses:
- HSQLDB
- Oracle
- SQL Server
- Teradata
If a dialect does not support AND, or does not support multiple WHEN MATCHED clauses, simply translate the clauses to CASE expressions. Instead of our previous WHEN clauses, we would get:
-- The DELETE clause doesn't make much sense without AND,
-- So there's not much we can do about this emulation in Teradata.
WHEN MATCHED AND s.price IS NULL THEN DELETE
-- Repeat the AND clause in every branch of the CASE
-- Expression where it applies
WHEN MATCHED THEN UPDATE SET
price = CASE
-- Update the price if the AND clause applies
WHEN p.price != s.price THEN s.price
-- Otherwise, leave it untouched
ELSE p.price
END,
-- Repeat for all columns
price_date = CASE
WHEN p.price != s.price THEN CURRENT_DATE
ELSE p.price_date
END,
update_count = CASE
WHEN p.price != s.price THEN update_count + 1
ELSE p.update_count
END
-- Unchanged, in this case
WHEN NOT MATCHED THEN INSERT
(product_id, price, price_date, update_count)
VALUES
(s.product_id, s.price, CURRENT_DATE, 0);
The formalism is this: if there is no AND clause, add one. These two are the same thing:
WHEN MATCHED THEN [ UPDATE | DELETE ]
WHEN MATCHED AND 1 = 1 THEN [ UPDATE | DELETE ]
This replacement may be needed in Firebird (which has a bug in this area) and in SQL Server (which does not allow a WHEN MATCHED clause after a WHEN MATCHED clause without an AND clause, which is a bit like a linting error). Alternatively, instead of emulating anything, you can skip all subsequent WHEN MATCHED branches, because they will never be applied.
Each row is updated only once
Each row is updated by only one WHEN clause. As required by the standard, make sure that no row is updated more than once in the emulation. When writing this:
WHEN MATCHED AND p1 THEN UPDATE SET c1 = 1
WHEN MATCHED AND p2 THEN DELETE
WHEN MATCHED AND p3 THEN UPDATE SET c1 = 3, c2 = 3
WHEN MATCHED AND p4 THEN DELETE
This effectively means the same thing as:
WHEN MATCHED AND p1 THEN UPDATE SET c1 = 1
WHEN MATCHED AND NOT p1 AND p2 THEN DELETE
WHEN MATCHED AND NOT p1 AND NOT p2 AND p3 THEN UPDATE SET c1 = 3,c2 = 3
WHEN MATCHED AND NOT p1 AND NOT p2 AND NOT p3 AND p4 THEN DELETE
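One caveat applies to the NOT p1 rewrites above: a WHEN clause is skipped whenever its predicate is not true, which includes the case where it evaluates to NULL (unknown), and the negation of NULL is still NULL. If a predicate can be NULL, a plain NOT therefore skips rows that the original clause order would have passed on to the next branch. A small sketch of the difference, using a made-up column a and the predicate a > 0 (neither is part of the prices example):

```sql
SELECT
  t.a,
  -- Plain negation: stays NULL when a is NULL, so the branch is skipped
  CASE WHEN NOT (t.a > 0) THEN 'next branch' ELSE 'skipped' END AS plain_not,
  -- NULL-safe negation: treats an unknown predicate like a false one
  CASE
    WHEN CASE WHEN t.a > 0 THEN 1 ELSE 0 END = 0 THEN 'next branch'
    ELSE 'skipped'
  END AS null_safe_not
FROM (VALUES (1), (-1), (CAST(NULL AS INTEGER))) AS t (a);
```

For the NULL row, plain_not yields skipped while null_safe_not yields next branch. In the prices example this does not matter, because the first predicate, s.price IS NULL, is never unknown. The VALUES table constructor used here is not available in every dialect (Oracle, for example, would need a different row source).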
To emulate these four clauses on a dialect that supports only one clause of each type, write this instead:
WHEN MATCHED AND
p1 OR
NOT p1 AND NOT p2 AND p3
THEN UPDATE SET
c1 = CASE
WHEN p1 THEN 1
WHEN NOT p1 AND NOT p2 AND p3 THEN 3
ELSE c1
END,
c2 = CASE
WHEN NOT p1 AND NOT p2 AND p3 THEN 3
ELSE c2
END
WHEN MATCHED AND
NOT p1 AND p2 OR
NOT p1 AND NOT p2 AND NOT p3 AND p4
THEN DELETE
Quite laborious, but that's how it is.
H2 and HSQLDB
Note that neither H2 nor HSQLDB follows the "each row is updated only once" rule. I have reported this to H2: https://github.com/h2database/h2database/issues/2552. If you want to be standards compliant (fear not, jOOQ 3.14 will emulate this for you), then you have to apply the CASE expression madness above in these dialects. Alternatively, in H2 (HSQLDB does not support multiple WHEN MATCHED clauses of the same type), you can strengthen all the WHEN MATCHED AND clauses, as I have shown before:
WHEN MATCHED AND p1 THEN UPDATE SET c1 = 1
WHEN MATCHED AND NOT p1 AND p2 THEN DELETE
WHEN MATCHED AND NOT p1 AND NOT p2 AND p3 THEN UPDATE SET c1 = 3,c2 = 3
WHEN MATCHED AND NOT p1 AND NOT p2 AND NOT p3 AND p4 THEN DELETE
Oracle
Oracle does not support AND here, but it has some interesting vendor-specific syntax. It looks reasonable at first, but it is really quirky:
- After UPDATE, you can add a WHERE clause, which is the same thing as AND. So far so good.
- You can also add a DELETE WHERE clause, but only together with an UPDATE. So you cannot DELETE without updating. Fine, in our example we weren't going to anyway.
- However, the quirky thing is that the UPDATE and DELETE commands are executed together, and DELETE happens after UPDATE. So the same row is processed twice. If you use WHERE in UPDATE, only rows included in the UPDATE can be included in the DELETE. I mean, why would you update rows before deleting them?
This means that our standard clauses:
WHEN MATCHED AND p1 THEN UPDATE SET c1 = 1
WHEN MATCHED AND p2 THEN DELETE
WHEN MATCHED AND p3 THEN UPDATE SET c1 = 3, c2 = 3
WHEN MATCHED AND p4 THEN DELETE
would need to be emulated like this:
WHEN MATCHED
THEN UPDATE SET
c1 = CASE
WHEN p1 THEN 1 -- Normal update for WHEN MATCHED AND p1 clause
WHEN p2 THEN c1 -- "Touch" record for later deletion
WHEN p3 THEN 3 -- Normal update for WHEN MATCHED AND p3 clause
WHEN p4 THEN c1 -- "Touch" record for later deletion
ELSE c1
END,
c2 = CASE
WHEN p1 THEN c2 -- p1 is not affecting c2
WHEN p2 THEN c2 -- "Touch" record for later deletion
WHEN p3 THEN 3 -- Normal update for WHEN MATCHED AND p3 clause
WHEN p4 THEN c2 -- "Touch" record for later deletion
ELSE c2
END
-- Any predicate from any AND clause, regardless if UPDATE or DELETE
WHERE p1 OR p2 OR p3 OR p4
-- Repeat the predicates required for deletion
DELETE WHERE
NOT p1 AND p2 OR
NOT p1 AND NOT p2 AND NOT p3 AND p4
And that was just a simple MERGE statement in standard SQL syntax! There is an additional complication, which I won't cover in this blog post (but which we might handle in jOOQ). In Oracle, the DELETE WHERE clause can already see the updates performed by the UPDATE clause. This means that if, for example, p2 depends on the value of c1:
c1 = CASE
...
WHEN p2 THEN c1 -- "Touch" record for later deletion
...
END,
then the evaluation of p2 in DELETE WHERE is affected by it:
DELETE WHERE
NOT p1 AND p2 OR
NOT p1 AND NOT p2 AND NOT p3 AND p4
The c1 in these p2 expressions will not be the same as the c1 in the UPDATE clause. Obviously, this can be managed to some extent through variable substitution.
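Applied to the prices example, the Oracle syntax could look roughly like this. It is a sketch that has not been run against an Oracle instance, and it assumes that equivalent prices and staging tables exist there. Oracle does not accept AS before table aliases, and because price is declared NOT NULL, the rows that are about to be deleted are merely "touched" by assigning their current values back.

```sql
MERGE INTO prices p
USING (
  SELECT COALESCE(p.product_id, s.product_id) AS product_id, s.price
  FROM prices p
  FULL JOIN staging s ON p.product_id = s.product_id
) s
ON (p.product_id = s.product_id)
WHEN MATCHED THEN UPDATE SET
  -- Rows to be deleted have s.price IS NULL and keep their old values
  p.price = COALESCE(s.price, p.price),
  p.price_date = CASE
    WHEN p.price != s.price THEN CURRENT_DATE
    ELSE p.price_date
  END,
  p.update_count = CASE
    WHEN p.price != s.price THEN p.update_count + 1
    ELSE p.update_count
  END
  -- Only rows touched by the UPDATE can be deleted afterwards
  WHERE s.price IS NULL OR p.price != s.price
  DELETE WHERE s.price IS NULL
WHEN NOT MATCHED THEN INSERT
  (product_id, price, price_date, update_count)
VALUES
  (s.product_id, s.price, CURRENT_DATE, 0);
```

The DELETE WHERE predicate refers only to the source column s.price, so it is not affected by the values that the UPDATE has just written.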
SQL Server BY SOURCE and BY TARGET
SQL Server has a useful extension to the WHEN NOT MATCHED clause, which I think belongs in the SQL standard! With this extension, you can specify whether to perform an INSERT action WHEN NOT MATCHED [BY TARGET] (the default, which everyone else supports as well) or another action WHEN NOT MATCHED BY SOURCE (in which case you can perform yet another UPDATE or DELETE action).
The BY TARGET clause means that we found a row in the source table, but not in the target table. The BY SOURCE clause means that we found a row in the target table, but not in the source table. This means that in SQL Server, the target and source tables are FULL OUTER JOINed, not RIGHT OUTER JOINed, so our original statement can be simplified tremendously:
MERGE INTO prices AS p
USING staging AS s
ON (p.product_id = s.product_id)
WHEN NOT MATCHED BY SOURCE THEN DELETE
WHEN MATCHED AND p.price != s.price THEN UPDATE SET
price = s.price,
price_date = getdate(),
update_count = update_count + 1
WHEN NOT MATCHED BY TARGET THEN INSERT
(product_id, price, price_date, update_count)
VALUES
(s.product_id, s.price, getdate(), 0);
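We can mark the rows here again:
PRODUCT_ID|  P.PRICE|  S.PRICE|
----------|---------|---------|
         1|   100.00|   100.00| <-- matched
         2|   125.00|    99.00| <-- matched
         3|   150.00|         | <-- not matched by source
         4|         |   300.00| <-- not matched by target
As can be seen, this is really just how a FULL OUTER JOIN works.
Emulating these clauses in standard SQL is laborious too, because we would have to emulate this FULL OUTER JOIN explicitly. I think it is possible, but we might not implement it in jOOQ.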
Vertica
Only Vertica does not seem to support the DELETE branch, which means you cannot use a MERGE statement to DELETE data from the target table. You can use it only to INSERT or UPDATE data, which is good enough in almost all cases. Curiously, Teradata supports DELETE but not AND, which seems a bit pointless, because DELETE and UPDATE cannot be combined this way.
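On a dialect without a DELETE branch, one possible workaround is to split the full sync into two statements inside a single transaction. This is a sketch in the standard syntax used throughout this article; it has not been verified against Vertica, whose MERGE syntax may differ in details.

```sql
-- Step 1: insert new prices and update changed ones
MERGE INTO prices AS p
USING staging AS s
ON (p.product_id = s.product_id)
WHEN MATCHED AND p.price != s.price THEN UPDATE SET
  price = s.price,
  price_date = CURRENT_DATE,
  update_count = update_count + 1
WHEN NOT MATCHED THEN INSERT
  (product_id, price, price_date, update_count)
VALUES
  (s.product_id, s.price, CURRENT_DATE, 0);

-- Step 2: delete prices that no longer appear in the staging table
DELETE FROM prices
WHERE NOT EXISTS (
  SELECT 1
  FROM staging
  WHERE staging.product_id = prices.product_id
);
```

The source no longer needs the FULL JOIN here, because the separate DELETE takes care of the rows that are missing from the staging table.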
Conclusion
The MERGE statement is a device whose mystery is exceeded only by its power. In its simple form (no AND or WHERE clauses, no DELETE clauses), pretty much all dialects agree, and that is already a very useful feature set, which jOOQ has supported for a long time. Starting from jOOQ 3.14, we will also handle all the other features listed in this article, to help you write complex, vendor-agnostic MERGE statements and emulate them on all the dialects that support MERGE.
Want to play around with it right now? Check out our free online SQL translation tool.