Integrating the Database Ecosystem: Building CDC Applications with EventBridge
2022-07-28 20:48:00 【Alibaba cloud native】
Author: Chang Feng
Introduction
CDC (Change Data Capture) refers to monitoring data changes in an upstream source and synchronizing the change information to downstream services for further processing. Event-driven architecture (EDA) has grown steadily in popularity in recent years and is increasingly a first choice for architects. EDA is a natural fit as the underlying infrastructure for CDC: it treats data changes as events, and each service drives its own business logic by listening for the events it cares about. Alibaba Cloud EventBridge is a serverless event bus service from Alibaba Cloud that helps users quickly build EDA-based applications. Recently, EventBridge event streams added CDC support based on the Alibaba Cloud DTS [1] service. This article covers an overview of CDC, how CDC is applied in EventBridge, and several best-practice scenarios, and shows how to build CDC applications with EventBridge.
CDC overview
Basic principles and application scenarios
CDC captures incremental data changes and schema changes from a source database and synchronizes them, in order and with high reliability and low latency, to a target database, data lake, or other data analytics service. Mainstream open-source CDC tools currently include Debezium [2], Canal [3], and Maxwell [4].

Image source: https://dbconvert.com
CDC implementations in the industry currently fall into the following categories:
1. Timestamp- or version-based
The timestamp-based approach requires the database table to have a field that records the update timestamp. Whenever a row is inserted or updated, that field is updated as well. The CDC component periodically retrieves the records whose update time is later than the last synchronization time, which captures the changes made during that cycle. Version-based tracking works on essentially the same principle, except that developers must update the row's version number whenever they change the data.
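The polling loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `orders` table, its `updated_at` column (an integer timestamp), and the `fetch_changes` helper are hypothetical names, and an in-memory SQLite database stands in for the source database.

```python
import sqlite3


def fetch_changes(conn, last_sync):
    """Return rows whose updated_at is later than the last sync point."""
    cur = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_sync,),
    )
    return cur.fetchall()


# Hypothetical source table with an update-timestamp column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"
)
conn.execute("INSERT INTO orders VALUES (1, 'unpaid', 100)")
conn.execute("INSERT INTO orders VALUES (2, 'unpaid', 105)")

# First polling cycle: everything is newer than the initial sync point.
last_sync = 0
first_batch = fetch_changes(conn, last_sync)
last_sync = max(row[2] for row in first_batch)

# An update must also bump updated_at, otherwise the next poll misses it.
conn.execute("UPDATE orders SET status = 'paid', updated_at = 110 WHERE id = 1")
second_batch = fetch_changes(conn, last_sync)
```

Note that a row that is physically deleted simply disappears from the query result, so a poll like this one cannot report deletions on its own.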
2. Snapshot-based
Snapshot-based CDC keeps three copies of the data source at the storage level: the original data, the previous snapshot, and the current snapshot. Data changes are obtained by comparing the two snapshots.
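As a rough illustration of the comparison step, the sketch below diffs two snapshots held as dictionaries keyed by primary key. The `diff_snapshots` function and the tuple layout `(operation, key, before, after)` are assumptions made for this example, not part of any particular tool.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots (dicts keyed by primary key) and list the changes."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            changes.append(("INSERT", key, None, row))
        elif previous[key] != row:
            changes.append(("UPDATE", key, previous[key], row))
    for key, row in previous.items():
        if key not in current:
            changes.append(("DELETE", key, row, None))
    return changes


previous = {1: {"status": "unpaid"}, 2: {"status": "unpaid"}}
current = {1: {"status": "paid"}, 3: {"status": "unpaid"}}
changes = diff_snapshots(previous, current)
```

A diff like this only sees the net effect between two snapshots; intermediate states that occurred between them are not recoverable.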
3. Trigger-based
Trigger-based CDC creates triggers on the source table so that data changes (INSERT, UPDATE, DELETE) are recorded as they happen. For example, a dedicated table is created to record users' change operations, and then three triggers (INSERT, UPDATE, and DELETE) are created to write each change into that table.
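The following sketch shows the idea with SQLite, which is used here only because it runs in memory; trigger syntax differs between database engines. The `orders` and `orders_changes` tables and the trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

-- Dedicated table that records every change made to orders.
CREATE TABLE orders_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT,
    order_id INTEGER,
    old_status TEXT,
    new_status TEXT
);

CREATE TRIGGER orders_ins AFTER INSERT ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_status, new_status)
    VALUES ('INSERT', NEW.id, NULL, NEW.status);
END;

CREATE TRIGGER orders_upd AFTER UPDATE ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_status, new_status)
    VALUES ('UPDATE', NEW.id, OLD.status, NEW.status);
END;

CREATE TRIGGER orders_del AFTER DELETE ON orders BEGIN
    INSERT INTO orders_changes (op, order_id, old_status, new_status)
    VALUES ('DELETE', OLD.id, OLD.status, NULL);
END;
""")

# Ordinary writes to the source table; the triggers fill the change table.
conn.execute("INSERT INTO orders VALUES (1, 'unpaid')")
conn.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 1")

change_log = conn.execute(
    "SELECT op, order_id, old_status, new_status FROM orders_changes ORDER BY seq"
).fetchall()
```

A downstream CDC component would then read `orders_changes` in `seq` order. The extra write on every change is part of why this approach is considered intrusive.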
4. Log-based
The three approaches above are intrusive to the source database, whereas the log-based approach is non-intrusive. Databases use transaction logs for disaster recovery; for example, MySQL's binlog records every change users make to the database. Log-based CDC continuously monitors the transaction log to obtain database changes in real time.
CDC has a wide range of application scenarios, including but not limited to: cross-region database synchronization between remote data centers, heterogeneous database synchronization, microservice decoupling, cache updates, and CQRS.
Alibaba Cloud's CDC solution: DTS
Data Transmission Service (DTS) is a real-time data streaming service provided by Alibaba Cloud. It supports data exchange among relational databases (RDBMS), non-relational databases (NoSQL), and multidimensional analytics (OLAP) data sources, and combines data synchronization, migration, subscription, integration, and processing in one service. Its data subscription [5] feature lets users obtain real-time incremental data from self-managed MySQL, RDS MySQL, Oracle, and other databases.

CDC in EventBridge
Alibaba Cloud EventBridge provides two event routing services for different application scenarios: event buses [6] and event streams [7].
An event bus persists events at the underlying layer and can route them to multiple event targets as needed.
An event stream suits end-to-end streaming data processing: events generated at the source are extracted, transformed, analyzed, and loaded into the target in real time. No event bus needs to be created, so end-to-end delivery is more efficient and the service is more lightweight to use.
To better support CDC scenarios, EventBridge supports the Alibaba Cloud DTS data subscription feature as an event stream source. With only a simple configuration, users can synchronize database change information to an EventBridge event stream.

EventBridge provides a DTS Source Connector customized on top of the DTS SDK. When a user configures an event stream whose event provider is DTS, the source connector pulls DTS record data from the DTS server. After the data is pulled, it is wrapped in a structured form that retains fields such as id, operationType, topicPartition, beforeImage, and afterImage, and the system properties required by a streaming event are added.
An example DTS event can be found in the official EventBridge documentation.

EventBridge Streaming guarantees the order of DTS events, but an event may be delivered more than once. The event ID maps one-to-one to a DTS record, so users can use this field to process events idempotently.
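One possible way to use the event ID for idempotent processing is sketched below. The event is reduced to a plain dictionary with an `id` key, which is a simplification of the real event format, and the `IdempotentConsumer` class is a hypothetical helper. The in-memory set is also an assumption; a real consumer would more likely keep processed IDs in a durable store.

```python
class IdempotentConsumer:
    """Skip events whose ID has already been processed."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: in-memory; use a durable store in practice

    def consume(self, event):
        event_id = event["id"]
        if event_id in self._seen:
            return False  # duplicate delivery, ignore it
        self._handler(event)
        # Record the ID only after the handler succeeds, so a failed
        # event is retried when it is delivered again.
        self._seen.add(event_id)
        return True


applied = []
consumer = IdempotentConsumer(applied.append)

deliveries = [
    {"id": "evt-1", "operationType": "INSERT"},
    {"id": "evt-2", "operationType": "UPDATE"},
    {"id": "evt-1", "operationType": "INSERT"},  # repeated delivery
]
results = [consumer.consume(event) for event in deliveries]
```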
Creating an EventBridge event stream with a DTS source
The following shows how to create an event stream with a DTS source in the EventBridge console.
- Preparation
1. Activate the EventBridge service;
2. Create a DTS data subscription task;
3. Create the consumer group account used to consume the subscription data.
- Create the event stream
1. Log on to the EventBridge console, select “Event Streams” in the left navigation pane, and click “Create Event Stream” on the event stream list page;
2. Under “Basic Information”, fill in “Event Stream Name” and “Description” as needed;
3. When selecting the event provider, choose “Database DTS” from the drop-down list;
4. Under “Data Subscription Task”, select the DTS data subscription task you created. In the consumer group field, choose the consumer group that will consume the subscription data, and fill in the consumer group password and the initial consumption time;

5. Fill in the event stream rules and target as needed, then save and start the stream. This creates an event stream that uses the DTS data subscription as its event source.

Notes
Keep the following points in mind when using this feature:
1. EventBridge uses the SUBSCRIBE consumption mode [8], so make sure that no other client instances are running in the current DTS consumer group. If the configured consumer group has run before, the offset that is passed in is ignored, and consumption resumes from the group's last consumed offset;
2. The offset passed in when the DTS event source is created takes effect only the first time a new consumer group runs. After later restarts, the task resumes from the last consumed offset;
3. The EventBridge event stream subscribes to DTS data whose OperationType is INSERT, DELETE, UPDATE, or DDL;
4. A DTS event source may produce duplicate messages. Messages are guaranteed not to be lost, but they are not guaranteed to be delivered exactly once, so idempotent processing is recommended;
5. If sequential consumption is required, set the exception tolerance policy to “NONE”, meaning that exceptions are not tolerated. In this case, if the event stream target fails to consume a message, the entire event stream pauses until the target recovers.
Best practice examples
Implementing CQRS with EventBridge
In the CQRS (Command Query Responsibility Segregation) model, the command model performs write and update operations, and the query model supports efficient read operations. Because reads and writes use somewhat different data models, some mechanism is needed to keep the data in sync. CDC based on EventBridge event streams meets this need.
With cloud services, users can build EventBridge-based CQRS as follows:
1. The command model operates on the database to make changes, and the query model reads data from Elasticsearch;
2. Start a DTS data subscription task to capture the DB changes;
3. Configure an EventBridge event stream whose event provider is the DTS data subscription task and whose event receiver is Function Compute (FC);
4. The service in FC updates the data in Elasticsearch.
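The update logic in step 4 might look roughly like the sketch below. It makes several assumptions: the event is a simplified dictionary in which `beforeImage` and `afterImage` are flat column-to-value mappings (the real format is described in the official documentation), the `apply_change` function is hypothetical, and a plain dictionary stands in for the Elasticsearch index so that the example stays self-contained.

```python
def apply_change(read_model, event):
    """Apply one change event to the query-side read model."""
    op = event["operationType"]
    if op in ("INSERT", "UPDATE"):
        row = event["afterImage"]
        # Upsert by primary key, so a repeated delivery has no extra effect.
        read_model[row["id"]] = row
    elif op == "DELETE":
        read_model.pop(event["beforeImage"]["id"], None)
    # Other types (for example DDL) do not change the read model here.


read_model = {}  # stand-in for the search index
events = [
    {"id": "evt-1", "operationType": "INSERT",
     "afterImage": {"id": 1, "status": "unpaid"}},
    {"id": "evt-2", "operationType": "UPDATE",
     "beforeImage": {"id": 1, "status": "unpaid"},
     "afterImage": {"id": 1, "status": "paid"}},
    {"id": "evt-3", "operationType": "INSERT",
     "afterImage": {"id": 2, "status": "unpaid"}},
    {"id": "evt-4", "operationType": "DELETE",
     "beforeImage": {"id": 2, "status": "unpaid"}},
    {"id": "evt-5", "operationType": "DDL"},
]
for event in events:
    apply_change(read_model, event)
```

Writing the full row keyed by primary key keeps the handler safe under the at-least-once delivery mentioned earlier.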

Microservice decoupling
CDC can also be used to decouple microservices. Take the order processing system of an e-commerce platform as an example. When a new unpaid order is created, the database receives an INSERT operation. When the order status changes from “unpaid” to “paid”, the database receives an UPDATE operation. Different backend microservices handle these order status changes.
1. A user places or pays for an order, the order system performs the business processing, and the data changes are written to the DB;
2. Create a DTS subscription task to capture the DB data changes;
3. Build an EventBridge event stream. The event provider is the DTS data subscription task, and the event receiver is RocketMQ;
4. When consuming the RocketMQ data, enable three groups under the same topic, each representing different business consumption logic:
a. GroupA updates the user cache with the captured DB changes so that users can query the order status;
b. GroupB is associated with the downstream financial system and handles only new orders. It processes events whose DB operation type is INSERT and discards all other event types;
c. GroupC cares only about events in which the order status changes from “unpaid” to “paid”. When a matching event arrives, it calls the downstream logistics and warehousing systems to process the order further.
If interface calls were used instead, the order system would have to call the cache update interface, the new-order interface, and the order payment interface after a user places an order, which couples the services too tightly. In addition, with this pattern the data consumers need not worry about the semantics of what the upstream order processing interface returns. As long as the storage model is unchanged, they can decide directly at the data level whether and how a data change should be handled. The message backlog capability inherent in message queues also helps users smooth out load (peak shaving and valley filling) when order peaks arrive.
In fact, the messaging products currently supported by EventBridge Streaming also include RabbitMQ, Kafka, and MNS, so users can choose according to their needs.
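The filtering done by GroupB and GroupC can be sketched as two predicates over the change event. As before, the event is a simplified dictionary with flat `beforeImage` and `afterImage` mappings. The `status` column, its `"unpaid"` and `"paid"` values, and the function names are assumptions made for this illustration.

```python
def is_new_order(event):
    """GroupB: keep only INSERT events (new orders)."""
    return event["operationType"] == "INSERT"


def is_payment(event):
    """GroupC: keep only UPDATE events where status goes from unpaid to paid."""
    if event["operationType"] != "UPDATE":
        return False
    before = event.get("beforeImage") or {}
    after = event.get("afterImage") or {}
    return before.get("status") == "unpaid" and after.get("status") == "paid"


events = [
    {"id": "evt-1", "operationType": "INSERT",
     "afterImage": {"id": 1, "status": "unpaid"}},
    {"id": "evt-2", "operationType": "UPDATE",
     "beforeImage": {"id": 1, "status": "unpaid"},
     "afterImage": {"id": 1, "status": "paid"}},
    {"id": "evt-3", "operationType": "UPDATE",
     "beforeImage": {"id": 1, "status": "paid"},
     "afterImage": {"id": 1, "status": "shipped"}},
    {"id": "evt-4", "operationType": "DELETE",
     "beforeImage": {"id": 1, "status": "shipped"}},
]

new_order_ids = [e["id"] for e in events if is_new_order(e)]
payment_ids = [e["id"] for e in events if is_payment(e)]
```

Because both predicates look only at the change data, neither consumer depends on the order system's interface, which is the decoupling described in this section.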

Database backup & heterogeneous database synchronization
Database disaster recovery and heterogeneous database synchronization are also important CDC application scenarios. Alibaba Cloud EventBridge makes it quick to build such applications as well.
1. Create a DTS data subscription task to capture changes in the user's MySQL database;
2. Build an EventBridge event stream whose event provider is the DTS data subscription task;
3. Use EventBridge to execute the specified SQL in the target database to implement database backup;
4. Data change events are delivered to Function Compute, where the user's business logic updates the corresponding heterogeneous database according to the content of the data change.
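A minimal sketch of replaying change events into a target database is shown below. SQLite stands in for the target, the `orders_backup` table and the `replay` function are hypothetical, and the event is again a simplified dictionary with flat `beforeImage` and `afterImage` mappings. `INSERT OR REPLACE` is SQLite syntax; other engines have their own upsert statements.

```python
import sqlite3


def replay(conn, event):
    """Apply one change event to the target table using parameterized SQL."""
    op = event["operationType"]
    if op in ("INSERT", "UPDATE"):
        row = event["afterImage"]
        # Upsert so that a repeated delivery leaves the same final row.
        conn.execute(
            "INSERT OR REPLACE INTO orders_backup (id, status) VALUES (?, ?)",
            (row["id"], row["status"]),
        )
    elif op == "DELETE":
        conn.execute(
            "DELETE FROM orders_backup WHERE id = ?",
            (event["beforeImage"]["id"],),
        )


target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE orders_backup (id INTEGER PRIMARY KEY, status TEXT)")

events = [
    {"operationType": "INSERT", "afterImage": {"id": 1, "status": "unpaid"}},
    {"operationType": "INSERT", "afterImage": {"id": 2, "status": "unpaid"}},
    {"operationType": "UPDATE",
     "beforeImage": {"id": 1, "status": "unpaid"},
     "afterImage": {"id": 1, "status": "paid"}},
    {"operationType": "DELETE", "beforeImage": {"id": 2, "status": "unpaid"}},
]
for event in events:
    replay(target, event)

rows = target.execute("SELECT id, status FROM orders_backup ORDER BY id").fetchall()
```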

Self-built SQL audit
Users who need to build their own SQL audit can also do so easily with EventBridge.
1. Create a DTS data subscription task to capture database changes;
2. Build an EventBridge event stream whose event provider is DTS and whose event receiver is Log Service (SLS);
3. When users need to audit SQL, they do so by querying SLS.

Summary
This article introduced some CDC concepts, how CDC is applied in EventBridge, and several best-practice scenarios. As the number of supported products grows, the EventBridge ecosystem keeps expanding: from messaging to databases, and from logging to big data. EventBridge continues to broaden its application areas and consolidate its position as a cloud event hub, and it will keep developing in this direction by deepening its technology and widening its ecosystem.
References:
[1] DTS: https://www.aliyun.com/product/dts
[2] Debezium:
[3] Canal: https://github.com/alibaba/canal
[4] Maxwell: https://github.com/zendesk/maxwell
[5] DTS data subscription: https://help.aliyun.com/document_detail/145716.html
[6] Event bus: https://help.aliyun.com/document_detail/163897.html
[7] Event stream: https://help.aliyun.com/document_detail/329940.html
[8] SUBSCRIBE consumption mode: https://help.aliyun.com/document_detail/223371.html