Canal introduction
2022-07-27 00:53:00 【User 1483438】
What canal does
canal is an open-source project from Alibaba, written in pure Java. It parses a database's incremental logs to provide incremental data subscription and consumption. At present it mainly supports MySQL (MariaDB is also supported). In short, canal is a tool for synchronizing incremental data.
canal background
canal [kə'næl] means waterway, conduit, or ditch. Its main purpose is to parse MySQL's incremental logs and provide incremental data subscription and consumption.
In its early days, Alibaba ran data centers in both Hangzhou and the United States, so there was a business need to synchronize data across data centers. The implementation at that time mainly relied on business triggers to capture incremental changes. Starting in 2010, the business gradually moved to parsing database logs to obtain incremental changes for synchronization, which gave rise to a large number of incremental database subscription and consumption services.
Services built on log-based incremental subscription and consumption include:
- Database mirroring
- Real-time database backup
- Index building and real-time maintenance (split heterogeneous indexes, inverted indexes, etc.)
- Business cache refresh
- Incremental data processing with business logic
canal currently supports source MySQL versions 5.1.x, 5.5.x, 5.6.x, 5.7.x, and 8.0.x.
Source: Alibaba's official introduction
canal application scenario
- The user's order information is sent to the backend.
- The backend server saves the order information to the MySQL database.
- canal monitors write changes in MySQL and writes the modified (inserted) data to Kafka.
- Spark Streaming reads the data from Kafka and performs the calculation.
- The calculated result is written back to the server and returned to the browser.
Requirement
Given the requirement below, think about how you would handle it.
Obtain the user's order information, save it to the database, and compute statistics in real time (number of orders for commodity A, number of orders for commodity B, ...).
Why use canal?
From the requirement we know that order information must be aggregated, with real-time statistics of the orders for each commodity. Suppose that at this moment commodity A has 100 orders and commodity B has 20. These figures are dynamic: every time a user submits an order, the backend must produce the latest order counts, so we need to accumulate in real time. For example, another batch arrives with 10 orders for A, 5 for B, and 10 for C. The displayed result is then A: 110, B: 25, C: 10. We should not start over, that is, take all 145 records (A + B + C) and split them by commodity again. Instead we should accumulate: the previous 100 for A plus the 10 from this batch. That is far more efficient.
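The accumulation idea can be sketched in a few lines of Python. This is only an illustration: the function name `apply_batch` is made up here, and the figures are the ones from the example (an existing total of A 100 and B 20, then a batch of A 10, B 5, C 10).

```python
from collections import Counter


def apply_batch(totals: Counter, batch: dict) -> Counter:
    """Add one batch of per-commodity order counts to the running totals."""
    # Counter.update adds the counts to existing keys instead of replacing them.
    totals.update(batch)
    return totals


# Running totals before the new batch arrives.
totals = Counter({"A": 100, "B": 20})

# Only the new batch is processed; the 120 earlier records are not re-read.
apply_batch(totals, {"A": 10, "B": 5, "C": 10})

print(dict(totals))
print(sum(totals.values()))
```

The cost of each update depends only on the size of the new batch, not on how many orders have been seen before, which is why the incremental approach scales better than recomputing from all records.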
The role of canal
With the above in mind, let's look at the role of canal: it enables incremental synchronization. Taking commodity A again: in the first batch of data A has 100 orders, and canal writes this new batch to Kafka, from where it is handed to Spark for processing. When a second batch of data is added, canal writes the newly observed data to Kafka as well. And so on; in the end Spark returns the calculated result.
Besides Kafka, canal can also write data to other middleware (MySQL, Elasticsearch, HBase, etc.).
Working principle (from the official site)
How MySQL master/slave replication works
- The MySQL master writes data changes to its binary log (the records are called binary log events and can be viewed with show binlog events)
- The MySQL slave copies the master's binary log events to its relay log
- The MySQL slave replays the events in the relay log, applying the changes to its own data
How canal works
- canal emulates the MySQL slave interaction protocol, disguises itself as a MySQL slave, and sends a dump request to the MySQL master
- The MySQL master receives the dump request and starts pushing the binary log to the slave (that is, canal)
- canal parses the binary log objects (originally a byte stream)
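The three steps above can be imitated with a toy model. This is not the MySQL replication protocol or canal's API: `ToyMaster`, `ToySubscriber`, the JSON encoding, and the integer position are all assumptions made for illustration, and a generator stands in for the master pushing events over a network connection.

```python
import json


class ToyMaster:
    """Stand-in for a MySQL master: an append-only list of encoded events."""

    def __init__(self):
        self.binlog = []  # each entry is raw bytes, like a byte stream on the wire

    def write(self, event_type, table, row):
        payload = {"type": event_type, "table": table, "row": row}
        self.binlog.append(json.dumps(payload).encode("utf-8"))

    def dump(self, position):
        """Answer a dump request: yield (next_position, raw_bytes) from `position` on."""
        for offset in range(position, len(self.binlog)):
            yield offset + 1, self.binlog[offset]


class ToySubscriber:
    """Stand-in for canal: remembers its position and parses raw bytes."""

    def __init__(self, master):
        self.master = master
        self.position = 0  # how far into the log we have already consumed

    def poll(self):
        events = []
        for next_position, raw in self.master.dump(self.position):
            events.append(json.loads(raw.decode("utf-8")))  # bytes -> structured event
            self.position = next_position
        return events


master = ToyMaster()
master.write("INSERT", "orders", {"id": 1, "item": "A"})

subscriber = ToySubscriber(master)
first = subscriber.poll()   # receives the first event

master.write("INSERT", "orders", {"id": 2, "item": "B"})
second = subscriber.poll()  # receives only the event written since the last poll
```

Because the subscriber keeps its own position, each poll returns only what is new, which is the property that makes the synchronization incremental.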
BinaryLog
WAL:
- HBase: on a write, the write command is first recorded in the WAL log and the data is then written to the memstore. This allows the data to be restored after a RegionServer process exits abnormally and restarts. The mechanism is called WAL (write-ahead log). WAL log: a backup of write commands.
- NameNode: while it runs, the metadata changes produced by clients are recorded in real time in the edits file. When the NameNode restarts, the previous fsimage file is merged with edits to obtain the latest metadata. edits: a backup of client write commands.
- MySQL: the binlog logging feature can be enabled. Once it is on, every write command sent to the MySQL server is recorded in a special file, called the binlog. binlog: a backup of client write commands, which can likewise be used to restore data after a server failure.
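What the three systems share is the "log first, then apply" order. A minimal sketch of that pattern, assuming a plain Python list as the durable log and a dict as the in-memory state (the class `ToyStore` and its methods are hypothetical and unrelated to the HBase, HDFS, or MySQL code):

```python
class ToyStore:
    """Key-value store that records every write in a log before applying it."""

    def __init__(self, wal):
        self.wal = wal        # stands in for the durable log that survives a crash
        self.memstore = {}    # in-memory state, lost when the process dies

    def put(self, key, value):
        self.wal.append(("put", key, value))  # 1. record the write command
        self.memstore[key] = value            # 2. only then apply it in memory

    @classmethod
    def recover(cls, wal):
        """Rebuild the in-memory state by replaying the logged commands in order."""
        store = cls(wal)
        for op, key, value in wal:
            if op == "put":
                store.memstore[key] = value
        return store


wal = []
store = ToyStore(wal)
store.put("order:1", "A")
store.put("order:2", "B")

del store                          # simulate an abnormal exit: memory is gone
recovered = ToyStore.recover(wal)  # the log alone is enough to restore the data
print(recovered.memstore)
```

A real implementation would also flush the log to disk before acknowledging the write; the list here skips that step.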
RelayLog
How does the slave synchronize the master's data? It starts two threads (the I/O thread and the SQL thread):
- I/O thread: reads the new content the master has written to its binary log and writes it to the relay log, which serves as a temporary buffer.
- SQL thread: reads the data in the relay log and executes it, so that the slave performs the same operations as the master.
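The division of labour between the two threads can be sketched with Python's standard `threading` and `queue` modules. This is a toy under stated assumptions: the master's binary log is a list of `(op, key, value)` tuples, the relay log is a `queue.Queue`, and a `None` sentinel ends the run; real MySQL replication threads run continuously and work on files.

```python
import queue
import threading


def io_thread(master_binlog, relay_log):
    """Copy events from the master's binary log into the relay log."""
    for event in master_binlog:
        relay_log.put(event)
    relay_log.put(None)  # sentinel: no more events in this toy run


def sql_thread(relay_log, slave_data):
    """Replay relay-log events against the slave's own data."""
    while True:
        event = relay_log.get()
        if event is None:
            break
        op, key, value = event
        if op == "set":
            slave_data[key] = value
        elif op == "delete":
            slave_data.pop(key, None)


master_binlog = [
    ("set", "A", 100),
    ("set", "B", 20),
    ("set", "A", 110),
    ("delete", "B", None),
]
relay_log = queue.Queue()  # buffer between the two threads
slave_data = {}

workers = [
    threading.Thread(target=io_thread, args=(master_binlog, relay_log)),
    threading.Thread(target=sql_thread, args=(relay_log, slave_data)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()

print(slave_data)
```

The queue decouples the two sides: the I/O thread can keep copying even when applying events is slow, which is the buffering role the relay log plays.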
How canal works (my personal understanding)
canal is like a "spy": it disguises itself as a slave and tricks the master into handing over data. After getting the data, it parses it (for example, to tell whether rows were inserted, deleted, or modified), repackages it, and writes it to a third-party service (MySQL, Kafka, Elasticsearch, etc.).
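The "parse, repackage, forward" part of this picture might look like the following sketch. The event dictionaries, the `repackage` and `forward` functions, and the use of plain lists as sinks are all hypothetical; they stand in for parsed binlog events and for real Kafka or Elasticsearch clients, and are not canal's actual message format.

```python
def repackage(event):
    """Turn a parsed change event into a flat message for downstream systems."""
    return {
        "table": event["table"],
        "op": event["type"].lower(),
        "data": event["row"],
    }


def forward(events, sinks):
    """Classify each event by type and hand the repackaged message to every sink."""
    handled = {"INSERT", "UPDATE", "DELETE"}
    for event in events:
        if event["type"] not in handled:
            continue  # skip anything that is not a row change
        message = repackage(event)
        for sink in sinks:
            sink(message)


kafka_like = []  # plain lists stand in for third-party services
es_like = []

forward(
    [
        {"type": "INSERT", "table": "orders", "row": {"id": 1, "item": "A"}},
        {"type": "QUERY", "table": "orders", "row": {}},
        {"type": "DELETE", "table": "orders", "row": {"id": 1}},
    ],
    sinks=[kafka_like.append, es_like.append],
)
```

Passing the sinks as callables keeps the forwarding logic independent of where the data ends up, so adding another target only means adding another callable.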