Flink's fault tolerance mechanism (checkpoint)
2022-07-27 00:59:00 【A photographer who can't play is not a good programmer】
Checkpoint fault tolerance: the cornerstone of Flink's reliability
1. Overview
Flink's checkpoint mechanism guarantees that when an operator anywhere in the cluster fails for some reason, the state of the whole application flow graph can be restored to a state from before the failure, which keeps the state of the application flow graph consistent. The principle behind Flink's checkpoint mechanism comes from the Chandy-Lamport algorithm.
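The restore guarantee can be illustrated with a minimal sketch that does not use Flink at all. It assumes a single summing operator fed by a replayable source; the names `Snapshot`, `run`, `checkpoint_every`, and `fail_at` are hypothetical and chosen only for this illustration. A snapshot records the source offset together with the operator state, so after a simulated failure the job rewinds to the last snapshot and replays the records that followed it.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    offset: int  # index of the next record the source will read
    total: int   # state of the summing operator at that point


def run(records, checkpoint_every, fail_at=None):
    """Sum the records, taking a snapshot every `checkpoint_every` records.

    If `fail_at` is given, a failure is simulated once, just before the
    record at that index is processed.
    """
    offset, total = 0, 0
    last = Snapshot(offset=0, total=0)
    failed = False
    while offset < len(records):
        if fail_at is not None and offset == fail_at and not failed:
            failed = True
            # Failure: in-memory state is lost, so roll back to the last
            # snapshot and replay everything read after it.
            offset, total = last.offset, last.total
            continue
        total += records[offset]
        offset += 1
        if offset % checkpoint_every == 0:
            last = Snapshot(offset, total)
    return total


records = list(range(1, 11))
without_failure = run(records, checkpoint_every=3)
with_failure = run(records, checkpoint_every=3, fail_at=8)
```

Because offset and state are captured together, the run with the failure ends with the same result as the run without it.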
2. Principle
When an application that requires checkpointing starts, Flink's JobManager creates a CheckpointCoordinator for it. The CheckpointCoordinator is solely responsible for taking the snapshots of this application.
The process:

(1) The CheckpointCoordinator periodically sends a barrier to all source operators of the application.
(2) When a source operator receives a barrier, it pauses data processing, takes a snapshot of its current state, and saves it to the configured persistent storage. It then reports the snapshot to the CheckpointCoordinator, broadcasts the barrier to all of its downstream operators, and resumes data processing.
(3) When a downstream operator receives the barrier, it pauses its own data processing, takes a snapshot of its state, and saves it to the configured persistent storage. It then reports the snapshot to the CheckpointCoordinator, broadcasts the barrier to all of its downstream operators, and resumes data processing.
(4) Every operator repeats step 3, snapshotting and broadcasting downstream, until the barrier reaches the sink operators and the snapshot is complete.
(5) Once the CheckpointCoordinator has received the reports of all operators, it considers the snapshot of this cycle successful. If it has not received all operator reports within the specified time, it considers the snapshot of this cycle failed.
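The five steps above can be sketched as a small simulation. This is not Flink's API: the `CheckpointCoordinator` and `Operator` classes, the `healthy` flag, and the in-memory `storage` dict are hypothetical stand-ins, and the sketch assumes a linear source -> map -> sink pipeline in which every operator simply sums the values it sees. With a single input per operator, no barrier alignment across multiple inputs is needed, and the timeout of step 5 is modelled by checking the reports after the fact rather than with a real clock.

```python
class CheckpointCoordinator:
    """Collects snapshot reports and decides whether a checkpoint succeeded."""

    def __init__(self, operator_names):
        self.expected = set(operator_names)
        self.reports = {}  # checkpoint id -> names of operators that reported

    def trigger(self, checkpoint_id, source):
        # Step 1: send a barrier to the source operator.
        self.reports[checkpoint_id] = set()
        source.on_barrier(checkpoint_id)

    def report(self, checkpoint_id, operator_name):
        self.reports[checkpoint_id].add(operator_name)

    def succeeded(self, checkpoint_id):
        # Step 5: success only if every expected operator has reported.
        return self.reports.get(checkpoint_id, set()) == self.expected


class Operator:
    def __init__(self, name, coordinator, storage, downstream=None):
        self.name = name
        self.coordinator = coordinator
        self.storage = storage        # stands in for persistent storage
        self.downstream = downstream
        self.state = 0                # running sum of all values seen
        self.healthy = True           # False simulates a failed operator

    def on_record(self, value):
        self.state += value
        if self.downstream is not None:
            self.downstream.on_record(value)

    def on_barrier(self, checkpoint_id):
        if not self.healthy:
            return  # no snapshot, no report, barrier is not forwarded
        # Steps 2 and 3: snapshot, report, then pass the barrier downstream.
        self.storage[(checkpoint_id, self.name)] = self.state
        self.coordinator.report(checkpoint_id, self.name)
        if self.downstream is not None:
            self.downstream.on_barrier(checkpoint_id)


storage = {}
coordinator = CheckpointCoordinator(["source", "map", "sink"])
sink = Operator("sink", coordinator, storage)
mapper = Operator("map", coordinator, storage, downstream=sink)
source = Operator("source", coordinator, storage, downstream=mapper)

source.on_record(1)
source.on_record(2)
coordinator.trigger(1, source)   # barrier travels behind records 1 and 2
source.on_record(3)              # arrives after the barrier, not in snapshot 1

mapper.healthy = False
coordinator.trigger(2, source)   # barrier never gets past the failed operator
```

Checkpoint 1 captures the state after records 1 and 2 on every operator, even though record 3 is processed afterwards. Checkpoint 2 never collects all reports, so the coordinator would treat it as failed once its deadline passes.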
3. How does Flink's checkpoint differ from Spark's, and what are its advantages?
Spark Streaming's checkpoint only covers the recovery of Driver data and metadata. Flink's checkpoint mechanism is considerably more sophisticated: it uses lightweight distributed snapshots that capture the state of every operator as well as the data in flight in the stream.