当前位置:网站首页>Flink snapshot analysis: operators for locating large states and data skew
Flink snapshot analysis: operators for locating large states and data skew
2022-06-24 12:17:00 【KyleMeow】
Job status is getting bigger and bigger , What's going on ?
stay Flink In homework , Whether it's SQL still JAR Pattern , Often used directly or indirectly to state (State). When Flink When taking a snapshot , These user-defined status data can be saved in the status point , For subsequent crash recovery .
Flink The state of is divided into Operator State and Keyed State, and Keyed State Can be divided into ValueState、MapState、ListState、AggregatingState、MergingState、ReducingState Etc . Besides , These numerous states have a variety of specific implementations (HeapState、RocksDBState etc. ), State access also requires various Serializer and Deserializer Participation , The whole link is exquisite and complicated .
For ordinary users ,Flink The internal operation mode is like a black box , But the trouble brought by state is real , Especially in use SQL A lot of watches JOIN perhaps GROUP BY Equal semantic time , It's easy because there are more and more states , Cause frequent TaskManager OOM( Out of memory ), Affect the stability of online business , More affect the mood ╮(╯_╰)╭
Many users face jobs that continue to crash , And dozens of hundreds on disk GB The snapshot file for , I also collapsed : Such a big state , What's in it ? Can you delete something ?
Type of snapshot
Flink Snapshots of include Checkpoint( Cycle triggers ) and Savepoint( The user actively triggers ) Two kinds of , among Checkpoint Divided into ordinary Checkpoint And externalization (Externalized)Checkpoint. Ordinary Checkpoint Can only be used for this JobManager Internal recovery during survival ; Externalization Checkpoint and Savepoint It can be used for cold start recovery from scratch .
about Savepoint, And turned on Externalization characteristics Of Checkpoint,Flink A metadata file will be generated in the snapshot Directory ( The snapshot directory is named _metadata The file of ), This file is a crucial clue when we analyze the snapshot .
The storage format of the snapshot
Let's start with this metadata (_metadata file ) Starting with , Take a look at its data structure :
stay Master State In the indefinite length structure of , They have their own Magic Number、 Data length and other information , There is usually not much data .
Operator State It's the big end of the State , In its indefinite length structure , It mainly includes each Operator Of ID( By two Long Put it together to form ), And the parallelism of the current operator (parallelism) And maximum parallelism (maximum parallelism), And subtasks (subtask) The number of States 、 Of each subtask index、 Metadata ( Does it include raw and managed Of Operator State、 Does it include raw and managed Of Keyed State、 What specific states are included 、 KeyGroup Range 、 Offset 、 Whether it is Incremental state 、 Pointer to the status file RelativeFileStateHandle etc. ).
In addition to metadata files , There are also many specific status files (RelativeFileStateHandle The file pointed to by the pointer ), They are usually too large to be embedded directly _metadata file , A state that can only exist as a separate file .
How to read snapshots
As you can see from the above , Parsing state files is not easy , There are many things to consider . He who breaks the bell must tie the bell , We can use Flink Read and parse the state file by itself :
1. Flink Inside API
The easiest way , Is to find Flink Restore snapshot state of the source code , Then follow the diagram to find and deserialize _metadata File class . Soon , We found org.apache.flink.runtime.checkpoint.Checkpoints#loadCheckpointMetadata This static method , It can reverse sequence a given data stream into Flink Inside CheckpointMetadata object ( That is, the memory mapping of the above file ).
If you only want to process metadata information , It does not involve reading and writing specific status data , This method can be used .
2. After the encapsulation State Processor API
In the new Flink In the version , It also includes the encapsulated State Processor API, Through this API, We can not only read the specific status file , Status data can also be generated as needed for new applications Flink Homework uses .
Use State Processor API when , Because it involves the reading and writing of specific states , You need to give StateBackend example , And specific Operator UID Etc , And with DataSet Executed as a batch task , The process is relatively complex , This article will not expand the description , There will be a separate article on how to use it .
Practice together
Let's try to use Flink Inside API To read the status metadata information , And analyze what Operator The state of accounts for the largest proportion , And these Operator Each of them Subtask( Subtasks with multiple degrees of parallelism ) The state and dosage of .
Sample code It's simple , Here are the specific analysis results :
You can see , All the information in the metadata file is printed out , And it shows 4421bbc22ac32fa6abe810c70a869c54 This Operator The state of accounts for the largest proportion , Reached 92.31%, And each Subtask The state quantity of is relatively average , All in 1.1G ~ 1.3G Between , There is basically no phenomenon of data skew .
Because the metadata does not contain this Operator Information such as your name and type , You need to search this by looking up the log Operator ID. From the log, we can see that it is a InnerJoin The operator of .
Further detailed analysis of the source code can get , yes StreamingJoinOperator This flow JOIN Two of the operators JoinRecordStateView State data .
In principle , To implement the general of two streams JOIN( Borderless JOIN), All data of the two streams must be permanently retained for future reference , And there is no cleaning mechanism by default ( Unless you set the following Idle State Retention Time), So this kind of JOIN In the production environment, it is easy to happen because the state is too large OOM. We recommend that users use Interval JOIN( Time interval JOIN) Instead of , For details, please refer to This document .
The other one SQL The operators that are easy to cause super large states in the environment are unbounded GROUP BY, But also good Flink Provides Idle State Retention Time Mechanism , The periodic cleaning logic of the state can be configured , Will these GROUP BY and JOIN The expired status of should be cleared in time .
Reference reading
https://ververica.cn/developers/introduction-to-state-management-and-fault-tolerance/
https://ververica.cn/developers/state-management/
https://flink.apache.org/feature/2019/09/13/state-processor-api.html
https://ci.apache.org/projects/flink/flink-docs-release-1.12/zh/dev/table/streaming/joins.html
边栏推荐
- Istio best practice: graceful termination
- 深圳市人民医院程立新课题组提出多组学数据在肝细胞癌的诊断与预后分析的新方法meGPS
- 万名校园开发者花式玩AI,亮点看这张图就够啦!
- [206] use PHP language to generate the code of go language
- 我真傻,招了一堆只会“谷歌”的程序员!
- GLOG从入门到入门
- I'm in Shenzhen. Where can I open an account? Is it safe to open an account online now?
- 我在深圳,到哪里开户比较好?现在网上开户安全么?
- Linker --- linker
- 最新热点:使用铜死亡相关基因进行肿瘤预后分型!
猜你喜欢

Group planning - General Review

ArrayList#subList这四个坑,一不小心就中招

【Go语言刷题篇】Go从0到入门4:切片的高级用法、初级复习与Map入门学习

Basic path test of software test on the function of the previous day
![[live review] battle code pioneer phase 7: how third-party application developers contribute to open source](/img/fa/e52bd8a1a404a759ef6ba88e8da0f0.png)
[live review] battle code pioneer phase 7: how third-party application developers contribute to open source

Opencv learning notes - Discrete Fourier transform

Opencv learning notes - loading and saving images

程序员大部分时间不是写代码,而是。。。

《opencv学习笔记》-- 图像的载入和保存

AXI低功耗接口
随机推荐
Popular science of data annotation: ten common image annotation methods
[live review] battle code pioneer phase 7: how third-party application developers contribute to open source
链接器 --- Linker
[206] use PHP language to generate the code of go language
Install Kali on the U disk and persist it
How can I open an account with new bonds? Is it safe
打新债可以申请多少 开户是安全的吗
LS-DYNA beginner's experience
嵌入式必学!硬件资源接口详解——基于ARM AM335X开发板 (下)
Opencv learning notes - regions of interest (ROI) and image blending
软件测试 对前一日函数的基本路径测试
【Go语言刷题篇】Go从0到入门4:切片的高级用法、初级复习与Map入门学习
Istio best practice: graceful termination
Qt: judge whether the string is in numeric format
How to develop hospital information system (his) with SMS notification and voice function
怎样购买打新债 开户是安全的吗
11+文章-机器学习打造ProTICS框架-深度揭示了不同分子亚型中肿瘤浸润免疫细胞对预后的影响
Adobe Photoshop using the box selection tool for selection tutorial
TP-LINK 1208 router tutorial (2)
保险APP适老化服务评测分析2022第06期