Detailed explanation of Flink parallelism and slot
2022-07-07 19:54:00 【A sharp fire rages to the sky】
Original article:
https://blog.csdn.net/zuodaoyong/article/details/106178488?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&utm_relevant_index=2
I. Concepts
1. Task: the set of subtasks that perform the same function within one stage, similar to a TaskSet in Spark.
2. SubTask: the smallest unit of execution of a job. It is a Java class that carries out the specific computation logic.
3. Slot: the unit of isolation for computing resources. One slot can run multiple subtasks, but they must be subtasks from different stages (different tasks) of the same application.
Note: Flink splits operators into separate tasks in four main situations:
(1) An operator such as keyBy, broadcast, or rebalance causes a shuffle (data redistribution).
(2) The parallelism changes between operators.
(3) New chain: calling startNewChain() on an operator starts a new chain at that operator, separating it from the operators before it.
(4) Disabled chaining: calling disableChaining() on an operator separates it from both the preceding and the following operators, so the operator forms a task of its own. A typical use case is an operator with complex logic that should have the subtasks of one task to itself.
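The four cases above can be sketched as a small model in plain Java. This is a simplified illustration, not Flink's actual chaining logic (which checks further conditions): the Op class, its fields, and the example pipeline are hypothetical, and the pipeline is assumed to be linear. The flag shuffledInput marks an operator whose input has been redistributed, for example because it follows keyBy or rebalance.

```java
import java.util.Arrays;
import java.util.List;

class TaskSplitSketch {
    /** Hypothetical description of one operator in a linear pipeline. */
    static final class Op {
        final String name;
        final int parallelism;
        final boolean shuffledInput;   // input was redistributed (keyBy, broadcast, rebalance, ...)
        final boolean startNewChain;   // startNewChain() was called on this operator
        final boolean disableChaining; // disableChaining() was called on this operator

        Op(String name, int parallelism, boolean shuffledInput,
           boolean startNewChain, boolean disableChaining) {
            this.name = name;
            this.parallelism = parallelism;
            this.shuffledInput = shuffledInput;
            this.startNewChain = startNewChain;
            this.disableChaining = disableChaining;
        }
    }

    /** True if a task boundary lies between prev and cur under the four cases. */
    static boolean splits(Op prev, Op cur) {
        return cur.shuffledInput                      // case 1: shuffle
                || cur.parallelism != prev.parallelism // case 2: parallelism change
                || cur.startNewChain                   // case 3: new chain starts here
                || cur.disableChaining                 // case 4: cut before the operator
                || prev.disableChaining;               // case 4: cut after the operator
    }

    /** Number of tasks: one for the first operator plus one per boundary. */
    static int countTasks(List<Op> ops) {
        if (ops.isEmpty()) return 0;
        int tasks = 1;
        for (int i = 1; i < ops.size(); i++) {
            if (splits(ops.get(i - 1), ops.get(i))) tasks++;
        }
        return tasks;
    }

    /** Number of subtasks: each task contributes the parallelism of its first operator. */
    static int countSubTasks(List<Op> ops) {
        if (ops.isEmpty()) return 0;
        int subTasks = ops.get(0).parallelism;
        for (int i = 1; i < ops.size(); i++) {
            if (splits(ops.get(i - 1), ops.get(i))) subTasks += ops.get(i).parallelism;
        }
        return subTasks;
    }

    /** Hypothetical pipeline: source -> flatMap -> (keyBy) sum -> print. */
    static List<Op> examplePipeline(boolean disableFlatMapChaining) {
        return Arrays.asList(
                new Op("source", 2, false, false, false),
                new Op("flatMap", 2, false, false, disableFlatMapChaining),
                new Op("sum", 2, true, false, false),   // follows keyBy, so input is shuffled
                new Op("print", 1, false, false, false) // parallelism drops from 2 to 1
        );
    }

    public static void main(String[] args) {
        System.out.println(countTasks(examplePipeline(false)));    // 3
        System.out.println(countSubTasks(examplePipeline(false))); // 5
        System.out.println(countTasks(examplePipeline(true)));     // 4
        System.out.println(countSubTasks(examplePipeline(true)));  // 7
    }
}
```

In this model, source and flatMap share one task until disableChaining is set on flatMap, which adds one task and two subtasks.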
II. Slots
Each TaskManager in Flink is a JVM process, and it can execute one or more subtasks in its slots.
The number of slots is usually proportional to the number of CPU cores available on each TaskManager node. In general, the number of slots is set to the number of CPU cores per node.
The number of slots is set by taskmanager.numberOfTaskSlots in the configuration file flink-conf.yaml.
Note: one slot cannot run multiple subtasks of the same task.
Extension: sharing slots with slotSharingGroup(String slotSharingGroup)
Suppose a Flink cluster has 3 nodes: one JobManager and 2 TaskManagers. Each TaskManager has 2 slots, so the cluster has 4 slots in total.
By default, tasks are assigned to the slot sharing group named default.
Take WordCount as an example. When the application runs, its 5 tasks and 14 subtasks all run in the shared slots named "default".
If slotSharingGroup("slot_name") is called on flatMap, flatMap is placed in slots named slot_name.
The downstream operators map, keyBy, and print are then also assigned to slots named slot_name.
This leads to a failure. Of the cluster's 4 slots, one default slot is running the source. flatMap has a parallelism of 4 and needs 4 slots named slot_name, but only 3 slots are left for slot_name. Resources are insufficient, and the job fails to deploy.
The fix is to lower the parallelism to 3 or to remove the slotSharingGroup setting.
Summary:
(1) The default slot sharing group name for Flink tasks is default.
(2) Calling slotSharingGroup sets the slot sharing group in which an operator runs.
(3) If the group name is changed on an operator and the operators after it do not set one, they use the most recently set name.
(4) Subtasks with different slot sharing group names cannot run in the same slot.
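The slot arithmetic behind the failure described above can be sketched in plain Java. This is a rough model, not Flink's scheduler. It assumes a linear pipeline, that each slot sharing group needs as many slots as the highest parallelism among its operators, and that groups never share a slot. The class and method names are hypothetical, and the source parallelism of 1 is an assumption based on the single default slot mentioned in the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SlotDemandSketch {
    static final String DEFAULT_GROUP = "default";

    /**
     * groups[i] is the name set explicitly on operator i, or null if none was set.
     * An operator without a name inherits the most recently set one (summary point 3).
     * Returns the slots each group needs: the highest parallelism among its operators.
     */
    static Map<String, Integer> slotsPerGroup(String[] groups, int[] parallelism) {
        Map<String, Integer> demand = new LinkedHashMap<>();
        String current = DEFAULT_GROUP;
        for (int i = 0; i < groups.length; i++) {
            if (groups[i] != null) current = groups[i];
            demand.merge(current, parallelism[i], Integer::max);
        }
        return demand;
    }

    /** Groups cannot share a slot (summary point 4), so their demands add up. */
    static int requiredSlots(String[] groups, int[] parallelism) {
        int total = 0;
        for (int n : slotsPerGroup(groups, parallelism).values()) total += n;
        return total;
    }

    static boolean fits(String[] groups, int[] parallelism, int availableSlots) {
        return requiredSlots(groups, parallelism) <= availableSlots;
    }

    public static void main(String[] args) {
        // Operators in order: source, flatMap, map, keyed aggregation, print.
        String[] renamed = {null, "slot_name", null, null, null};
        String[] allDefault = {null, null, null, null, null};
        int clusterSlots = 4; // 2 TaskManagers with 2 slots each

        // flatMap and everything after it at parallelism 4: 1 + 4 = 5 slots needed.
        System.out.println(fits(renamed, new int[] {1, 4, 4, 4, 4}, clusterSlots));    // false
        // Lowering the parallelism to 3: 1 + 3 = 4 slots needed.
        System.out.println(fits(renamed, new int[] {1, 3, 3, 3, 3}, clusterSlots));    // true
        // Removing slotSharingGroup: everything shares "default", 4 slots needed.
        System.out.println(fits(allDefault, new int[] {1, 4, 4, 4, 4}, clusterSlots)); // true
    }
}
```

Both fixes named in the article bring the demand down to the 4 slots the cluster has.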
III. Parallelism
1. Setting the parallelism
(1) Operator Level
(2) Execution Environment Level
(3) Client Level
(4) System Level (that is, the configuration file)
Priority of parallelism settings: Operator Level > Execution Environment Level > Client Level > System Level
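The priority order can be read as "the most specific level that is set wins". A minimal sketch in plain Java follows; the class and method are hypothetical and do not call Flink. In Flink itself the four levels are typically set with setParallelism() on an operator, setParallelism() on the StreamExecutionEnvironment, the -p option of flink run, and parallelism.default in the configuration file.

```java
class ParallelismPrioritySketch {
    /**
     * Returns the parallelism from the highest-priority level that is set.
     * A null argument means "not set at this level"; the system default always exists.
     */
    static int effectiveParallelism(Integer operatorLevel, Integer environmentLevel,
                                    Integer clientLevel, int systemDefault) {
        if (operatorLevel != null) return operatorLevel;       // Operator Level
        if (environmentLevel != null) return environmentLevel; // Execution Environment Level
        if (clientLevel != null) return clientLevel;           // Client Level
        return systemDefault;                                  // System Level
    }

    public static void main(String[] args) {
        System.out.println(effectiveParallelism(2, 4, 8, 1));          // 2
        System.out.println(effectiveParallelism(null, 4, 8, 1));       // 4
        System.out.println(effectiveParallelism(null, null, 8, 1));    // 8
        System.out.println(effectiveParallelism(null, null, null, 1)); // 1
    }
}
```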