当前位置:网站首页>Detailed explanation of Flink parallelism and slot

Detailed explanation of Flink parallelism and slot

2022-07-07 19:54:00 A sharp fire rages to the sky

Flink Parallelism and Slot Detailed explanation

Address :

https://blog.csdn.net/zuodaoyong/article/details/106178488?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&utm_relevant_index=2

One 、 Concept

1、Task: There are multiple with the same function in one stage subTask Set , similar Spark Of TaskSet

2、SubTask: It is the smallest execution unit of the task , It's a Java Class , Complete the specific calculation logic

3、Slot: Isolation unit of computing resources , One Slot Can run multiple SubTask, But these SubTask It has to be from the same application At different stages of subTask.

Be careful :Flink Divide Task There are four main situations :

(1) similar keyBy,broadcast,rebalance Wait for the operator to generate shuffer

(2)Parallelism( Parallelism ) change

(3)new chain, That is, execute on the operator startNewChain() after , This operator is separate from the operator previously executed .

(4)disableChaining, Execute on the operator disableChaining(), That is, the beginning to the end of the operator , Generate a single task. Use scenarios , For example, the logic of this operator is complex , Let the operator use one alone task Internal SubTask.

Two 、slot

Flink Every one of them TaskManager It's all one JVM process , It may be in slot Execute one or more subTask.

slot The quantity is usually the same as each TaskManager Node availability CPU The number of cores is proportional . commonly Slot The number is per node CPU Number of cores .

Slot The number of is determined by flink-conf.yml In profile taskmanager.numberOfTaskSlots Set up .

Be careful : The same slot Cannot execute the same task The multiple subTask.

Expand :slotSharingGroup(String slotSharingGroup) Sharing slot

hypothesis flink Cluster has 3 Nodes , One jobManager,2 individual TaskManager. Every TaskManager Yes 2 individual Slot. That is, the cluster is a total of 4 individual slot

Task assignment slot The default name is default.

img

take wordcount For example , Run the application,5 individual task,14 individual subTask Are running on shared slot be known as "default" On .

If flatMap On the call slotSharingGroup(“slot_name”), be flatMap Put it under the name slot_name Of slot On .

Posterior operator map,keyBy,print All will be assigned slot The name is slot_name Up operation .

that , Abnormal conditions have occurred , In the cluster 4 individual slot, There is one default Of slot Running in is source,flatMap The parallelism of is 4, Need to assign to 4 It's called slot_name Of slot On . however slot_name Only 3 individual . Resulting in insufficient resources , Task deployment failed .

The solution is to adjust the parallelism to 3, Or cancel slotSharingGroup Set up .

summary :

(1)Flink The default name of the task resource slot of is default

(2) Call slotSharingGroup Set the slot where the operator runs

(3) If you change the name of the shared slot , The following operator does not set the name of the shared slot , Then it is consistent with the slot name changed last time

(4) Slot names are different subTask Cannot execute in the same slot

3、 ... and 、 Parallelism

1、 Setting of parallelism

(1)Operator Level( Operator level )

(2)Execution Environment Level( At the environmental level )

(3)Client Level( Client level )

(4)System Level( System level , That is, configure )

Parallelism sets priority :Operator Level > Execution Environment Level > Client Level > System Level

原网站

版权声明
本文为[A sharp fire rages to the sky]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/188/202207071737109547.html