
The yarn deployment mode depends on the pre upload settings

2022-06-22 04:59:00 PONY LEE

With Flink on YARN, the dependencies required at runtime can be uploaded to a remote storage (such as HDFS) in advance. This makes job submission very lightweight, because the required Flink jars and application jars are fetched from the specified remote location instead of being shipped to the cluster by the client.
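A minimal sketch of the pre-upload step, run from the local Flink installation directory (the hdfs:///data/flink-1.15.0 target path is an assumption, adjust it to your environment):

# upload the lib/ and plugins/ directories of the Flink distribution to HDFS
hadoop fs -mkdir -p hdfs:///data/flink-1.15.0
hadoop fs -put lib hdfs:///data/flink-1.15.0/lib
hadoop fs -put plugins hdfs:///data/flink-1.15.0/plugins
# the uploaded dependencies must be globally readable
hadoop fs -chmod -R 755 hdfs:///data/flink-1.15.0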

  1. Application Mode on yarn
./bin/flink run-application \
-t yarn-application \
-Dyarn.application.name="flink-yarn-application" \
-Dtaskmanager.numberOfTaskSlots=5 \
-Djobmanager.memory.process.size=1024m \
-Dtaskmanager.memory.process.size=1024m \
-Drest.flamegraph.enabled=true \
-Dyarn.provided.lib.dirs="hdfs://bdptest/data/flink-1.15.0/lib;hdfs:///data/flink-1.15.0/plugins" \
hdfs:///data/flink-1.15.0/flink-demo01-1.0-SNAPSHOT-pony-shade.jar 
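Once the application-mode cluster is up, the job can be inspected or cancelled with the regular client commands; the YARN application id and the job id below are placeholders:

./bin/flink list -t yarn-application -Dyarn.application.id=application_XXXX_YY
./bin/flink cancel -t yarn-application -Dyarn.application.id=application_XXXX_YY <jobId>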

NOTE:

The execution of a Flink application consists of two phases:

  • pre-flight: starts when the main() method is called and builds the job graph.
  • runtime: triggered as soon as the user code calls execute().

Only in application mode can the jar packages that the main() function depends on (such as the flink-dist jar) be placed on a remote distributed file system, because in application mode the main() method runs on the JobManager.
In session mode and per-job mode, by contrast, the main() method runs on the client.

  2. Per-Job Cluster Mode on yarn
./bin/flink run -t yarn-per-job --detached \
-Dyarn.application.name="flink-yarn-perjob" \
-Dyarn.provided.lib.dirs="hdfs://bdptest/data/flink-1.15.0/lib;hdfs:///data/flink-1.15.0/plugins" \
./examples/streaming/TopSpeedWindowing.jar
  3. Session Mode on yarn
./bin/yarn-session.sh --detached \
-Dyarn.application.name="flink-yarn-session" \
-Dtaskmanager.numberOfTaskSlots=5 \
-Dyarn.provided.lib.dirs="hdfs://bdptest/data/flink-1.15.0/lib;hdfs:///data/flink-1.15.0/plugins" \
-Drest.flamegraph.enabled=true 
./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
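The plain ./bin/flink run above submits to the session started by yarn-session.sh (picked up from the YARN properties file it writes); to target a specific session explicitly, the application id can be passed, where application_XXXX_YY is a placeholder:

./bin/flink run -t yarn-session -Dyarn.application.id=application_XXXX_YY ./examples/streaming/TopSpeedWindowing.jar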

【yarn.provided.lib.dirs】 parameter explained

Specifies the paths where the remote dependencies are located. Multiple paths can be given, separated by semicolons. The dependencies under these paths must be uploaded in advance and be globally readable.
This makes Flink job submission very light, because uploading the Flink dependencies (for example flink-dist, lib/, plugins/) from the local client is avoided, which speeds up the job submission process.
In addition, YARN caches them on the nodes, so the dependencies do not have to be downloaded again for every application. This is also why the community introduced this deployment option in Flink 1.11.
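Instead of passing the option with -D on every submission, it can also be set once in the client configuration; a minimal sketch that appends it to conf/flink-conf.yaml (the HDFS paths are assumptions):

# set yarn.provided.lib.dirs once in the client configuration
cat >> ./conf/flink-conf.yaml <<'EOF'
yarn.provided.lib.dirs: hdfs:///data/flink-1.15.0/lib;hdfs:///data/flink-1.15.0/plugins
EOF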

NOTE:
If you specify yarn.provided.lib.dirs, note the following:

  • The lib directory and the plugins directory must be given as separate paths, separated by ';', as the examples above also show; putting the plugins packages into the lib directory may lead to package conflict errors.
  • The plugins path must end with plugins, for example hdfs:///data/flink-1.15.0/plugins in the examples above.

Example :

hdfs://{namenode_address}/data/flink-1.15.0/lib;hdfs://{namenode_address}/data/flink-1.15.0/plugins;hdfs://{namenode_address}/data/flink-1.15.0/flink-dist
