Explanation of common Spark parameters
2022-06-11 02:34:00 【hzp666】
Spark's default configuration file is located at $SPARK_CONF_DIR/spark-defaults.conf on the bastion host, where users can view it to understand the defaults.
Note that default values have the lowest priority: if a setting is specified explicitly when the task is submitted or in the code, the user-specified setting takes precedence. Once you understand what the parameters mean, you can tune them for a specific task by changing the --conf values at submission time, not by editing the spark-defaults.conf file.
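The precedence rule can be illustrated with a small, self-contained Python sketch. It does not call Spark; it only mimics the layering, assuming the order given in the Spark configuration guide (values set in code override `--conf` flags, which override spark-defaults.conf). The helper names `parse_defaults` and `resolve_conf`, and the sample file contents, are hypothetical.

```python
def parse_defaults(text):
    """Parse spark-defaults.conf style text: one 'key value' pair per line."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            conf[parts[0]] = parts[1].strip()
    return conf


def resolve_conf(defaults, submit_conf=None, code_conf=None):
    """Merge the three layers; later layers win over earlier ones."""
    resolved = dict(defaults)            # lowest priority: spark-defaults.conf
    resolved.update(submit_conf or {})   # --conf values given at submission
    resolved.update(code_conf or {})     # values set explicitly in code
    return resolved


# Hypothetical file contents, for illustration only.
defaults_text = """
# sample spark-defaults.conf
spark.master            yarn
spark.executor.memory   4g
spark.speculation       false
"""

submit_conf = {"spark.executor.memory": "8g"}  # e.g. --conf spark.executor.memory=8g
code_conf = {"spark.speculation": "true"}      # e.g. set on SparkConf in the job

resolved = resolve_conf(parse_defaults(defaults_text), submit_conf, code_conf)
for key in sorted(resolved):
    print(key, "=", resolved[key])
```

Only the keys a user overrides change; everything else falls back to the file's defaults, which is why per-task tuning through `--conf` leaves the shared file untouched.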
The common parameters below can be set with --conf XXX=Y. For other parameters and their descriptions, see Configuration - Spark 3.2.1 Documentation.
| Parameter name | Recommended value | Description |
|---|---|---|
| spark.master | yarn | Resource scheduler to use. Normally yarn; use local for local debugging. |
| spark.submit.deployMode | cluster | Where the driver program runs. client can be used for debugging; cluster is recommended for production jobs. |
| spark.driver.cores | 4 | Maximum number of CPU cores (threads) the driver uses. |
| spark.driver.memory | 4-10g | Amount of memory requested for the driver. |
| spark.executor.memory | See 3. Spark Task tuning techniques | Heap memory requested per executor. |
| spark.python.worker.memory | spark.executor.memory/2 | The default value is generally used. |
| spark.yarn.executor.memoryOverhead | 3072 | Off-heap memory requested per executor. The default value is generally used. |
| spark.executor.cores | See 3. Spark Task tuning techniques | Maximum number of tasks a single executor runs concurrently. |
| spark.executor.instances | See 3. Spark Task tuning techniques | Number of executors. |
| spark.speculation | Default is false | Speculative execution is off by default (false). If a job occasionally hangs, try turning it on. |
| spark.default.parallelism | See 3. Spark Task tuning techniques | Controls the default number of RDD partitions. When reading HDFS files, the partition count depends on the block size and on whether input merging is enabled. |
| spark.sql.shuffle.partitions | | Number of shuffle partitions for SQL and SQL-like operators. Increase this value when the data volume is large. |
| spark.pyspark.python | python2/python3/python3.5 | Python version used by PySpark. (If you use a Docker image, confirm that the image contains the corresponding version; the platform's base image only has python2.) |
| spark.log.level | Default is info | One of ALL, TRACE, DEBUG, INFO, WARN, ERROR, FATAL, OFF (case-insensitive). |
| spark.sql.hive.mergeFiles | Default is false | When enabled, small files generated by spark-sql are merged automatically. |
| spark.hadoop.jd.bdp.streaming.monitor.enable | Default is false | Whether to enable batch backlog alerting for streaming jobs. Turn it on with --conf spark.hadoop.jd.bdp.streaming.monitor.enable=true |
| spark.hadoop.jd.bdp.batch.threshold | Default is 10 | Batch backlog alert threshold for streaming jobs. Adjust it as needed, for example: --conf spark.hadoop.jd.bdp.batch.threshold=20 |
| spark.hadoop.jd.bdp.user.define.erps | Defaults to the alert group configured by the platform | For metrics that only the user needs to watch, such as streaming batch backlog, the user can define a custom alert group, for example: --conf spark.hadoop.jd.bdp.user.define.erps="baibing12\|maruilei" (Note: multiple people can be configured; separate adjacent ERPs with a vertical bar.) |
| spark.isLoadHivercFile, spark.sql.tempudf.ignoreIfExists | Default is false | Whether to load all Hive UDFs. Supported only under spark-sql, not under spark-submit or pyspark. (Already enabled inside HiveTask, so users need no extra setting.) |
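
The --conf XXX=Y usage described above can be sketched as a small Python helper that turns a dictionary of settings into a spark-submit command line. This is an illustration under assumptions, not the platform's submission tooling: the helper name `build_spark_submit`, the application file `my_job.py`, and the specific values (including 400 shuffle partitions) are hypothetical and chosen only to show the shape of the command.

```python
import shlex


def build_spark_submit(app, conf):
    """Return the spark-submit argument list for an app and a conf mapping."""
    cmd = ["spark-submit"]
    for key, value in conf.items():
        # Each setting becomes its own "--conf key=value" pair.
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)  # the application file goes last
    return cmd


# Illustrative values only; tune them for the actual task.
conf = {
    "spark.master": "yarn",
    "spark.submit.deployMode": "cluster",
    "spark.driver.cores": 4,
    "spark.driver.memory": "4g",
    "spark.sql.shuffle.partitions": 400,
}

cmd = build_spark_submit("my_job.py", conf)
print(shlex.join(cmd))  # shell-quoted form, suitable for copying into a terminal
```

Building the argument list first and quoting it with `shlex.join` keeps values that contain shell metacharacters, such as the vertical bar in the ERP list, safe to paste.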