Yarn performance tuning of CDH cluster
2022-06-27 22:07:00 【javastart】
This article discusses YARN tuning for a CDH cluster. YARN tuning is mainly a matter of CPU and memory: the CPU resource is the number of physical CPUs multiplied by the number of cores per CPU, i.e. Vcores = number of CPUs * cores per CPU. YARN packages resources as containers, and tasks execute inside those containers.
Cluster configuration
Configuring the cluster involves three steps: first, plan the worker hosts and each host's hardware configuration; second, plan the components installed on each host and their resource allocation; third, plan the size of the cluster.
Worker host configuration
As shown in the following table, each host has 256 GB of RAM and four 6-core CPUs with hyper-threading support, and the network bandwidth is 2 Gb/s.
| Host component | Count | Size | Total | Description |
| --- | --- | --- | --- | --- |
| RAM | - | 256 GB | 256 GB | Total memory per host |
| CPU | 4 | 6 cores | 48 | Total logical core count (with hyper-threading) |
| Hyper-threading CPU | YES | - | - | Hyper-threading makes the operating system see twice the physical core count, so a host with 24 physical cores appears to have 48 |
| Network | 2 | 1 Gb/s | 2 Gb/s | Network bandwidth |
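For clarity, the logical core count can be derived as follows; this is a minimal sketch using only the numbers from the table above:

```python
# Logical cores per host: sockets * cores per socket, doubled by hyper-threading.
sockets = 4
cores_per_socket = 6
hyperthreading = True

physical_cores = sockets * cores_per_socket            # 24
logical_cores = physical_cores * (2 if hyperthreading else 1)
print(logical_cores)                                   # 48
```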
Worker host component resource allocation
The first step defined each host's memory and CPU configuration; the next step is to allocate resources to the services on each node, mainly CPU and memory.
| Service | Category | CPU cores | Memory (MB) | Description |
| --- | --- | --- | --- | --- |
| Operating system | Overhead | 1 | 8192 | Reserve 1 core and 8 GB for the OS; typically 4~8 GB |
| Other services | Overhead | 0 | 0 | Resources used by non-CDH, non-OS processes |
| Cloudera Manager agent | Overhead | 1 | 1024 | 1 core, 1 GB |
| HDFS DataNode | CDH | 1 | 1024 | Default: 1 core, 1 GB |
| YARN NodeManager | CDH | 1 | 1024 | Default: 1 core, 1 GB |
| Impala daemon | CDH | 0 | 0 | Optional service; allocate at least 16 GB to the Impala daemon if deployed |
| HBase RegionServer | CDH | 0 | 0 | Optional service; 12~16 GB recommended |
| Solr Server | CDH | 0 | 0 | Optional service; at least 1 GB |
| Kudu Server | CDH | 0 | 0 | Optional service; at least 1 GB per Kudu tablet server |
| Available container resources | - | 44 | 250880 | The remainder is allocated to YARN containers |
Container resource allocation:

- Physical Cores to Vcores Multiplier: the number of vcores each physical core counts as (i.e. concurrent threads per core); set to 1 in this article.
- YARN Available Vcores = Available Container Resources * Physical Cores to Vcores Multiplier = 44 * 1 = 44.
- YARN Available Memory: 250880 MB.
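To make the bookkeeping explicit, here is a minimal sketch of the per-host calculation, using only the figures from the overhead table above (the dictionary keys are labels, not configuration names):

```python
# Per-host resources left for YARN containers after subtracting overheads.
total_cores, total_mem_mb = 48, 256 * 1024   # 262144 MB

overheads = {                                # (cores, memory in MB)
    "operating system":       (1, 8192),
    "cloudera manager agent": (1, 1024),
    "hdfs datanode":          (1, 1024),
    "yarn nodemanager":       (1, 1024),
}

avail_cores = total_cores - sum(c for c, _ in overheads.values())
avail_mem_mb = total_mem_mb - sum(m for _, m in overheads.values())

vcore_multiplier = 1  # Physical Cores to Vcores Multiplier
print(avail_cores * vcore_multiplier, avail_mem_mb)  # 44 250880
```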
Cluster size
Number of worker nodes in the cluster: 10
YARN configuration
YARN NodeManager Configuration properties
| Configuration parameter | Value | Description |
| --- | --- | --- |
| yarn.nodemanager.resource.cpu-vcores | 44 | Vcores the NodeManager can allocate: the cores left over on each node |
| yarn.nodemanager.resource.memory-mb | 250880 | Memory the NodeManager can allocate: the memory left over on each node |
Verifying the YARN configuration
Log in to the YARN ResourceManager web UI at http://<ResourceManagerIP>:8088/ and verify 'Vcores Total' and 'Memory Total'. If all nodes are healthy, Vcores Total should be 440 and Memory Total should be 2450 GB, i.e. 250880 / 1024 * 10.
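The expected totals follow directly from the per-node values; a quick sketch of the arithmetic:

```python
# Expected ResourceManager UI totals for a healthy 10-node cluster.
nodes = 10
vcores_per_node = 44       # yarn.nodemanager.resource.cpu-vcores
mem_per_node_mb = 250880   # yarn.nodemanager.resource.memory-mb

print("Vcores Total:", nodes * vcores_per_node)              # 440
print("Memory Total (GB):", nodes * mem_per_node_mb / 1024)  # 2450.0
```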
YARN container configuration
Container vcore configuration
| Configuration parameter | Value | Description |
| --- | --- | --- |
| yarn.scheduler.minimum-allocation-vcores | 1 | Minimum number of vcores allocated to a container |
| yarn.scheduler.maximum-allocation-vcores | 44 | Maximum number of vcores allocated to a container |
| yarn.scheduler.increment-allocation-vcores | 1 | Increment in which container vcores are allocated |
Container memory configuration
| Configuration parameter | Value | Description |
| --- | --- | --- |
| yarn.scheduler.minimum-allocation-mb | 1024 | Minimum memory allocated to a container: 1 GB |
| yarn.scheduler.maximum-allocation-mb | 250880 | Maximum memory allocated to a container: 245 GB, i.e. all the memory left on a node |
| yarn.scheduler.increment-allocation-mb | 512 | Increment in which container memory is allocated; default 512 MB |
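Under the Fair Scheduler, a container request is rounded up to the nearest allocation increment and kept within the minimum/maximum bounds. The sketch below illustrates that rule for memory, using the values above; the helper `allocated_mb` is just for illustration:

```python
import math

MIN_MB, MAX_MB, INC_MB = 1024, 250880, 512  # scheduler memory settings above

def allocated_mb(requested_mb: int) -> int:
    """Round a request up to the increment, then clamp to [min, max]."""
    rounded = math.ceil(requested_mb / INC_MB) * INC_MB
    return min(MAX_MB, max(MIN_MB, rounded))

print(allocated_mb(1100))  # 1536: rounded up to the next 512 MB step
print(allocated_mb(500))   # 1024: raised to the minimum allocation
```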
Cluster resource allocation estimation
| Description | Minimum | Maximum |
| --- | --- | --- |
| Largest number of containers in the cluster, at the minimum memory allocation per container | | 2450 |
| Largest number of containers in the cluster, at the minimum vcore allocation per container | | 440 |
| Smallest number of containers in the cluster, at the maximum memory allocation per container | 10 | |
| Smallest number of containers in the cluster, at the maximum vcore allocation per container | 10 | |
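These bounds are just cluster resources divided by the per-container limits; a sketch:

```python
# Container-count bounds for the 10-node cluster configured above.
nodes, mem_mb_per_node, vcores_per_node = 10, 250880, 44

print(nodes * mem_mb_per_node // 1024)     # 2450: 1 GB minimum per container
print(nodes * vcores_per_node // 1)        # 440: 1 vcore minimum per container
print(nodes * mem_mb_per_node // 250880)   # 10: maximum memory per container
print(nodes * vcores_per_node // 44)       # 10: maximum vcores per container
```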
Container configuration sanity checks
| Configuration constraint | Description |
| --- | --- |
| The maximum vcore allocation must be greater than or equal to the minimum vcore allocation | yarn.scheduler.maximum-allocation-vcores >= yarn.scheduler.minimum-allocation-vcores |
| The maximum memory allocation must be greater than or equal to the minimum memory allocation | yarn.scheduler.maximum-allocation-mb >= yarn.scheduler.minimum-allocation-mb |
| The minimum vcore allocation must be greater than or equal to 0 | yarn.scheduler.minimum-allocation-vcores >= 0 |
| The maximum vcore allocation must be greater than or equal to 1 | yarn.scheduler.maximum-allocation-vcores >= 1 |
| The vcores given to each host's NodeManager must be greater than or equal to the minimum vcore allocation | yarn.nodemanager.resource.cpu-vcores >= yarn.scheduler.minimum-allocation-vcores |
| The vcores given to each host's NodeManager must be greater than or equal to the maximum vcore allocation | yarn.nodemanager.resource.cpu-vcores >= yarn.scheduler.maximum-allocation-vcores |
| The memory given to each host's NodeManager must be greater than or equal to the maximum scheduler memory allocation | yarn.nodemanager.resource.memory-mb >= yarn.scheduler.maximum-allocation-mb |
| The memory given to each host's NodeManager must be greater than or equal to the minimum scheduler memory allocation | yarn.nodemanager.resource.memory-mb >= yarn.scheduler.minimum-allocation-mb |
| Minimum container size | If yarn.scheduler.minimum-allocation-mb is less than 1 GB, containers may be killed by YARN because of OutOfMemory errors |
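The same constraints can be checked mechanically; a minimal sketch over the values used in this article:

```python
# Sanity checks over the container configuration used in this article.
conf = {
    "yarn.nodemanager.resource.cpu-vcores":      44,
    "yarn.nodemanager.resource.memory-mb":       250880,
    "yarn.scheduler.minimum-allocation-vcores":  1,
    "yarn.scheduler.maximum-allocation-vcores":  44,
    "yarn.scheduler.minimum-allocation-mb":      1024,
    "yarn.scheduler.maximum-allocation-mb":      250880,
}

assert conf["yarn.scheduler.maximum-allocation-vcores"] >= conf["yarn.scheduler.minimum-allocation-vcores"]
assert conf["yarn.scheduler.maximum-allocation-mb"]     >= conf["yarn.scheduler.minimum-allocation-mb"]
assert conf["yarn.scheduler.minimum-allocation-vcores"] >= 0
assert conf["yarn.scheduler.maximum-allocation-vcores"] >= 1
assert conf["yarn.nodemanager.resource.cpu-vcores"]     >= conf["yarn.scheduler.maximum-allocation-vcores"]
assert conf["yarn.nodemanager.resource.memory-mb"]      >= conf["yarn.scheduler.maximum-allocation-mb"]
print("container configuration is consistent")
```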
MapReduce configuration
ApplicationMaster configuration
| Configuration parameter | Value | Description |
| --- | --- | --- |
| yarn.app.mapreduce.am.resource.cpu-vcores | 1 | Number of virtual CPU cores for the ApplicationMaster |
| yarn.app.mapreduce.am.resource.mb | 1024 | Physical memory required by the ApplicationMaster (MiB) |
| yarn.app.mapreduce.am.command-opts | 800 | Java command-line options passed to the MapReduce ApplicationMaster; here the AM Java heap size, 800 MB |
The ratio of heap to container size
| Configuration parameter | Value | Description |
| --- | --- | --- |
| Automatic task heap sizing | Yes | |
| mapreduce.job.heap.memory-mb.ratio | 0.8 | Ratio of a map/reduce task's heap size to its container size. The heap must be smaller than the container to leave room for JVM overhead; the default is 0.8 |
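With automatic heap sizing enabled, the task heap is simply the container size scaled by this ratio; a sketch of the arithmetic:

```python
# Task heap derived from container size and mapreduce.job.heap.memory-mb.ratio.
heap_ratio = 0.8

for container_mb in (1024, 2048):
    heap_mb = int(container_mb * heap_ratio)
    print(container_mb, "->", heap_mb)   # 1024 -> 819, 2048 -> 1638
```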
Map task configuration
| Configuration parameter | Value | Description |
| --- | --- | --- |
| mapreduce.map.cpu.vcores | 1 | Vcores allocated to each map task |
| mapreduce.map.memory.mb | 1024 | Memory allocated to each map task: 1 GB |
| mapreduce.task.io.sort.mb | 400 | I/O sort memory buffer (MiB); default 256 MB, usually no need to change |
Reduce task configuration
| Configuration parameter | Value | Description |
| --- | --- | --- |
| mapreduce.reduce.cpu.vcores | 1 | Vcores allocated to each reduce task |
| mapreduce.reduce.memory.mb | 1024 | Memory allocated to each reduce task: 1 GB |
MapReduce configuration sanity checks
ApplicationMaster configuration check
yarn.scheduler.minimum-allocation-vcores <= yarn.app.mapreduce.am.resource.cpu-vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= yarn.app.mapreduce.am.resource.mb <= yarn.scheduler.maximum-allocation-mb
The Java heap size should be 75%~90% of the container size: lower wastes resources, higher risks OOM.
Map task configuration check
yarn.scheduler.minimum-allocation-vcores <= mapreduce.map.cpu.vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= mapreduce.map.memory.mb <= yarn.scheduler.maximum-allocation-mb
Spill/sort memory (mapreduce.task.io.sort.mb) should be 40%~60% of each task's heap memory.
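With the values used in this article that constraint holds; a quick check (sketch):

```python
# Spill/sort buffer as a fraction of the task heap.
heap_mb = int(1024 * 0.8)   # task heap: container size * heap ratio = 819 MB
sort_mb = 400               # mapreduce.task.io.sort.mb

ratio = sort_mb / heap_mb
print(round(ratio, 2))      # 0.49, inside the 40%~60% band
assert 0.4 <= ratio <= 0.6
```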
Reduce task configuration check
yarn.scheduler.minimum-allocation-vcores <= mapreduce.reduce.cpu.vcores <= yarn.scheduler.maximum-allocation-vcores
yarn.scheduler.minimum-allocation-mb <= mapreduce.reduce.memory.mb <= yarn.scheduler.maximum-allocation-mb
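All three resource requests (AM, map, reduce) must fit inside the scheduler bounds; a sketch that encodes these checks, using the values from this article:

```python
def within(lo, value, hi):
    """True when a request sits inside the scheduler's allocation bounds."""
    return lo <= value <= hi

v_min, v_max = 1, 44          # yarn.scheduler.{minimum,maximum}-allocation-vcores
m_min, m_max = 1024, 250880   # yarn.scheduler.{minimum,maximum}-allocation-mb

requests = {
    "yarn.app.mapreduce.am.resource.cpu-vcores": (1, v_min, v_max),
    "yarn.app.mapreduce.am.resource.mb":         (1024, m_min, m_max),
    "mapreduce.map.cpu.vcores":                  (1, v_min, v_max),
    "mapreduce.map.memory.mb":                   (1024, m_min, m_max),
    "mapreduce.reduce.cpu.vcores":               (1, v_min, v_max),
    "mapreduce.reduce.memory.mb":                (1024, m_min, m_max),
}

for name, (value, lo, hi) in requests.items():
    assert within(lo, value, hi), name
print("all MapReduce resource requests fit the scheduler bounds")
```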
Summary of YARN and MapReduce configuration parameters
| YARN/MapReduce parameter | Description |
| --- | --- |
| yarn.nodemanager.resource.cpu-vcores | Vcores a NodeManager can allocate to containers |
| yarn.nodemanager.resource.memory-mb | Memory a NodeManager can allocate to containers |
| yarn.scheduler.minimum-allocation-vcores | Minimum vcores allocated to a container |
| yarn.scheduler.maximum-allocation-vcores | Maximum vcores allocated to a container |
| yarn.scheduler.increment-allocation-vcores | Vcore allocation increment for containers |
| yarn.scheduler.minimum-allocation-mb | Minimum memory allocated to a container |
| yarn.scheduler.maximum-allocation-mb | Maximum memory allocated to a container |
| yarn.scheduler.increment-allocation-mb | Memory allocation increment for containers |
| yarn.app.mapreduce.am.resource.cpu-vcores | Vcores for the ApplicationMaster |
| yarn.app.mapreduce.am.resource.mb | Memory for the ApplicationMaster |
| mapreduce.map.cpu.vcores | Vcores for each map task |
| mapreduce.map.memory.mb | Memory for each map task |
| mapreduce.reduce.cpu.vcores | Vcores for each reduce task |
| mapreduce.reduce.memory.mb | Memory for each reduce task |
| mapreduce.task.io.sort.mb | I/O sort buffer size |
Note: in CDH 5.5 and later, the parameters mapreduce.map.java.opts, mapreduce.reduce.java.opts and yarn.app.mapreduce.am.command-opts are configured automatically from the container memory and the heap ratio.