Spark Memory Management Mechanism (New Version)
2022-07-25 15:10:00 【南风知我意丿】
I. Memory Parameters
| Property Name | Default |
|---|---|
| spark.memory.fraction | 0.6 |
| spark.memory.storageFraction | 0.5 |
| RESERVED_SYSTEM_MEMORY_BYTES | 300M |
1. Diagram:

2. Example:
Calculate the memory regions for an executor with 5 GB of memory:
To calculate Reserved Memory, User Memory, Spark Memory, Storage Memory, and Execution Memory, we use the following parameters:
spark.executor.memory=5g
spark.memory.fraction=0.6
spark.memory.storageFraction=0.5
Java Heap Memory = 5 GB
= 5 * 1024 MB
= 5120 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory)
= 5120 MB - 300 MB
= 4820 MB
User Memory = Usable Memory * (1.0 - spark.memory.fraction)
= 4820 MB * (1.0 - 0.6)
= 4820 MB * 0.4
= 1928 MB
Spark Memory = Usable Memory * spark.memory.fraction
= 4820 MB * 0.6
= 2892 MB
Spark Storage Memory = Spark Memory * spark.memory.storageFraction
= 2892 MB * 0.5
= 1446 MB
Spark Execution Memory = Spark Memory * (1.0 - spark.memory.storageFraction)
= 2892 MB * ( 1 - 0.5)
= 2892 MB * 0.5
= 1446 MB

Reserved Memory: 300 MB (5.86% of the heap)
User Memory: 1928 MB (37.66% of the heap)
Spark Memory: 2892 MB (56.48% of the heap)
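The arithmetic above can be sketched as a small function. This is only an illustration, not Spark's own code: the name `computeMemoryRegions` and its parameters are hypothetical, and all sizes are in MB as in the worked example.

```javascript
// Hypothetical helper mirroring the worked example above; all sizes are in MB.
function computeMemoryRegions(heapMB, memoryFraction = 0.6, storageFraction = 0.5, reservedMB = 300) {
  const usable = heapMB - reservedMB;      // heap minus Reserved Memory
  const spark = usable * memoryFraction;   // Storage + Execution
  const user = usable - spark;             // same as usable * (1 - memoryFraction)
  const storage = spark * storageFraction; // Spark Storage Memory
  const execution = spark - storage;       // Spark Execution Memory
  return { reserved: reservedMB, usable, user, spark, storage, execution };
}

// Print each region and its share of a 5 GB (5120 MB) heap.
const heapMB = 5 * 1024;
const regions = computeMemoryRegions(heapMB);
for (const [name, mb] of Object.entries(regions)) {
  const share = (100 * mb / heapMB).toFixed(2);
  console.log(`${name}: ${mb.toFixed(1)} MB (${share}% of heap)`);
}
```

Changing `memoryFraction` or `storageFraction` in the call shows how the split moves when `spark.memory.fraction` or `spark.memory.storageFraction` is tuned.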
II. How Spark Memory Allocation Appears in the Spark UI
1. Spark UI with On-Heap Memory
- 1. Submit parameters
spark-shell \
--driver-memory 5g \
--executor-memory 5g
- 2. Spark UI display

Here the Spark UI shows a Storage Memory of only 2.7 GB. Let's work out where this number comes from.
- 3. Storage Memory calculation
Java Heap Memory = 5 GB
Reserved Memory = 300 MB
Usable Memory = 4820 MB
User Memory = 1928 MB
Spark Memory = 2892 MB = 2.8242 GB
Spark Storage Memory = 1446 MB = 1.4121 GB
Spark Execution Memory = 1446 MB = 1.4121 GB
The Spark UI reports a Storage Memory of 2.7 GB, but the Storage Memory we calculated is 1.4121 GB. This tells us that the Storage Memory shown in the Spark UI is the sum of Storage Memory and Execution Memory.
Spark UI Storage Memory = Spark Storage Memory + Spark Execution Memory
= 1.4121 GB + 1.4121 GB
= 2.8242 GB
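The Spark UI Storage Memory (2.7 GB) still does not match our calculated value (2.8242 GB). Although we set --executor-memory 5g, the maximum heap that Spark actually obtains from the JVM at runtime (Runtime.getRuntime.maxMemory) is smaller than the configured size, so Java Heap Memory is only 4772593664 bytes.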
Java Heap Memory = 4772593664 bytes = 4772593664/(1024 * 1024) = 4551 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4551 - 300) MB = 4251 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 1700.4 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2550.6 MB
Spark Storage Memory = 1275.3 MB
Spark Execution Memory = 1275.3 MB
Spark Memory (2550.6 MB / 2.4908 GB) still does not match the Spark UI (2.7 GB). This is because we converted the Java Heap Memory from bytes to MB by dividing by 1024 * 1024, whereas the Spark UI converts bytes to MB by dividing by 1000 * 1000.
Java Heap Memory = 4772593664 bytes = 4772593664/(1000 * 1000) = 4772.593664 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (4772.593664 - 300) MB = 4472.593664 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 1789.0374656 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB = ~ 2.7 GB
Spark Storage Memory = 1341.7780992 MB
Spark Execution Memory = 1341.7780992 MB
With this, Spark Memory = (Usable Memory * spark.memory.fraction) = 2683.5561984 MB = ~ 2.7 GB, which matches the Spark UI.
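The calculations above differ only in the heap size and the divisor they use. The sketch below redoes the last one in bytes, the way the demo program in section 3 does, taking Reserved Memory as 300 * 1024 * 1024 bytes. The function name `sparkMemoryBytes` is hypothetical, and the heap size is the 4772593664 bytes quoted above. Because Reserved Memory is then slightly more than 300 decimal MB, the result is a little below 2683.56 MB, but it still rounds to 2.7 GB.

```javascript
// Reserved Memory in bytes, as defined in the demo program (300 * 1024 * 1024).
const RESERVED_BYTES = 300 * 1024 * 1024;

// Hypothetical helper: Spark Memory (Storage + Execution) in bytes.
function sparkMemoryBytes(maxHeapBytes, memoryFraction = 0.6) {
  return Math.floor((maxHeapBytes - RESERVED_BYTES) * memoryFraction);
}

const heapBytes = 4772593664; // heap size quoted in the text
const sparkBytes = sparkMemoryBytes(heapBytes);

const decimalGB = (sparkBytes / (1000 * 1000 * 1000)).toFixed(1); // divide by powers of 1000
const binaryGiB = (sparkBytes / (1024 * 1024 * 1024)).toFixed(1); // divide by powers of 1024

console.log(sparkBytes + " bytes");
console.log(decimalGB + " GB");   // prints "2.7 GB"
console.log(binaryGiB + " GiB");  // prints "2.5 GiB"
```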
- 4. Bytes-to-MB conversion rules in different versions
- 4.1 Spark 2.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes == 0) return '0.0 B';
    var k = 1000;
    var dm = 1;
    var sizes = ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
- 4.2 Spark 3.x
function formatBytes(bytes, type) {
    if (type !== 'display') return bytes;
    if (bytes <= 0) return '0.0 B';
    var k = 1024;
    var dm = 1;
    var sizes = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'];
    var i = Math.floor(Math.log(bytes) / Math.log(k));
    return parseFloat((bytes / Math.pow(k, i)).toFixed(dm)) + ' ' + sizes[i];
}
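To compare the two versions side by side, the sketch below folds them into one function that takes the base as a parameter. `formatBytesWithBase` is a hypothetical name, not a function in Spark, and the `type` argument of the originals is dropped for brevity. The sample value 2674812518 is (4772593664 - 300 * 1024 * 1024) * 0.6 rounded down, that is, the Spark Memory in bytes for the heap discussed above.

```javascript
// Hypothetical merge of the two formatBytes variants above; `k` selects the base.
function formatBytesWithBase(bytes, k) {
  if (bytes <= 0) return '0.0 B';
  const sizes = k === 1000
    ? ['B', 'KB', 'MB', 'GB', 'TB', 'PB', 'EB', 'ZB', 'YB']
    : ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB', 'ZiB', 'YiB'];
  const i = Math.floor(Math.log(bytes) / Math.log(k)); // index of the unit to use
  return parseFloat((bytes / Math.pow(k, i)).toFixed(1)) + ' ' + sizes[i];
}

const sparkMemory = 2674812518; // bytes
console.log(formatBytesWithBase(sparkMemory, 1000)); // Spark 2.x style: "2.7 GB"
console.log(formatBytesWithBase(sparkMemory, 1024)); // Spark 3.x style: "2.5 GiB"
```

The same byte count therefore appears as a smaller number in the Spark 3.x UI, although the underlying memory is unchanged.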
2. Spark UI with Off-Heap Enabled
- 1. Submit parameters
spark-shell \
--driver-memory 1g \
--executor-memory 1g \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=5g
- 2. Spark UI display

- 3. Storage Memory calculation
Storage Memory = On Heap Memory + Off Heap Memory
- 3.1.On Heap Memory
Java Heap Memory = 954728448 bytes = 954728448/1000/1000 = 954 MB
Reserved Memory = 300 MB
Usable Memory = (Java Heap Memory - Reserved Memory) = (954 - 300) MB = 654 MB
User Memory = (Usable Memory * (1 -spark.memory.fraction)) = 261.6 MB
Spark Memory = (Usable Memory * spark.memory.fraction) = 392.4 MB
Spark Storage Memory = 196.2 MB
Spark Execution Memory = 196.2 MB
- 3.2.Off Heap Memory
spark.memory.offHeap.size = 5 GB = 5 * 1000 MB = 5000 MB
- 3.3 Storage Memory
Storage Memory = On Heap Memory + Off Heap Memory
= 392.4 MB + 5000 MB
= 5392.4 MB
= 5.4 GB
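The on-heap and off-heap figures above combine as in the sketch below. It follows the text's simplification of working in decimal MB (Reserved Memory as 300 MB, 5 GB as 5000 MB), and the function name `uiStorageMemoryMB` is hypothetical.

```javascript
// Hypothetical helper: Storage Memory shown in the UI, in MB,
// as on-heap Spark Memory plus the configured off-heap size.
function uiStorageMemoryMB(heapMB, offHeapMB, memoryFraction = 0.6, reservedMB = 300) {
  const onHeapSparkMemory = (heapMB - reservedMB) * memoryFraction;
  return onHeapSparkMemory + offHeapMB;
}

const totalMB = uiStorageMemoryMB(954, 5000);
console.log(totalMB.toFixed(1) + " MB");          // prints "5392.4 MB"
console.log((totalMB / 1000).toFixed(1) + " GB"); // prints "5.4 GB"
```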
3. Demo program for calculating Spark Storage Memory
// JVM Arguments: -Xmx5g
public class SparkMemoryCalculation {

    private static final long MB = 1024 * 1024;
    private static final long RESERVED_SYSTEM_MEMORY_BYTES = 300 * MB;
    private static final double SPARK_MEMORY_STORAGE_FRACTION = 0.5;
    private static final double SPARK_MEMORY_FRACTION = 0.6;

    public static void main(String[] args) {
        // Maximum heap reported by the JVM; this can be lower than the -Xmx value
        long systemMemory = Runtime.getRuntime().maxMemory();
        long usableMemory = systemMemory - RESERVED_SYSTEM_MEMORY_BYTES;
        long sparkMemory = convertDoubleToLong(usableMemory * SPARK_MEMORY_FRACTION);
        long userMemory = convertDoubleToLong(usableMemory * (1 - SPARK_MEMORY_FRACTION));
        long storageMemory = convertDoubleToLong(sparkMemory * SPARK_MEMORY_STORAGE_FRACTION);
        long executionMemory = convertDoubleToLong(sparkMemory * (1 - SPARK_MEMORY_STORAGE_FRACTION));

        // Memory regions in binary MB (bytes / (1024 * 1024))
        printMemoryInMB("Heap Memory\t\t", systemMemory);
        printMemoryInMB("Reserved Memory", RESERVED_SYSTEM_MEMORY_BYTES);
        printMemoryInMB("Usable Memory\t", usableMemory);
        printMemoryInMB("User Memory\t\t", userMemory);
        printMemoryInMB("Spark Memory\t", sparkMemory);
        printMemoryInMB("Storage Memory\t", storageMemory);
        printMemoryInMB("Execution Memory", executionMemory);
        System.out.println();

        // The same values in decimal MB (bytes / (1000 * 1000)), as the Spark 2.x UI converts them.
        // The first line prints sparkMemory, which is what the UI labels "Storage Memory".
        printStorageMemoryInMB("Spark Storage Memory", sparkMemory);
        printStorageMemoryInMB("Storage Memory UI \t", storageMemory);
        printStorageMemoryInMB("Execution Memory UI", executionMemory);
    }

    private static void printMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / MB) + " MB");
    }

    private static void printStorageMemoryInMB(String type, long memory) {
        System.out.println(type + " \t=\t" + (memory / (1000 * 1000)) + " MB");
    }

    // Truncates the fractional part of the value
    private static long convertDoubleToLong(double val) {
        return (long) val;
    }
}
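Summary
Reference:
https://community.cloudera.com/t5/Community-Articles/Spark-Memory-Management/ta-p/317794
The original article is written like a work of art.
Finally, a quote for everyone: "Knowledge, even the mere shadow of knowledge, will become your armor and protect you from being devoured by ignorance" (from Zhihu, "Why Read?")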