Principle analysis of Spark
2022-07-02 07:07:00 【Boring n day】
Preface
Today's topic is an analysis of how the Spark framework works: the Spark execution flow, how an RDD is executed, and an introduction to RDD dependencies.
1. A brief introduction to Spark
Spark is written in Scala. Scala runs on the Java platform (JVM) and is compatible with existing Java programs, so a program written in Scala compiles to JVM bytecode and runs on a standard Java runtime. No separate Scala virtual machine is needed.
Spark compared with MapReduce

As the figure above shows, iterative computation with Hadoop MapReduce is very resource-intensive, because every iteration goes back to disk for its data.
Once Spark has loaded the data into memory, later iterations can work directly on the intermediate results held in memory, which avoids repeatedly reading data from disk.
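The difference can be sketched with a toy model. This is not Spark or Hadoop code: `CountingDisk`, `iterate_via_disk` and `iterate_in_memory` are hypothetical names invented for illustration, and the "disk" is just a dictionary that counts accesses. It only shows why keeping intermediate results in memory saves disk traffic.

```python
# Toy model only: the "disk" is a dict and every access is counted.
# None of these names are Spark or Hadoop APIs.

class CountingDisk:
    """A fake disk that counts how many times data is read and written."""

    def __init__(self, data):
        self._store = {"input": list(data)}
        self.reads = 0
        self.writes = 0

    def read(self, key):
        self.reads += 1
        return list(self._store[key])

    def write(self, key, values):
        self.writes += 1
        self._store[key] = list(values)


def iterate_via_disk(disk, steps):
    """MapReduce-style: every iteration reads its input from disk and writes its output back."""
    key = "input"
    for i in range(steps):
        values = disk.read(key)
        key = f"iter-{i}"
        disk.write(key, [v * 2 for v in values])
    return disk.read(key)


def iterate_in_memory(disk, steps):
    """Spark-style: read the input once, then keep intermediate results in memory."""
    values = disk.read("input")
    for _ in range(steps):
        values = [v * 2 for v in values]
    return values


disk_a = CountingDisk([1, 2, 3])
result_a = iterate_via_disk(disk_a, steps=3)

disk_b = CountingDisk([1, 2, 3])
result_b = iterate_in_memory(disk_b, steps=3)

print("via disk :", result_a, "reads:", disk_a.reads, "writes:", disk_a.writes)
print("in memory:", result_b, "reads:", disk_b.reads, "writes:", disk_b.writes)
```

Both variants compute the same result, but the disk-based loop touches the fake disk on every iteration while the in-memory loop reads it only once.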
2. Basic concepts and architecture design

The basic Spark execution flow (using YARN as an example)

- When the client submits an application, the Driver first creates a SparkContext (SC), the basic runtime environment for the application. The SC registers with the ResourceManager (RM) and is responsible for requesting resources and for assigning and monitoring tasks. Here the Driver can be understood as the application written by the user, and the SparkContext (SC) plays a role similar to that of the ApplicationMaster (AM) in YARN.
- After receiving the request, the RM allocates resources and Executors are started. The Executors register with the SC and request Tasks from it, and they keep communicating with the SC so that the connection is not lost.
- After the job finishes, the SC asks the RM to deregister the application and then shuts itself down.
A basic overview of RDD operation
The typical execution process of an RDD is as follows:

1. An RDD is created by reading an external data source. If the data source is large, multiple partitions are created, and different partitions are placed on different data nodes. This is where the name comes from: RDD stands for Resilient Distributed Dataset.
2. The RDD goes through a series of transformations. Each transformation produces a new RDD that serves as input to the next transformation, and together these steps form a DAG (directed acyclic graph).
3. The last RDD is written to an external data source by an action.
During this process, transformations only record how each RDD is derived and do not produce any concrete results. The results are computed only when an action is encountered.
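The lazy behaviour described above can be illustrated with a small pure-Python model. `ToyRDD` and its methods are hypothetical and only mimic the idea. This is not the Spark RDD API, and there is no real distribution across nodes: partitions are plain Python lists.

```python
# Toy model only: ToyRDD is a hypothetical class, not the Spark RDD API.

class ToyRDD:
    def __init__(self, partitions=None, parent=None, op_name="source", fn=None):
        self._partitions = partitions  # only set on the source RDD
        self.parent = parent           # the RDD this one was derived from
        self.op_name = op_name
        self._fn = fn                  # per-partition function, applied lazily

    @classmethod
    def from_list(cls, data, num_partitions):
        # Step 1: create the RDD from a data source, split into partitions.
        parts = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(partitions=parts)

    # Step 2: transformations only record a new node in the lineage.
    def map(self, f):
        return ToyRDD(parent=self, op_name="map",
                      fn=lambda part: [f(x) for x in part])

    def filter(self, pred):
        return ToyRDD(parent=self, op_name="filter",
                      fn=lambda part: [x for x in part if pred(x)])

    def lineage(self):
        """Return the chain of operations from the source to this RDD."""
        names, node = [], self
        while node is not None:
            names.append(node.op_name)
            node = node.parent
        return list(reversed(names))

    def _compute(self):
        if self.parent is None:
            return self._partitions
        return [self._fn(part) for part in self.parent._compute()]

    # Step 3: an action walks the lineage and actually computes the result.
    def collect(self):
        return [x for part in self._compute() for x in part]


calls = []

def double(x):
    calls.append(x)  # record every real invocation
    return x * 2

source = ToyRDD.from_list([1, 2, 3, 4, 5, 6], num_partitions=2)
result_rdd = source.map(double).filter(lambda x: x > 4)

calls_before_action = len(calls)  # nothing has run yet
output = result_rdd.collect()     # the action triggers the computation

print(result_rdd.lineage())
print(calls_before_action, len(calls))
print(output)
```

In this sketch `double` is not called until `collect()` runs, which mirrors how transformations only build up the lineage and an action triggers the work.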

RDD dependencies
As the figure above shows, RDD dependencies are either wide or narrow. Leaving narrow dependencies aside for now, a wide dependency means that one partition of the parent RDD corresponds to multiple partitions of the child RDD.
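As a rough illustration, a dependency can be classified from a mapping of child partitions to the parent partitions they read. The function `classify_dependency` and the three example mappings are hypothetical, not Spark API. The sketch also assumes the usual complementary definition of a narrow dependency (each parent partition feeds at most one child partition), which the text above does not spell out.

```python
from collections import Counter

# Toy classification only; none of these names exist in Spark.

def classify_dependency(child_to_parents):
    """Classify a dependency as 'narrow' or 'wide'.

    child_to_parents maps each child partition index to the list of
    parent partition indices it reads from.
    """
    # Count how many child partitions each parent partition feeds.
    fan_out = Counter(
        parent
        for parents in child_to_parents.values()
        for parent in set(parents)
    )
    # Wide: some parent partition is read by more than one child partition.
    return "wide" if any(n > 1 for n in fan_out.values()) else "narrow"


# One-to-one, as in a map-like transformation.
map_like = {0: [0], 1: [1], 2: [2]}

# Several parent partitions merged into one child partition, without a shuffle.
merge_like = {0: [0, 1], 1: [2, 3]}

# Every child partition reads from every parent partition, as in a shuffle.
shuffle_like = {0: [0, 1, 2], 1: [0, 1, 2]}

for name, dep in [("map_like", map_like),
                  ("merge_like", merge_like),
                  ("shuffle_like", shuffle_like)]:
    print(name, "->", classify_dependency(dep))
```

Only the shuffle-like mapping is classified as wide here, because each of its parent partitions is read by more than one child partition.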
Summary
Today's post is rather thin. It was mainly written to consolidate what I learned today, and I'll go into more detail when I have time.