1 Introduction to Spark Basics
2022-07-03 22:12:00 【Ruthless coding machine】
1 What is Spark?
2 Spark's ten-year history
3 Further reading
4 The four characteristics of Spark
5 Spark framework modules
6 Spark run modes
7 Spark role architecture
8 Summary
1 What problem does Spark solve?
Computation over massive data sets. It supports both offline batch processing and real-time stream processing.
2 What modules does Spark have?
The core (Spark Core), SQL computing (Spark SQL), stream computing (Spark Streaming), graph computing (GraphX), and machine learning (MLlib).
3 What are Spark's characteristics?
Fast, easy to use, highly general-purpose, and able to run in multiple modes.
4 What are the pros and cons of Hadoop's process-based computing versus Spark's thread-based approach?
In Hadoop MapReduce (MR), every map/reduce task runs as its own Java process. The advantage is that processes are independent of one another: each task has exclusive use of its process's resources, tasks do not interfere with each other, and they are easy to monitor. The drawback is that sharing data between tasks is inconvenient and execution efficiency is relatively low. For example, when multiple map tasks read different data source files, the data source has to be loaded into each map task, which causes repeated loading and wastes memory.
Thread-based computing, by contrast, is designed for data sharing and higher execution efficiency. Spark uses the thread as its smallest unit of execution. The drawback is that threads compete with one another for resources.
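The trade-off described above can be sketched in plain Python using only the standard library. This is not Spark code. The names `load_lookup_table`, `shared_table`, and `run_task` are hypothetical and exist only for this illustration. The sketch loads a data source once, lets several threads read it, and uses a lock where the threads compete for shared mutable state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

load_count = 0  # how many times the (hypothetical) data source was loaded


def load_lookup_table():
    """Pretend to load a data source that every task needs."""
    global load_count
    load_count += 1
    return {"a": 1, "b": 2, "c": 3}


# Loaded once per process; every thread in the pool reads the same object.
shared_table = load_lookup_table()

# Shared mutable state is where threads compete, so it is guarded by a lock.
total = 0
total_lock = threading.Lock()


def run_task(keys):
    """One 'task': look keys up in the shared table and add to a shared total."""
    global total
    partial = sum(shared_table[k] for k in keys)
    with total_lock:  # without the lock, concurrent updates could be lost
        total += partial
    return partial


# Hypothetical partitions of work, one per task.
partitions = [["a", "b"], ["b", "c"], ["a", "c"], ["c", "c"]]
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(run_task, partitions))
```

If each task ran in its own process instead, every process would have to call `load_lookup_table()` itself, which is the repeated loading described above. In exchange, no lock would be needed, because nothing would be shared.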
5 What are Spark's run modes?
• Local mode
• Cluster mode (Standalone, YARN, K8S)
• Cloud mode
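In practice the run mode is usually selected through the master URL passed to Spark. The helper below is a minimal sketch of the URL shapes for the local and cluster modes listed above. The function name `master_url` and the host names used with it are assumptions for illustration, not part of Spark. Cloud mode is left out because it depends on the provider.

```python
def master_url(mode, host=None, port=None):
    """Return a Spark master URL string for the given run mode.

    Hypothetical helper; only the URL formats themselves come from Spark.
    """
    if mode == "local":
        # Run on a single machine, using as many worker threads as cores.
        return "local[*]"
    if mode == "standalone":
        # Spark's own cluster manager; 7077 is its default master port.
        return f"spark://{host}:{port or 7077}"
    if mode == "yarn":
        # The cluster location is read from the Hadoop/YARN configuration.
        return "yarn"
    if mode == "k8s":
        # Kubernetes API server address; the port must be given explicitly.
        if port is None:
            raise ValueError("k8s mode requires an explicit port")
        return f"k8s://https://{host}:{port}"
    raise ValueError(f"unknown mode: {mode}")
```

For example, `master_url("standalone", "master-host")` yields `spark://master-host:7077`, which could then be passed as the `--master` argument of `spark-submit`.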
6 What are Spark's runtime roles (compared with YARN)?
• Master: cluster resource management (similar to ResourceManager)
• Worker: single-machine resource management (similar to NodeManager)
• Driver: manager of a single job (similar to ApplicationMaster)
• Executor: executor of a single job (similar to a Task inside a YARN container)
Basic concepts of threads
A thread is the basic unit of CPU scheduling.
A process usually contains multiple threads, and the threads within a process share that process's resources.
Threads in different processes are not visible to one another.
A thread cannot execute independently; it must run inside a process.
One thread can create and terminate another thread.
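A few of these points can be observed with Python's standard `threading` module. In this minimal sketch, the names `shared`, `parent`, and `child` are hypothetical. One thread creates another, and both write to a dictionary owned by the process. Python's `threading` has no call to forcibly terminate a thread, so the sketch only waits for the child to finish.

```python
import os
import threading

# Lives in the process's memory, so every thread of this process can see it.
shared = {"pids": [], "names": []}


def child():
    shared["pids"].append(os.getpid())
    shared["names"].append(threading.current_thread().name)


def parent():
    shared["pids"].append(os.getpid())
    shared["names"].append(threading.current_thread().name)
    # One thread creates another thread.
    t = threading.Thread(target=child, name="child")
    t.start()
    t.join()  # wait for the child thread to finish


p = threading.Thread(target=parent, name="parent")
p.start()
p.join()
```

Both threads record the same process ID, and both entries end up in the same `shared` dictionary, because the threads run inside one process and share its memory.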