1 Introduction to Spark Fundamentals
2022-07-03 22:12:00 【Ruthless coding machine】
1 What is Spark?



2 Spark: ten years of development


3 Supplementary material


4 The four characteristics of Spark





5 Spark framework modules

6 Spark run modes

7 Spark role architecture



8 Summary
1 What problem does Spark solve?
Computation over massive data sets. It supports both offline batch processing and real-time stream computation.
2 What modules does Spark have?
The core (Spark Core), SQL computation (Spark SQL), stream computation (Spark Streaming), graph computation (GraphX), and machine learning (MLlib).
3 What are Spark's characteristics?
Fast, easy to use, general-purpose, and able to run in multiple modes.
4 What are the advantages and disadvantages of Hadoop's process-based computation and Spark's thread-based approach?
In Hadoop MapReduce, each map/reduce task runs as its own Java process. The advantage is that processes are independent of one another: each task has exclusive use of its process's resources, tasks do not interfere with each other, and they are easy to monitor. The drawback is that tasks cannot conveniently share data, so execution efficiency is relatively low. For example, when multiple map tasks read data source files, the data has to be loaded into each map task, which causes repeated loading and wastes memory. Thread-based computation is meant to share data and improve execution efficiency. Spark uses the thread as its smallest unit of execution, but the drawback is that threads compete with each other for resources.
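The trade-off above can be illustrated with a minimal Python sketch. It is not Spark or Hadoop code: the names `load_dataset` and `count_matches` are hypothetical, and a standard-library thread pool stands in for Spark's task threads. The data is loaded once in the process and every task thread reads the same in-memory object, whereas process-per-task execution would have each task load its own copy.

```python
from concurrent.futures import ThreadPoolExecutor

load_calls = 0  # counts how many times the data is loaded


def load_dataset():
    """Hypothetical loader standing in for reading a data source file."""
    global load_calls
    load_calls += 1
    return list(range(1000))


def count_matches(data, remainder):
    """One 'task': count the values whose remainder modulo 4 matches."""
    return sum(1 for value in data if value % 4 == remainder)


# Load once in the parent process; all task threads share this object.
shared_data = load_dataset()

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(count_matches, shared_data, r) for r in range(4)]
    results = [f.result() for f in futures]
```

Because the tasks here only read the shared list, no locking is needed. Once tasks start writing to shared state, the resource competition mentioned above becomes a concern.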
5 What are Spark's run modes?
• Local mode
• Cluster mode (Standalone, YARN, K8S)
• Cloud mode
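As a rough illustration of how a run mode is selected, the sketch below builds the master URL string that Spark accepts for the local and cluster modes (for example via `--master` in `spark-submit`). The helper `master_url` and the host names and ports in the usage are hypothetical. Cloud mode is left out because it depends on the provider.

```python
def master_url(mode, host=None, port=None, cores="*"):
    """Return a Spark master URL for the given run mode (illustrative helper)."""
    if mode == "local":
        # local[*] uses as many worker threads as there are logical cores
        return f"local[{cores}]"
    if mode == "standalone":
        return f"spark://{host}:{port}"
    if mode == "yarn":
        # the cluster location is taken from the Hadoop configuration
        return "yarn"
    if mode == "k8s":
        return f"k8s://https://{host}:{port}"
    raise ValueError(f"unknown run mode: {mode}")


# Host names and ports below are placeholders, not real endpoints.
urls = [
    master_url("local"),
    master_url("local", cores=2),
    master_url("standalone", host="master-host", port=7077),
    master_url("yarn"),
    master_url("k8s", host="k8s-apiserver", port=6443),
]
```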
6 What are Spark's runtime roles (compared with YARN)?
• Master: cluster resource management (similar to ResourceManager)
• Worker: single-machine resource management (similar to NodeManager)
• Driver: manager of a single job (similar to ApplicationMaster)
• Executor: executor of a single job's tasks (similar to a Task inside a YARN container)
Basic concepts of threads
A thread is the basic unit of CPU scheduling
A process usually contains multiple threads, and the threads within a process share that process's resources
Threads in different processes are not visible to each other
A thread cannot execute independently; it must exist within a process
One thread can create and terminate another thread
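Some of these points can be checked with a short Python sketch using the standard `threading` module. The names `counter`, `worker`, and `parent` are illustrative. All threads share the process's memory (a single `counter`), a lock guards against competition for that shared resource, and each `parent` thread creates another thread and waits for it to finish.

```python
import threading

counter = 0                      # process-level state shared by all threads
lock = threading.Lock()          # guards the shared state
seen_thread_names = []


def worker(increments):
    """Add to the shared counter; the lock avoids lost updates."""
    global counter
    for _ in range(increments):
        with lock:
            counter += 1
    with lock:
        seen_thread_names.append(threading.current_thread().name)


def parent():
    """A thread that creates another thread and waits for it to finish."""
    child = threading.Thread(target=worker, args=(1000,), name="child")
    child.start()
    child.join()
    worker(1000)


threads = [threading.Thread(target=parent, name=f"parent-{i}") for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Python offers no call for one thread to forcibly terminate another, so the sketch only shows creation and joining.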