Spark computation operators and a few small Linux details
2022-07-06 17:39:00 【Bald Second Senior brother】
Spark map operator
map operator:
import org.apache.spark.{SparkConf, SparkContext}

object Spark01_Oper {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("Value")
    val sc = new SparkContext(conf)
    val make = sc.makeRDD(1 to 10)
    // map operator: double every element
    val mapRdd = make.map(x => x * 2)
    mapRdd.collect().foreach(println)
  }
}
The map operator applies the given function to the data in every partition, one element at a time.
mapPartitions operator
import org.apache.spark.{SparkConf, SparkContext}

object Spark02_OPer {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("mapPart")
    val sc = new SparkContext(conf)
    val list = sc.makeRDD(1 to 10)
    // mapPartitions operator: datas is an iterator over one whole partition
    val mapPartRdd = list.mapPartitions(datas => datas.map(data => data * 2))
    mapPartRdd.collect().foreach(println)
  }
}
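The mapPartitions operator is similar to map, but it processes the data one partition at a time: the function receives an iterator over all the elements of a partition and returns an iterator of results, rather than a single value per element.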
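To make the "once per partition" behaviour visible, here is a minimal sketch that sums each partition instead of transforming each element. The object name MapPartitionsSketch and the helper partitionSums are made up for this illustration; the RDD is explicitly split into 2 partitions so that the result has one entry per partition.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object, not part of the original post.
object MapPartitionsSketch {
  // The function passed to mapPartitions is called once per partition
  // and returns an iterator holding that partition's sum.
  def partitionSums(sc: SparkContext): Array[Int] = {
    val rdd = sc.makeRDD(1 to 10, 2) // two partitions: 1..5 and 6..10
    rdd.mapPartitions(iter => Iterator(iter.sum)).collect()
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("mapPartitionsSketch")
    val sc = new SparkContext(conf)
    try {
      // Prints 15 and 40, one line per partition
      partitionSums(sc).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

With map the same job would call the function 10 times; here it is called only twice, which is why mapPartitions is often chosen when there is per-call setup cost.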
mapPartitionsWithIndex operator
import org.apache.spark.{SparkConf, SparkContext}

object Spark03_OPer {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("With")
    val sc = new SparkContext(conf)
    // Split the data into 2 partitions
    val list = sc.makeRDD(1 to 10, 2)
    val indexRDD = list.mapPartitionsWithIndex {
      // num is the partition index, datas is an iterator over that partition
      case (num, datas) =>
        datas.map((_, "partition number: " + num))
    }
    indexRDD.collect().foreach(println)
  }
}
The mapPartitionsWithIndex operator is similar to mapPartitions, but func also receives an index value identifying the partition, so func takes one extra parameter of type Int.
Possible problems in Spark:
Every computation produces new data, but the earlier data is not released right away. If this keeps accumulating, it can cause an out-of-memory (OOM) error.
The difference between Driver and Executor
Driver:
Any class that creates the Spark context object can be called the Driver. The Driver is the program that submits the Application's code in Spark; it can be understood as the main program of the Spark code we write. It is also responsible for assigning tasks to the Executors. There can be only one Driver.
Executor:
The Executor is the part of Spark responsible for carrying out the computation. There can be multiple Executors.
Difference:
The Driver is like a boss and the Executors are its workers: the Driver assigns tasks to the Executors, and the Executors execute them.
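The split can be annotated in code. The following is a minimal sketch, assuming local mode as in the examples above (where the Driver and the Executor threads share one JVM, so the separation is only logical). The object name DriverExecutorSketch and the helper doubled are made up for this illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object, not part of the original post.
object DriverExecutorSketch {
  def doubled(sc: SparkContext): Array[Int] = {
    // Defined on the Driver; captured by the closure below and shipped with each task.
    val factor = 2
    val rdd = sc.makeRDD(1 to 10, 2)
    // The function passed to map is run by Executor tasks, one task per partition.
    val result = rdd.map(_ * factor)
    // collect() is an action: the Driver schedules the tasks and gathers their results.
    result.collect()
  }

  def main(args: Array[String]): Unit = {
    // Creating the SparkContext is what makes this program the Driver.
    val conf = new SparkConf().setMaster("local[*]").setAppName("driverExecutorSketch")
    val sc = new SparkContext(conf)
    try {
      // Back on the Driver: print the collected results.
      println(doubled(sc).mkString(","))
    } finally {
      sc.stop()
    }
  }
}
```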
Linux odds and ends
Ways to shut down and restart Linux
1. Shut down: shutdown -h now    Restart: shutdown -r now
2. Shut down: init 0    Restart: init 6
3. Shut down: poweroff    Restart: reboot
The difference between service and systemctl
service:
It can start, stop, and restart system services, and it can also display the current status of all system services. The service command works by finding the corresponding service under the /etc/init.d directory and starting or stopping it.
systemctl:
It is a systemd tool, mainly responsible for controlling the systemd system and service manager. It combines the functions of the service and chkconfig commands.
Operation of network devices:
Environment variable loading order