RDD creation method of spark
2022-07-06 02:04:00 【Diligent ls】
There are three ways to create an RDD in Spark: create an RDD from a collection, create an RDD from external storage, or create an RDD from another RDD.
Environment dependencies (Maven pom.xml) used for the examples below:
<dependencies>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>3.0.0</version>
    </dependency>
</dependencies>
<build>
    <finalName>SparkCoreTest</finalName>
    <plugins>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.4.6</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
1. Create from a collection
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object createrdd {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
      .setAppName("SparkCoreTest")
      .setMaster("local[*]")
    val sc: SparkContext = new SparkContext(conf)
    // Create an RDD with parallelize()
    //val rdd: RDD[Int] = sc.parallelize(Array(1,2,3,4,5,6))
    //rdd.collect().foreach(println)
    // Create an RDD with makeRDD()
    val rdd1: RDD[Int] = sc.makeRDD(Array(1,2,3,4,5,6))
    rdd1.collect().foreach(println)
    sc.stop()
  }
}
Note: makeRDD is not exactly the same as parallelize. One of its overloads additionally accepts preferred location information for each element.
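For illustration, a minimal sketch of that overload, reusing the sc from the example above; host1 and host2 are placeholder host names, not real nodes:

// Sketch: the makeRDD overload that takes (value, preferred locations) pairs.
// The resulting RDD contains the values; the host names are only placement hints.
val rddWithLocations: RDD[Int] = sc.makeRDD(Seq(
  (1, Seq("host1")),
  (2, Seq("host2"))
))
rddWithLocations.collect().foreach(println)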
2. Create from a dataset of an external storage system
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object crearedd2 {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
      .setAppName("WC")
      .setMaster("local[*]")
    val sc: SparkContext = new SparkContext(conf)
    // Read every file under the "input" directory; each line becomes one element
    val value: RDD[String] = sc.textFile("input")
    value.foreach(println)
    sc.stop()
  }
}
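As a side note, textFile also takes an optional minimum number of partitions, and the path may contain wildcards. A minimal sketch, reusing the sc from the example above; the path and partition count are illustrative:

// Sketch: request at least 4 partitions when reading text files.
val lines: RDD[String] = sc.textFile("input/*.txt", minPartitions = 4)
println(lines.getNumPartitions)  // the actual partition count may differ from the hint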
3. Create from another RDD
A new RDD is mainly created by applying transformations to an existing RDD.
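A minimal sketch of this third way, following the same structure as the examples above; the object name createrdd3 and the sample values are made up for illustration:

import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object createrdd3 {
  def main(args: Array[String]): Unit = {
    val conf: SparkConf = new SparkConf()
      .setAppName("SparkCoreTest")
      .setMaster("local[*]")
    val sc: SparkContext = new SparkContext(conf)
    // Start from an existing RDD ...
    val source: RDD[Int] = sc.makeRDD(Array(1, 2, 3, 4, 5, 6))
    // ... and derive new RDDs from it through transformations
    val doubled: RDD[Int] = source.map(_ * 2)
    val evens: RDD[Int] = doubled.filter(_ % 2 == 0)
    evens.collect().foreach(println)
    sc.stop()
  }
}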