Spark in Practice 1: Setting Up a Spark Runtime Environment in Single-Node Local Mode
2022-07-03 12:39:00 【星哥玩云】
Preface:
Spark itself is written in Scala and runs on the JVM.
Java version: Java 6 or higher.
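1 Download Spark
http://spark.apache.org/downloads.html
You can pick whichever version you need. The one I chose here is:
http://d3kbcqa49mib13.cloudfront.net/spark-1.1.0-bin-hadoop1.tgz
If you are an ambitious programmer, you can also download the source code yourself: http://github.com/apache/spark.
Note: I am running on Linux here. If you do not have a Linux machine available, you can install Linux in a virtual machine!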
2 Extract the archive and enter the directory
tar -zxvf spark-1.1.0-bin-hadoop1.tgz
cd spark-1.1.0-bin-hadoop1/
3 Start the shell
./bin/spark-shell
A lot of log output is printed, ending with the scala> prompt.
4 A first try
Run the following statements one after another:
val lines = sc.textFile("README.md")
lines.count()
lines.first()
val pythonLines = lines.filter(line => line.contains("Python"))
scala> lines.first()
res0: String = ## Interactive Python Shel
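The statements above define pythonLines but never evaluate it. filter is a lazy transformation, so nothing is read until an action such as count() or first() is called. A minimal sketch of that next step, assuming it is typed into the same spark-shell session (so sc exists) and that README.md is in the current directory:

```scala
// Run inside spark-shell, where `sc` is already defined.
val lines = sc.textFile("README.md")

// Transformation: only describes the filtered RDD, no file access yet.
val pythonLines = lines.filter(line => line.contains("Python"))

// Actions: these trigger the actual read and computation.
pythonLines.count()   // number of lines that mention "Python"
pythonLines.first()   // first line that mentions "Python"
```

The exact values depend on the README.md shipped with your Spark version, so none are shown here.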
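Explanation: what is sc?
sc is the SparkContext object that the shell creates by default.
For example:
scala> sc
res13: org.apache.spark.SparkContext = org.apache.spark.SparkContext@...
Everything here runs locally only. To look ahead, here is a schematic of distributed computation: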
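To get a feel for what sc offers, it can be inspected and used to build a small RDD directly from a local collection. This is only a sketch, assuming it is typed into spark-shell; the names nums and total are made up for the illustration:

```scala
// Run inside spark-shell, where `sc` is already defined.
sc.master     // the master URL this context is connected to
sc.appName    // the application name registered for this context

// Build an RDD from a local collection and sum it with an action.
val nums = sc.parallelize(1 to 100)
val total = nums.reduce(_ + _)   // 5050
```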
5 A standalone program
Let's finish this section with an example.
To get it running, just follow the steps below.
The directory structure is as follows:
/usr/local/spark-1.1.0-bin-hadoop1/test$ find .
.
./src
./src/main
./src/main/scala
./src/main/scala/example.scala
./simple.sbt
The contents of simple.sbt are:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
The contents of example.scala are:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._

object example {
  def main(args: Array[String]) {
    // Describe the application: run locally, under the name "My App"
    val conf = new SparkConf().setMaster("local").setAppName("My App")
    // Create the context from the configuration built above
    val sc = new SparkContext(conf)
    // Shut the context down cleanly before exiting
    sc.stop()
    //System.exit(0)
    //sys.exit()
    println("this system exit ok!!!")
  }
}
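"local": a cluster URL that tells Spark how to connect to a cluster. Here it is local, which means running on this machine in a single thread without connecting to any cluster.
"My App": the name of the application.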
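The program above only starts and stops a context. As a sketch of what a slightly more useful body could look like, the variant below counts lines in a text file using the same local master and application name. The helper countLines, its parameters, and the command-line argument handling are assumptions added for illustration, not part of the original program; the object name example is kept so the build and submit steps do not change.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object example {
  // Hypothetical helper: returns (total lines, lines containing `word`).
  def countLines(sc: SparkContext, path: String, word: String): (Long, Long) = {
    val lines = sc.textFile(path)
    val matching = lines.filter(line => line.contains(word))
    (lines.count(), matching.count())
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the input path is the first argument, else README.md.
    val path = if (args.nonEmpty) args(0) else "README.md"

    val conf = new SparkConf().setMaster("local").setAppName("My App")
    val sc = new SparkContext(conf)
    try {
      val (total, matching) = countLines(sc, path, "Python")
      println("total lines: " + total)
      println("lines mentioning Python: " + matching)
    } finally {
      // Always release the context, even if the job fails.
      sc.stop()
    }
  }
}
```

A relative path such as README.md is resolved against the directory the job is launched from, so pass an explicit path if the file lives elsewhere.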
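Then run sbt package in the project directory.
Once it succeeds, submit the jar (adjust the relative paths to match the directory you run the command from):
./bin/spark-submit --class "example" ./target/scala-2.10/simple-project_2.10-1.0.jar
The result is as follows:
This shows that it really did run successfully!
The end!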