Spark Structured Streaming HelloWorld
2022-07-26 04:23:00 【其实我是真性情】
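Preface
A Scala tutorial on Spark Structured Streaming + Kafka + HBase; this post is the entry point for the series.
Main text
1. Choosing a Spark version
Pick the version that matches your own server. The documentation is at:
https://spark.apache.org/docs/
That page is just a list of version numbers; choose the Spark version that runs in your environment.
I am using 2.4.5 here; when this post was published, the latest version was 3.3.3.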
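If you are not sure which version your environment runs, one way to check is to ask Spark itself. The sketch below is only an illustration: the object name `SparkVersionCheck`, the helper `describe`, and the `local[*]` master are assumptions made for a local test run, not part of the original tutorial.

```scala
import org.apache.spark.sql.SparkSession

object SparkVersionCheck {
  // Hypothetical helper: report the Spark and Scala versions seen by this JVM.
  def describe(spark: SparkSession): String =
    s"Spark ${spark.version}, Scala ${scala.util.Properties.versionNumberString}"

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("SparkVersionCheck")
      .master("local[*]") // assumption: local run; drop this when submitting to a cluster
      .getOrCreate()

    println(describe(spark))
    spark.stop()
  }
}
```

In `spark-shell` the session already exists, so typing `spark.version` is enough; `spark-submit --version` prints the same information from the command line.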
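2. Official example
Once you are inside the documentation for your version, scroll down to find Spark's main components (the original post showed a screenshot here).
Spark Streaming is explicitly labeled as the old API; the new API is Structured Streaming (circled in red in that screenshot), so the new API is what I use here.
HelloWorld code
A simple word count example from the official documentation: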
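import org.apache.spark.sql.SparkSession

// In spark-shell `spark` already exists; in a standalone application create it first
val spark = SparkSession.builder.appName("StructuredNetworkWordCount").getOrCreate()
import spark.implicits._ // provides the encoder required by .as[String]

// Create DataFrame representing the stream of input lines from connection to localhost:9999
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

// Split the lines into words
val words = lines.as[String].flatMap(_.split(" "))

// Generate running word count
val wordCounts = words.groupBy("value").count()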
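The snippet above only defines the query; nothing is read until a sink is attached and `start()` is called. The following is a minimal sketch of a complete program, under some assumptions: the object name `StructuredWordCount` and the helper `countWords` are made up for this illustration, the master is `local[*]`, and a text server is listening on port 9999 (for example, one started with `nc -lk 9999`).

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object StructuredWordCount {
  // Hypothetical helper: split each line on spaces and count occurrences per word.
  // It works on both streaming and batch Datasets.
  def countWords(lines: Dataset[String]): DataFrame = {
    import lines.sparkSession.implicits._
    lines.flatMap(_.split(" ")).groupBy("value").count()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("StructuredWordCount")
      .master("local[*]") // assumption: local test run
      .getOrCreate()
    import spark.implicits._

    // The socket source yields a single string column named "value"
    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()
      .as[String]

    // "complete" mode prints the full table of running counts after every trigger
    val query = countWords(lines).writeStream
      .outputMode("complete")
      .format("console")
      .start()

    // Block until the query is stopped or fails
    query.awaitTermination()
  }
}
```

Typing words into the `nc` session should make the console sink print an updated count table for each micro-batch.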
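Batch-processing example (foreachBatch)
This is another official example; here is my understanding of it. streamingDF is the streaming DataFrame, not a single batch. foreachBatch calls the given function once for every micro-batch, and the data of that micro-batch is in batchDF. If you print batchId, you will see that it is an increasing number.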
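import org.apache.spark.sql.DataFrame

streamingDF.writeStream.foreachBatch { (batchDF: DataFrame, batchId: Long) =>
  // Cache the batch so the upstream transformations are not recomputed for each write below
  batchDF.persist()
  // Operate on the data of this batch; the exact code depends on the operation
  batchDF.write.format(...).save(...) // location 1
  batchDF.write.format(...).save(...) // location 2
  // Release the cache once done
  batchDF.unpersist()
}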
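The official snippet leaves the formats and paths elided and never starts the query. As a rough sketch of how the pieces could fit together, the program below fills those gaps with assumptions of my own: the built-in `rate` source stands in for real input, the two `/tmp/...` paths and the Parquet/JSON formats are hypothetical, and the object and value names are invented for this illustration. The batch function is declared with an explicit Scala function type, which avoids the overload ambiguity that a bare lambda passed to `foreachBatch` can hit on Scala 2.12.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ForeachBatchSketch {
  // Hypothetical output locations, chosen only for this illustration
  val location1 = "/tmp/foreach-batch-demo/location1"
  val location2 = "/tmp/foreach-batch-demo/location2"

  // Called once per micro-batch with that batch's data and its id
  val writeBatch: (DataFrame, Long) => Unit = (batchDF, batchId) => {
    // Cache so the two writes below do not recompute the batch
    batchDF.persist()
    try {
      println(s"batchId=$batchId rows=${batchDF.count()}")
      batchDF.write.mode("append").parquet(location1)
      batchDF.write.mode("append").json(location2)
    } finally {
      // Release the cache even if one of the writes fails
      batchDF.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .appName("ForeachBatchSketch")
      .master("local[*]") // assumption: local test run
      .getOrCreate()

    // The "rate" source emits (timestamp, value) rows at a fixed pace
    val streamingDF = spark.readStream
      .format("rate")
      .option("rowsPerSecond", 5L)
      .load()

    val query = streamingDF.writeStream
      .foreachBatch(writeBatch)
      .start()

    query.awaitTermination()
  }
}
```

Running it should print one line per micro-batch, with batchId starting at 0 and increasing by one each time.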