Spark Structured Streaming HelloWorld
2022-07-26 04:29:00 【Actually, I have a true disposition】
Preface
This is the entry point to a Scala tutorial series on Spark Structured Streaming + Kafka + HBase.
Text
1. Spark version selection
Choose the Spark version that matches your server. The documentation is at:
https://spark.apache.org/docs/
The documentation at this address is organized by version number; open the one that matches the Spark version in your own environment.
Here I use 2.4.5; the latest documented version is 3.3.3.
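If the project is built with sbt (an assumption; the post does not name a build tool), pinning the dependency to the server's Spark version might look like the sketch below. The Scala version shown is also an assumption: it has to match the Scala version your Spark build reports, which `spark-submit --version` prints alongside the Spark version.

```scala
// build.sbt (sketch)

// Assumption: replace with the Scala version printed by `spark-submit --version`.
scalaVersion := "2.11.12"

// Spark SQL contains Structured Streaming. "provided" means the cluster
// supplies the Spark jars at runtime; remove it to run from sbt or an IDE.
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.5" % "provided"
```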
2. The official example
After opening the documentation for your version, you will find the main Spark components listed (the original post showed a screenshot here).
Spark Streaming is clearly marked there as the old API. The new API is Structured Streaming (circled in red in the screenshot), so that is what I use.
Structured Streaming HelloWorld Code
The official example is a simple word count:
// Assumes an existing SparkSession named `spark`
import spark.implicits._ // encoders required by .as[String] and flatMap

// Create DataFrame representing the stream of input lines from connection to localhost:9999
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()

// Split the lines into words
val words = lines.as[String].flatMap(_.split(" "))

// Generate running word count
val wordCounts = words.groupBy("value").count()
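The snippet above only defines the query; nothing runs until a sink is attached and `start()` is called. The following is a minimal, self-contained sketch of the whole program. It makes several assumptions: local mode with `local[2]`, the console sink in complete output mode, and a text server on port 9999 (for example one started with `nc -lk 9999`). The helper `countWords` is a name introduced here for illustration only.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object StructuredNetworkWordCount {

  // Same transformation as the official snippet. It works on batch and
  // streaming Datasets alike, so it can be checked without a socket.
  // (Hypothetical helper, not part of the official example.)
  def countWords(lines: Dataset[String]): DataFrame = {
    import lines.sparkSession.implicits._ // encoder needed by flatMap
    lines.flatMap(_.split(" ")).groupBy("value").count()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode with two threads; drop .master(...) when
    // submitting to a cluster with spark-submit.
    val spark = SparkSession.builder
      .appName("StructuredNetworkWordCount")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    val lines = spark.readStream
      .format("socket")
      .option("host", "localhost")
      .option("port", 9999)
      .load()

    // Complete mode re-prints the full running count table on every trigger.
    val query = countWords(lines.as[String]).writeStream
      .outputMode("complete")
      .format("console")
      .start()

    // Block until the query is stopped or fails.
    query.awaitTermination()
  }
}
```

With the program running, each line typed into the `nc` session should appear as updated counts in the console output.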
Batch code example
This is the official example; here is my understanding of it. streamingDF is the streaming DataFrame, which is processed as a series of micro-batches. foreachBatch calls the given function once for each batch, and batchDF holds the data of that batch. If you print batchId, you can see that the batch number is an auto-incrementing value.
streamingDF.writeStream.foreachBatch {
  (batchDF: DataFrame, batchId: Long) =>
    // Cache the batch so the writes below do not recompute the upstream transformations
    batchDF.persist()
    // Process the batch data; what goes here depends on your use case
    batchDF.write.format(...).save(...) // location 1
    batchDF.write.format(...).save(...) // location 2
    // Release the cache when finished
    batchDF.unpersist()
}
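The official snippet leaves the formats and paths elided. To see the per-batch behavior end to end, a runnable sketch could look like the following. Everything specific in it is an assumption made for illustration: the built-in `rate` source as input, the Parquet and JSON output formats, the `/tmp/foreach-batch-demo` paths, and the helper name `writeBatch`. The batch handler is written as a method returning `Unit` and passed as a function value, which avoids the overload ambiguity that some Scala 2.12 builds report when an inline lambda is given to `foreachBatch`.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ForeachBatchSketch {

  // Hypothetical handler: writes one micro-batch to two locations.
  // The output paths are parameters so the function can be reused.
  def writeBatch(location1: String, location2: String)
                (batchDF: DataFrame, batchId: Long): Unit = {
    // Cache so the count and both writes reuse the same computed batch.
    batchDF.persist()
    println(s"batchId=$batchId rows=${batchDF.count()}")
    batchDF.write.mode("append").parquet(location1) // location 1
    batchDF.write.mode("append").json(location2)    // location 2
    // Release the cache when finished.
    batchDF.unpersist()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: local mode; drop .master(...) on a cluster.
    val spark = SparkSession.builder
      .appName("ForeachBatchSketch")
      .master("local[2]")
      .getOrCreate()

    // The built-in "rate" source generates (timestamp, value) rows.
    val streamingDF = spark.readStream
      .format("rate")
      .option("rowsPerSecond", 5)
      .load()

    // Illustrative paths only; adjust to your environment.
    val handler = writeBatch(
      "/tmp/foreach-batch-demo/location1",
      "/tmp/foreach-batch-demo/location2") _

    val query = streamingDF.writeStream
      .foreachBatch(handler)
      .start()

    query.awaitTermination()
  }
}
```

Running it prints one `batchId=...` line per micro-batch, which makes the auto-incrementing batch number visible.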