Adding a Column to a Spark DataFrame
2022-07-06 00:28:00 【The south wind knows what I mean】
Contents
- Method 1: use createDataFrame, adding the new column while building the RDD and the schema
- Method 2: use withColumn, with the new column's logic in a UDF
- Method 3: use SQL, writing the new column directly into the query
- Method 4: the three methods above add a computed flag column; to add a unique sequence number, use monotonically_increasing_id
Method 1: use createDataFrame, adding the new column while building the RDD and the schema
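import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StringType

// Build an RDD of (value, flag) rows: "F" if the value lies outside [critValueL, critValueR], otherwise "T".
// The rows hold Double values, so this assumes the target column is already DoubleType in the schema.
val trdd = input.select(targetColumns).rdd.map { x =>
  val v = x.get(0).toString.toDouble
  if (v > critValueR || v < critValueL) Row(v, "F") else Row(v, "T")
}
// Extend the selected column's schema with a nullable string column named "flag".
val schema = input.select(targetColumns).schema.add("flag", StringType, true)
val sample3 = ss.createDataFrame(trdd, schema).distinct().withColumnRenamed(targetColumns, "idx")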
Method 2: use withColumn, with the new column's logic in a UDF
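import org.apache.spark.sql.functions.udf

// Flag values outside [critValueL, critValueR] as "F"; assumes the target column has an integer type.
val code: Int => String = (arg: Int) =>
  if (arg > critValueR || arg < critValueL) "F" else "T"
val addCol = udf(code)
val sample3 = input.select(targetColumns)
  .withColumn("flag", addCol(input(targetColumns)))
  .withColumnRenamed(targetColumns, "idx")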
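For a simple threshold test like this one, Spark's built-in `when`/`otherwise` column functions can express the same flag without a UDF. The following is a minimal sketch of that alternative, not the method used above; the object name `FlagWithWhen`, the helper `addFlag`, the local session, the sample values, and the thresholds 10 and 50 are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object FlagWithWhen {
  // Hypothetical helper: adds "flag" = "F" when the value lies outside
  // [critValueL, critValueR], "T" otherwise, and renames the column to "idx".
  def addFlag(input: DataFrame, targetColumn: String, critValueL: Int, critValueR: Int): DataFrame =
    input
      .select(targetColumn)
      .withColumn(
        "flag",
        when(col(targetColumn) > critValueR || col(targetColumn) < critValueL, "F").otherwise("T"))
      .withColumnRenamed(targetColumn, "idx")

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val ss = SparkSession.builder().master("local[1]").appName("flag-with-when").getOrCreate()
    import ss.implicits._

    // Sample data (assumption): 5 and 60 fall outside [10, 50], 20 falls inside.
    val input = Seq(5, 20, 60).toDF("value")
    addFlag(input, "value", 10, 50).show()

    ss.stop()
  }
}
```

Built-in column expressions are visible to Spark's optimizer, whereas a UDF is opaque to it, so this form is usually worth trying first when the logic is a plain comparison.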
Method 3: use SQL, writing the new column directly into the query
// Register the selected column as a temporary view, then compute the flag in SQL.
input.select(targetColumns).createOrReplaceTempView("tmp")
val sample3 = ss.sql(
  s"select distinct $targetColumns as idx, " +
  s"case when $targetColumns > $critValueR then 'F' " +
  s"when $targetColumns < $critValueL then 'F' else 'T' end as flag from tmp")
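The same CASE expression can also be attached to a DataFrame without registering a temporary view, by passing it to `expr`. This is a sketch under stated assumptions rather than the approach shown above: the object name `FlagWithExpr`, the helper `addFlag`, the sample data, and the thresholds are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, expr}

object FlagWithExpr {
  // Hypothetical helper: builds the CASE WHEN text once and evaluates it with expr().
  def addFlag(input: DataFrame, targetColumn: String, critValueL: Int, critValueR: Int): DataFrame = {
    // Backticks quote the column name inside the SQL fragment.
    val caseExpr =
      s"CASE WHEN `$targetColumn` > $critValueR OR `$targetColumn` < $critValueL THEN 'F' ELSE 'T' END"
    input
      .select(col(targetColumn).as("idx"), expr(caseExpr).as("flag"))
      .distinct()
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val ss = SparkSession.builder().master("local[1]").appName("flag-with-expr").getOrCreate()
    import ss.implicits._

    // Sample data (assumption); the duplicate 20 is removed by distinct().
    val input = Seq(5, 20, 20, 60).toDF("value")
    addFlag(input, "value", 10, 50).show()

    ss.stop()
  }
}
```

Keeping the thresholds as typed parameters rather than concatenating arbitrary strings limits what can end up in the SQL text.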
Method 4: the three methods above add a computed flag column; to add a unique sequence number, use monotonically_increasing_id
// Method 4: add a sequence-number column.
import org.apache.spark.sql.functions.monotonically_increasing_id
// The generated IDs are unique and increasing, but not necessarily consecutive.
val inputnew = input.withColumn("idx", monotonically_increasing_id())
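When a gap-free numbering (1, 2, 3, ...) is required, one common option is to rank the rows with `row_number` over a window ordered by the generated ID. The sketch below is an illustration under assumptions, not part of the original method: the object name `ConsecutiveIndex`, the helper `addIndex`, and the temporary column name `_tmp_id` are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{monotonically_increasing_id, row_number}

object ConsecutiveIndex {
  // Hypothetical helper: adds "idx" with consecutive values starting at 1.
  def addIndex(input: DataFrame): DataFrame = {
    // A window with no partitioning moves all rows into a single partition,
    // so this is only reasonable for data that fits comfortably on one executor.
    val w = Window.orderBy("_tmp_id")
    input
      .withColumn("_tmp_id", monotonically_increasing_id())
      .withColumn("idx", row_number().over(w))
      .drop("_tmp_id")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumption).
    val ss = SparkSession.builder().master("local[2]").appName("consecutive-index").getOrCreate()
    import ss.implicits._

    val input = Seq("a", "b", "c").toDF("name")
    addIndex(input).show()

    ss.stop()
  }
}
```

For large datasets the single-partition window becomes a bottleneck, which is the trade-off for getting strictly consecutive numbers.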