Adding a Column to a Spark DataFrame
2022-07-06 00:23:00 【南风知我意丿】
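Method 1: Use createDataFrame. The new column is produced while building the RDD and the schema.
import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StringType
val selected = input.select(targetColumns)
// Build each row as (value, flag): "F" if the value lies outside [critValueL, critValueR], otherwise "T".
// Each Row stores a Double, so this assumes the selected column is DoubleType.
val trdd = selected.rdd.map(x => {
  val v = x.get(0).toString().toDouble
  if (v > critValueR || v < critValueL) Row(v, "F") else Row(v, "T")
})
// Append the new nullable string column to the schema of the selected column
val schema = selected.schema.add("flag", StringType, true)
val sample3 = ss.createDataFrame(trdd, schema).distinct().withColumnRenamed(targetColumns, "idx")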
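The snippet above leaves `input`, `targetColumns`, `critValueL`, `critValueR` and `ss` undefined. A minimal self-contained sketch of the same approach might look like the following; the sample data, the column name `value`, the threshold values and the helper `flagFor` are all assumptions made for illustration, not part of the original post.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.StringType

object CreateDataFrameFlag {
  // Hypothetical helper: "F" when v is outside [critValueL, critValueR], otherwise "T".
  def flagFor(v: Double, critValueL: Double, critValueR: Double): String =
    if (v > critValueR || v < critValueL) "F" else "T"

  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val ss = SparkSession.builder().master("local[*]").appName("add-column-rdd").getOrCreate()
    import ss.implicits._

    // Assumed sample data and thresholds.
    val input = Seq(1.0, 5.0, 5.0, 12.0).toDF("value")
    val targetColumns = "value"
    val critValueL = 2.0
    val critValueR = 10.0

    val selected = input.select(targetColumns)

    // Each output Row carries the original value plus the computed flag.
    val trdd = selected.rdd.map { x =>
      val v = x.getDouble(0)
      Row(v, flagFor(v, critValueL, critValueR))
    }

    // The schema must describe exactly the fields placed in each Row.
    val schema = selected.schema.add("flag", StringType, true)

    val sample3 = ss.createDataFrame(trdd, schema)
      .distinct()
      .withColumnRenamed(targetColumns, "idx")

    sample3.show()
    ss.stop()
  }
}
```

Because the rows and the schema are built separately, they have to be kept in sync by hand: a mismatch between a Row's field types and the schema only surfaces at runtime.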
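Method 2: Use withColumn. The new column is computed inside a UDF.
import org.apache.spark.sql.functions.udf
// UDF returning "F" when the value lies outside [critValueL, critValueR], otherwise "T".
// The argument type (Int here) must match the type of the target column.
val code: (Int => String) = (arg: Int) => {
  if (arg > critValueR || arg < critValueL) "F" else "T"
}
val addCol = udf(code)
// Note: unlike methods 1 and 3, this version does not call distinct().
val sample3 = input.select(targetColumns).withColumn("flag", addCol(input(targetColumns)))
  .withColumnRenamed(targetColumns, "idx")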
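For a simple threshold check like this one, the same column can probably be produced without a UDF by using Spark's built-in `when` / `otherwise` column functions. This is a swapped-in alternative rather than the method described above; the wrapper `withFlag`, the sample data and the thresholds are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

object WhenOtherwiseFlag {
  // Hypothetical helper: adds a "flag" column to the selected column and renames it to "idx".
  def withFlag(df: DataFrame, column: String, critValueL: Double, critValueR: Double): DataFrame =
    df.select(column)
      .withColumn(
        "flag",
        // Built-in column expression instead of a UDF.
        when(col(column) > critValueR || col(column) < critValueL, "F").otherwise("T")
      )
      .withColumnRenamed(column, "idx")

  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val ss = SparkSession.builder().master("local[*]").appName("add-column-when").getOrCreate()
    import ss.implicits._

    // Assumed sample data and thresholds.
    val input = Seq(1, 5, 12).toDF("value")
    withFlag(input, "value", 2.0, 10.0).show()

    ss.stop()
  }
}
```

Built-in expressions stay visible to Spark's optimizer, whereas a UDF is treated as a black box, so this form is usually worth trying first when the logic fits.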
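Method 3: Use SQL. The new column is written directly into the SQL statement.
input.select(targetColumns).createOrReplaceTempView("tmp")
// The column name is targetColumns, the same variable used in methods 1 and 2
val sample3 = ss.sqlContext.sql("select distinct " + targetColumns +
  " as idx, case when " + targetColumns + " > " + critValueR + " then 'F'" +
  " when " + targetColumns + " < " + critValueL + " then 'F' else 'T' end as flag from tmp")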
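To make the concatenation easier to read, the query text can be built separately and inspected before it is passed to `sql`. The sketch below uses Scala string interpolation; the function name `buildFlagSql` and the example arguments are hypothetical.

```scala
object FlagSql {
  // Hypothetical helper that assembles the same CASE WHEN query as a plain string.
  def buildFlagSql(column: String, critValueL: Double, critValueR: Double, view: String = "tmp"): String =
    s"select distinct $column as idx, " +
      s"case when $column > $critValueR then 'F' " +
      s"when $column < $critValueL then 'F' else 'T' end as flag " +
      s"from $view"

  def main(args: Array[String]): Unit = {
    // Prints:
    // select distinct value as idx, case when value > 10.0 then 'F' when value < 2.0 then 'F' else 'T' end as flag from tmp
    println(buildFlagSql("value", 2.0, 10.0))
  }
}
```

The resulting string would then be passed as `ss.sql(FlagSql.buildFlagSql(targetColumns, critValueL, critValueR))`. Since the column and view names are spliced into the query verbatim, they should come from trusted code rather than user input.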
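Method 4: The three methods above add a column derived from a condition. To add a column of unique IDs instead, use monotonically_increasing_id.
// Method 4: add an ID column.
// The generated IDs are unique and increasing, but not guaranteed to be consecutive.
import org.apache.spark.sql.functions.monotonically_increasing_id
val inputnew = input.withColumn("idx", monotonically_increasing_id())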
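If consecutive numbers (1, 2, 3, ...) are needed rather than merely unique ones, one possible approach is to apply `row_number` over a window ordered by the generated ID. This is an alternative technique, not the method above; the helper `withRowNumber` and the sample data are assumptions for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{monotonically_increasing_id, row_number}

object ConsecutiveIndex {
  // Hypothetical helper: adds "idx" (unique, with gaps) and "row_num" (consecutive, starting at 1).
  def withRowNumber(df: DataFrame): DataFrame = {
    // Unique and increasing, but values can jump between partitions.
    val withId = df.withColumn("idx", monotonically_increasing_id())
    // A window without partitionBy pulls all rows into a single partition,
    // so this is only suitable for data that fits comfortably on one executor.
    withId.withColumn("row_num", row_number().over(Window.orderBy("idx")))
  }

  def main(args: Array[String]): Unit = {
    // Local session, for illustration only.
    val ss = SparkSession.builder().master("local[*]").appName("add-column-index").getOrCreate()
    import ss.implicits._

    // Assumed sample data, spread over two partitions to make the gaps in "idx" visible.
    val input = Seq("a", "b", "c", "d").toDF("name").repartition(2)
    withRowNumber(input).show()

    ss.stop()
  }
}
```

The trade-off is that `monotonically_increasing_id` needs no shuffle, while the window version gives gap-free numbering at the cost of moving every row to one partition.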