Adding a Column to a Spark DataFrame
2022-07-06 00:23:00 【南风知我意丿】
Method 1: Use createDataFrame. The new column is produced while building the RDD and the schema.
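import org.apache.spark.sql.Row
import org.apache.spark.sql.types.StringType

// Build an RDD[Row] holding the value and its flag: "F" outside [critValueL, critValueR], otherwise "T"
val trdd = input.select(targetColumns).rdd.map { x =>
  val v = x.get(0).toString.toDouble
  Row(v, if (v > critValueR || v < critValueL) "F" else "T")
}
// Reuse the selected column's schema and append the new string column.
// The rows carry Double values, so the source column is expected to be DoubleType.
val schema = input.select(targetColumns).schema.add("flag", StringType, true)
val sample3 = ss.createDataFrame(trdd, schema).distinct().withColumnRenamed(targetColumns, "idx")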
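The snippet above relies on `input`, `targetColumns`, `critValueL`, `critValueR` and `ss` being defined elsewhere. As a minimal, self-contained sketch of the same idea, the version below wraps it in a hypothetical helper `addFlag` (the object name, helper name and parameter names are assumptions made for illustration). It also casts the column to Double and declares the schema explicitly, so the row values and the schema cannot disagree; it assumes the column contains no nulls.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

object FlagWithCreateDataFrame {
  // Hypothetical helper: rebuilds the DataFrame from an RDD[Row] plus an explicit schema.
  // Assumes `column` is numeric (or castable to Double) and has no null values.
  def addFlag(ss: SparkSession, df: DataFrame, column: String,
              critValueL: Double, critValueR: Double): DataFrame = {
    val rows = df.select(col(column).cast(DoubleType)).rdd.map { r =>
      val v = r.getDouble(0)
      // "F" when the value falls outside [critValueL, critValueR], otherwise "T"
      Row(v, if (v > critValueR || v < critValueL) "F" else "T")
    }
    // Explicit schema: the value column is named "idx" directly, so no rename is needed
    val schema = StructType(Seq(
      StructField("idx", DoubleType, nullable = true),
      StructField("flag", StringType, nullable = true)))
    ss.createDataFrame(rows, schema).distinct()
  }
}
```

With a local session, `FlagWithCreateDataFrame.addFlag(ss, input, "value", 2.0, 8.0)` would return one row per distinct value of the hypothetical `value` column together with its flag.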
Method 2: Use withColumn. The new column is computed by a UDF.
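import org.apache.spark.sql.functions.udf

// UDF returning "F" when the value lies outside [critValueL, critValueR], otherwise "T".
// It takes an Int, so the source column is expected to be IntegerType.
val code: Int => String = (arg: Int) =>
  if (arg > critValueR || arg < critValueL) "F" else "T"
val addCol = udf(code)
val sample3 = input.select(targetColumns)
  .withColumn("flag", addCol(input(targetColumns)))
  .withColumnRenamed(targetColumns, "idx")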
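As a variation on the UDF approach (not the method used in the original snippet), the same flag can be expressed with Spark's built-in `when` / `otherwise` column functions, which avoids defining a UDF. The sketch below is a minimal illustration; the object `FlagWithWhen`, the helper `addFlag` and its parameter names are hypothetical.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, when}

object FlagWithWhen {
  // Hypothetical helper: flags values outside [critValueL, critValueR] as "F", all others as "T".
  def addFlag(df: DataFrame, column: String,
              critValueL: Double, critValueR: Double): DataFrame =
    df.select(column)
      .withColumn("flag",
        when(col(column) > critValueR || col(column) < critValueL, "F").otherwise("T"))
      .withColumnRenamed(column, "idx")
}
```

Because the condition is a plain column expression, it works for any numeric column type, whereas the UDF above is tied to Int input.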
Method 3: Use SQL. The new column is written directly in the SQL statement.
input.select(targetColumns).createOrReplaceTempView("tmp")
// CASE WHEN yields 'F' outside [critValueL, critValueR], otherwise 'T'
val sample3 = ss.sql(
  s"""select distinct $targetColumns as idx,
     |  case when $targetColumns > $critValueR then 'F'
     |       when $targetColumns < $critValueL then 'F' else 'T' end as flag
     |from tmp""".stripMargin)
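Method 4: The three methods above add a column derived from a condition. To add a column of unique IDs instead, use monotonically_increasing_id. The generated IDs are unique and increasing, but they are not guaranteed to be consecutive.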
// Method 4: add a unique ID column
import org.apache.spark.sql.functions.monotonically_increasing_id
val inputnew = input.withColumn("idx", monotonically_increasing_id())
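Where a gap-free numbering (1, 2, 3, ...) is needed, one possible approach is to rank the rows by the generated ID with `row_number` over a window, since `monotonically_increasing_id` can leave gaps between partitions. The sketch below is an illustration under assumptions: the object `ConsecutiveIndex`, the helper `addConsecutiveIdx` and the temporary column name `_mid` are hypothetical, and a window without partitioning moves all rows into a single partition, so this only suits data that fits there.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{monotonically_increasing_id, row_number}

object ConsecutiveIndex {
  // Hypothetical helper: adds a consecutive, 1-based "idx" column.
  // Assumes the DataFrame has no existing column named "_mid".
  def addConsecutiveIdx(df: DataFrame): DataFrame = {
    // Unpartitioned window ordered by the temporary ID column
    val byGeneratedId = Window.orderBy("_mid")
    df.withColumn("_mid", monotonically_increasing_id())
      .withColumn("idx", row_number().over(byGeneratedId))
      .drop("_mid")
  }
}
```

Calling `ConsecutiveIndex.addConsecutiveIdx(input)` would then give each row an `idx` from 1 to the row count.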