Spark SQL: checking for and handling null and NaN values
2022-07-06 00:28:00 【The south wind knows what I mean】
Null and NaN
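null means "nothing": a nonexistent or invalid object or address reference. In JavaScript, where this description comes from, null converts to 0 in numeric contexts, yet null == false evaluates to false. There it is a primitive value, not a global object.
undefined is a global property in JavaScript whose value is the primitive undefined. It signals that something has not been assigned or defined. undefined cannot be converted to a number, so using it in a mathematical calculation returns NaN. In Scala, an invalid floating-point operation such as the square root of a negative number produces NaN in the same way: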
val d: Double = math.sqrt(-1.0) // square root of a negative number is NaN
println(d) // prints NaN
val n: Boolean = math.sqrt(-1.0).isNaN // true
println(n)
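The two checks above rely on standard floating-point behaviour. As a minimal sketch in plain Scala (the object name `NaNBasics` and its field names are illustrative, not from the original post), the following shows why `isNaN` is the test to use instead of `==`, and how a null reference differs from NaN:

```scala
object NaNBasics {
  // An invalid floating-point operation yields NaN rather than throwing
  val nan: Double = math.sqrt(-1.0)

  // NaN compares unequal to every value, including itself
  val equalsItself: Boolean = nan == nan

  // isNaN is the reliable test
  val detected: Boolean = nan.isNaN

  // NaN propagates through arithmetic
  val propagated: Boolean = (nan + 1.0).isNaN

  // null is an absent reference, not a number; Option(null) is None
  val missing: java.lang.Double = null
  val asOption: Option[java.lang.Double] = Option(missing)

  def main(args: Array[String]): Unit =
    println(s"$equalsItself $detected $propagated $asOption") // false true true None
}
```

This distinction matters for the Spark functions below: null marks a missing value in a column of any type, while NaN can only occur in Float and Double columns.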

Checking for and handling null and NaN in Spark SQL
val df: DataFrame = session.sql(
  """
    |select * from sparktuning.course_pay1
    |""".stripMargin)
// Drop rows that contain null or NaN in any column
val resNull = data1.na.drop()
resNull.limit(10).show()
+-------+------+---+------------+--------+-------------+---------+----------+------+
|affairs|gender|age|yearsmarried|children|religiousness|education|occupation|rating|
+-------+------+---+------------+--------+-------------+---------+----------+------+
|      0|  male| 37|          10|      no|            3|       18|         7|     4|
|      0|  male| 57|          15|     yes|            2|       14|         4|     4|
|      0|female| 32|          15|     yes|            4|       16|         1|     2|
|      0|  male| 22|         1.5|      no|            4|       14|         4|     5|
|      0|  male| 37|          15|     yes|            2|       20|         7|     2|
|      0|  male| 27|           4|     yes|            4|       18|         6|     4|
|      0|  male| 47|          15|     yes|            5|       17|         6|     4|
|      0|female| 22|         1.5|      no|            2|       17|         5|     4|
|      0|female| 27|           4|      no|            4|       14|         5|     4|
|      0|female| 37|          15|     yes|            1|       17|         5|     5|
+-------+------+---+------------+--------+-------------+---------+----------+------+
// Drop rows that have null or NaN in any of the specified columns
val res=data1.na.drop(Array("gender","yearsmarried"))
// Drop rows that have fewer than 10 non-null, non-NaN values among the specified columns (minNonNulls = 10) -- note the field types
data1.na.drop(10,Array("gender","yearsmarried"))
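The three `na.drop` variants above differ only in how many missing values a row may have before it is removed. The following is a minimal sketch on a small made-up DataFrame; the object name `NaDropSketch`, the sample rows, and a local `SparkSession` are assumptions for illustration and are not the `data1` table from the post:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NaDropSketch {
  // Hypothetical sample: None becomes null, Double.NaN stays NaN
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Option[String], Option[Double])](
      (Some("male"), Some(10.0)),
      (None, Some(1.5)),
      (Some("female"), Some(Double.NaN)),
      (None, None)
    ).toDF("gender", "yearsmarried")
  }

  private val cols = Seq("gender", "yearsmarried")

  // "any" (the default): drop a row if any listed column is null or NaN
  def dropAny(df: DataFrame): DataFrame = df.na.drop("any", cols)

  // "all": drop a row only if every listed column is null or NaN
  def dropAll(df: DataFrame): DataFrame = df.na.drop("all", cols)

  // minNonNulls: keep rows with at least n non-null, non-NaN values in the listed columns
  def dropMin(df: DataFrame, n: Int): DataFrame = df.na.drop(n, cols)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("na-drop-sketch").getOrCreate()
    val df = sample(spark)
    println(dropAny(df).count())     // 1
    println(dropAll(df).count())     // 3
    println(dropMin(df, 1).count())  // 3
    println(dropMin(df, 10).count()) // 0: two columns can never supply 10 values
    spark.stop()
  }
}
```

Note the last case: when the threshold is larger than the number of listed columns, as in `drop(10, Array("gender","yearsmarried"))`, no row can satisfy it and the result is empty.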
// Fill null values with a Boolean -- only columns of Boolean type are affected, so note the field type
df.na.fill(false, Array("courseid"))
// Fill null values in all string columns with the given string
val res123 = data1.na.fill("wangxiao123")
res123.limit(10).show()
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
|affairs|     gender|age|yearsmarried|children|religiousness|education|occupation|     rating|
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
|      0|       male| 37|          10|      no|            3|       18|         7|          4|
|      0|wangxiao123| 27| wangxiao123|      no|            4|       14|         6|wangxiao123|
|      0|wangxiao123| 32| wangxiao123|     yes|            1|       12|         1|wangxiao123|
|      0|wangxiao123| 57| wangxiao123|     yes|            5|       18|         6|wangxiao123|
|      0|wangxiao123| 22| wangxiao123|      no|            2|       17|         6|wangxiao123|
|      0|wangxiao123| 32| wangxiao123|      no|            2|       17|         5|wangxiao123|
|      0|     female| 22| wangxiao123|      no|            2|       12|         1|wangxiao123|
|      0|       male| 57|          15|     yes|            2|       14|         4|          4|
|      0|     female| 32|          15|     yes|            4|       16|         1|          2|
|      0|       male| 22|         1.5|      no|            4|       14|         4|          5|
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
// Fill null values in the specified columns -- the same value for multiple columns
df1.na.fill(123456,cols = Array("courseid","pointlistid")).show(false)
+---------+--------+-------+-----+-----------+--------+----+
|chapterid|courseid|majorid|money|pointlistid|dt      |dn  |
+---------+--------+-------+-----+-----------+--------+----+
|4        |123456  |5      |100  |3          |20190722|webA|
|7        |123456  |7      |100  |1          |20190722|webA|
|8        |123456  |3      |     |8          |20190722|webA|
|5        |14      |3      |100  |123456     |20190722|webA|
|4        |15      |2      |100  |3          |20190722|webA|
|9        |123456  |8      |100  |7          |20190722|webA|
|7        |17      |7      |100  |123456     |20190722|webA|
|0        |18      |9      |     |7          |20190722|webA|
|5        |123456  |8      |100  |4          |20190722|webA|
|4        |20      |1      |100  |123456     |20190722|webA|
|4        |123456  |5      |100  |1          |20190722|webA|
|0        |22      |3      |100  |9          |20190722|webA|
|1        |123456  |8      |100  |0          |20190722|webA|
|4        |24      |0      |100  |5          |20190722|webA|
|9        |123456  |9      |100  |0          |20190722|webA|
+---------+--------+-------+-----+-----------+--------+----+
// Fill null values in the specified columns -- a different value for each column
df1.na.fill(Map("courseid"->123456,"pointlistid"->654321)).show(false)
+---------+--------+-------+-----+-----------+--------+----+
|chapterid|courseid|majorid|money|pointlistid|dt      |dn  |
+---------+--------+-------+-----+-----------+--------+----+
|4        |123456  |5      |100  |3          |20190722|webA|
|7        |123456  |7      |100  |1          |20190722|webA|
|8        |123456  |3      |     |8          |20190722|webA|
|5        |14      |3      |100  |654321     |20190722|webA|
|4        |15      |2      |100  |3          |20190722|webA|
|9        |123456  |8      |100  |7          |20190722|webA|
|7        |17      |7      |100  |654321     |20190722|webA|
|0        |18      |9      |     |7          |20190722|webA|
|5        |123456  |8      |100  |4          |20190722|webA|
|4        |20      |1      |100  |654321     |20190722|webA|
|4        |123456  |5      |100  |1          |20190722|webA|
|0        |22      |3      |100  |9          |20190722|webA|
|1        |123456  |8      |100  |0          |20190722|webA|
|4        |24      |0      |100  |5          |20190722|webA|
|9        |123456  |9      |100  |0          |20190722|webA|
+---------+--------+-------+-----+-----------+--------+----+
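The "note the field type" remarks above come down to one rule: `na.fill` only changes columns whose type is compatible with the fill value. A minimal sketch under assumed sample data follows; the object name `NaFillSketch`, the rows, and the fill values are made up for illustration:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object NaFillSketch {
  // Hypothetical sample with an Int, a String and a Double column
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Option[Int], Option[String], Option[Double])](
      (Some(14), Some("webA"), Some(100.0)),
      (None, None, Some(Double.NaN)),
      (Some(17), None, None)
    ).toDF("courseid", "dn", "money")
  }

  // A String fill value only touches string columns; courseid and money keep their nulls
  def fillStrings(df: DataFrame): DataFrame = df.na.fill("unknown")

  // One value per column, each matching its column's type.
  // For Double columns the fill replaces NaN as well as null.
  def fillPerColumn(df: DataFrame): DataFrame =
    df.na.fill(Map[String, Any]("courseid" -> 123456, "dn" -> "unknown", "money" -> 0.0))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("na-fill-sketch").getOrCreate()
    val df = sample(spark)
    fillStrings(df).show(false)
    fillPerColumn(df).show(false)
    spark.stop()
  }
}
```

The map form is usually the safer choice when columns have mixed types, since a single untyped fill value silently skips every column it does not match.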
// Select rows where a column is null
data1.filter("gender is null").select("gender").limit(10).show
+------+
|gender|
+------+
|  null|
|  null|
|  null|
|  null|
|  null|
+------+
// Select rows where the column is not null
data1.filter("gender is not null").select("gender").limit(10).show
+------+
|gender|
+------+
|  male|
|female|
|  male|
|female|
|  male|
|  male|
|  male|
|  male|
|female|
|female|
+------+
// The same null check written with the Column API
data1.filter(data1("gender").isNull).select("gender").limit(10).show
+------+
|gender|
+------+
|  null|
|  null|
|  null|
|  null|
|  null|
+------+
// Keep rows where gender is a non-empty string; rows with a null gender are filtered out as well
data1.filter("gender<>''").select("gender").limit(10).show
+------+
|gender|
+------+
|  male|
|female|
|  male|
|female|
|  male|
|  male|
|  male|
|  male|
|female|
|female|
+------+
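The filters above cover null and empty strings but not NaN, which needs its own check. As a minimal sketch (the object name `NullNaNFilterSketch` and the sample rows are assumptions for illustration), the following combines `isNull` with the `isnan` function from `org.apache.spark.sql.functions`:

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, isnan}

object NullNaNFilterSketch {
  // Hypothetical sample: one valid row, one empty string with NaN, one all-null row
  def sample(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq[(Option[String], Option[Double])](
      (Some("male"), Some(4.0)),
      (Some(""), Some(Double.NaN)),
      (None, None)
    ).toDF("gender", "rating")
  }

  // null <> '' evaluates to null, which filter treats as false,
  // so this removes both empty strings and nulls
  def nonEmptyGender(df: DataFrame): DataFrame = df.filter("gender <> ''")

  // isnan is true only for NaN; it returns false for null
  def nanRating(df: DataFrame): DataFrame = df.filter(isnan(col("rating")))

  // Combine both checks to catch every missing rating
  def missingRating(df: DataFrame): DataFrame =
    df.filter(col("rating").isNull || isnan(col("rating")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").appName("null-nan-filter-sketch").getOrCreate()
    val df = sample(spark)
    println(nonEmptyGender(df).count()) // 1
    println(nanRating(df).count())      // 1
    println(missingRating(df).count())  // 2
    spark.stop()
  }
}
```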