Spark SQL: detecting and handling null and NaN values
2022-07-06 00:28:00 【The south wind knows what I mean】
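null and NaN
null denotes an absent, nonexistent, or invalid object or address reference. In JavaScript, null is a primitive value (not a global object) that converts to 0 in numeric contexts, and null == false evaluates to false. In Spark SQL, null marks a missing value in a column.
undefined is a JavaScript global property whose value is the primitive undefined. It indicates that something has been declared but not assigned. undefined cannot be converted to a number, so using it in arithmetic yields NaN. Scala has no undefined; there, NaN (Not a Number) is a floating-point value produced by invalid operations such as taking the square root of a negative number: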
val d: Double = math.sqrt(-1.0)
println(d) // prints NaN
val n: Boolean = math.sqrt(-1.0).isNaN
println(n) // prints true
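The distinction between NaN and null can be sketched in plain Scala, without Spark. The value names below (`nan`, `missing`, and so on) are illustrative only and do not come from the original post.

```scala
// NaN is an ordinary Double value, not a null reference.
val nan: Double = math.sqrt(-1.0)

// NaN compares unequal to everything, including itself,
// so == cannot be used to detect it; use isNaN instead.
val equalsItself: Boolean = nan == nan
val detected: Boolean = nan.isNaN

// Arithmetic that involves NaN produces NaN again.
val propagated: Double = nan + 1.0

// A boxed java.lang.Double can hold null; a primitive Double cannot.
val missing: java.lang.Double = null
val isMissing: Boolean = missing == null
```

This is why Spark offers separate checks for the two cases: a null test for missing values and a NaN test for invalid floating-point results.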

Detecting and handling null and NaN in Spark SQL
val df: DataFrame = session.sql(
  s"""
     |select * from sparktuning.course_pay1
     |""".stripMargin)
// Drop rows that contain null or NaN in any column
val resNull = data1.na.drop()
resNull.limit(10).show()
+-------+------+---+------------+--------+-------------+---------+----------+------+
|affairs|gender|age|yearsmarried|children|religiousness|education|occupation|rating|
+-------+------+---+------------+--------+-------------+---------+----------+------+
|      0|  male| 37|          10|      no|            3|       18|         7|     4|
|      0|  male| 57|          15|     yes|            2|       14|         4|     4|
|      0|female| 32|          15|     yes|            4|       16|         1|     2|
|      0|  male| 22|         1.5|      no|            4|       14|         4|     5|
|      0|  male| 37|          15|     yes|            2|       20|         7|     2|
|      0|  male| 27|           4|     yes|            4|       18|         6|     4|
|      0|  male| 47|          15|     yes|            5|       17|         6|     4|
|      0|female| 22|         1.5|      no|            2|       17|         5|     4|
|      0|female| 27|           4|      no|            4|       14|         5|     4|
|      0|female| 37|          15|     yes|            1|       17|         5|     5|
+-------+------+---+------------+--------+-------------+---------+----------+------+
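A minimal sketch of what `na.drop()` removes, assuming a local SparkSession and a small hypothetical DataFrame (`people`, with made-up rows) rather than the table queried in the post. It is meant for spark-shell or a Scala script with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only; a real job would reuse its own session.
val spark = SparkSession.builder().master("local[*]").appName("na-drop-sketch").getOrCreate()
import spark.implicits._

// Hypothetical sample data: Option[String] models a nullable column.
val people = Seq[(Option[String], Double)](
  (Some("male"), 37.0),
  (None, 27.0),                 // null gender
  (Some("female"), Double.NaN)  // NaN age
).toDF("gender", "age")

// na.drop() with no arguments removes every row that contains a null or a NaN.
val cleaned = people.na.drop()
val remaining: Long = cleaned.count()
```

Only the first row survives here, because the second row has a null and the third has a NaN.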
// Drop rows that contain null or NaN in any of the specified columns
val res = data1.na.drop(Array("gender", "yearsmarried"))
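// Drop rows that have fewer than 10 non-null, non-NaN values among the specified columns.
// With only two columns listed, a threshold of 10 can never be met, so every row is dropped.
data1.na.drop(10, Array("gender", "yearsmarried"))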
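The column-restricted variants of `na.drop` can be compared side by side. The sketch below assumes a local SparkSession and a hypothetical three-row DataFrame (`sample`); the `"any"` and `"all"` modes and the `minNonNulls` threshold are standard `DataFrameNaFunctions` overloads.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("na-drop-cols-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: the rows have 2, 1 and 0 non-null values in the checked columns.
val sample = Seq[(Option[String], Option[String], Int)](
  (Some("male"), Some("10"), 1),
  (None, Some("4"), 2),
  (None, None, 3)
).toDF("gender", "yearsmarried", "id")

val checked = Array("gender", "yearsmarried")

// "any": drop a row if at least one checked column is null.
val keptAny: Long = sample.na.drop("any", checked).count()
// "all": drop a row only if every checked column is null.
val keptAll: Long = sample.na.drop("all", checked).count()
// minNonNulls = 1: keep rows with at least one non-null checked column.
val keptMin1: Long = sample.na.drop(1, checked).count()
// minNonNulls = 10 exceeds the number of checked columns, so nothing is kept.
val keptMin10: Long = sample.na.drop(10, checked).count()
```

The last call illustrates why the threshold should not exceed the number of columns being checked.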
// Fill nulls in the specified columns with a Boolean value; columns that are not of Boolean type are ignored
df.na.fill(false, Array("courseid"))
// Fill nulls in every string column with the given string (non-string columns are left unchanged)
val res123 = data1.na.fill("wangxiao123")
res123.limit(10).show()
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
|affairs|     gender|age|yearsmarried|children|religiousness|education|occupation|     rating|
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
|      0|       male| 37|          10|      no|            3|       18|         7|          4|
|      0|wangxiao123| 27| wangxiao123|      no|            4|       14|         6|wangxiao123|
|      0|wangxiao123| 32| wangxiao123|     yes|            1|       12|         1|wangxiao123|
|      0|wangxiao123| 57| wangxiao123|     yes|            5|       18|         6|wangxiao123|
|      0|wangxiao123| 22| wangxiao123|      no|            2|       17|         6|wangxiao123|
|      0|wangxiao123| 32| wangxiao123|      no|            2|       17|         5|wangxiao123|
|      0|     female| 22| wangxiao123|      no|            2|       12|         1|wangxiao123|
|      0|       male| 57|          15|     yes|            2|       14|         4|          4|
|      0|     female| 32|          15|     yes|            4|       16|         1|          2|
|      0|       male| 22|         1.5|      no|            4|       14|         4|          5|
+-------+-----------+---+------------+--------+-------------+---------+----------+-----------+
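The "note the field type" remarks above refer to the fact that `na.fill` only touches columns whose type is compatible with the fill value. A minimal sketch, assuming a local SparkSession and a hypothetical two-column DataFrame (`mixed`):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("na-fill-types-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: one nullable string column and one nullable integer column.
val mixed = Seq[(Option[String], Option[Int])](
  (Some("male"), Some(37)),
  (None, None)
).toDF("gender", "age")

// A string fill value only affects string columns; age stays null.
val filledStr = mixed.na.fill("unknown")
val ageStillNull: Long = filledStr.filter("age is null").count()
val genderFilled: Long = filledStr.filter("gender = 'unknown'").count()

// A numeric fill value only affects numeric columns; gender stays null.
val filledNum = mixed.na.fill(0L)
val genderStillNull: Long = filledNum.filter("gender is null").count()
val ageFilled: Long = filledNum.filter("age = 0").count()
```

Filling every column therefore takes one call per value type, or a single call with a per-column map.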
// Fill nulls in the specified columns, using the same value for every column
df1.na.fill(123456, cols = Array("courseid", "pointlistid")).show(false)
+---------+--------+-------+-----+-----------+--------+----+
|chapterid|courseid|majorid|money|pointlistid|dt      |dn  |
+---------+--------+-------+-----+-----------+--------+----+
|4        |123456  |5      |100  |3          |20190722|webA|
|7        |123456  |7      |100  |1          |20190722|webA|
|8        |123456  |3      |     |8          |20190722|webA|
|5        |14      |3      |100  |123456     |20190722|webA|
|4        |15      |2      |100  |3          |20190722|webA|
|9        |123456  |8      |100  |7          |20190722|webA|
|7        |17      |7      |100  |123456     |20190722|webA|
|0        |18      |9      |     |7          |20190722|webA|
|5        |123456  |8      |100  |4          |20190722|webA|
|4        |20      |1      |100  |123456     |20190722|webA|
|4        |123456  |5      |100  |1          |20190722|webA|
|0        |22      |3      |100  |9          |20190722|webA|
|1        |123456  |8      |100  |0          |20190722|webA|
|4        |24      |0      |100  |5          |20190722|webA|
|9        |123456  |9      |100  |0          |20190722|webA|
+---------+--------+-------+-----+-----------+--------+----+
// Fill nulls in the specified columns, using a different value for each column
df1.na.fill(Map("courseid" -> 123456, "pointlistid" -> 654321)).show(false)
+---------+--------+-------+-----+-----------+--------+----+
|chapterid|courseid|majorid|money|pointlistid|dt      |dn  |
+---------+--------+-------+-----+-----------+--------+----+
|4        |123456  |5      |100  |3          |20190722|webA|
|7        |123456  |7      |100  |1          |20190722|webA|
|8        |123456  |3      |     |8          |20190722|webA|
|5        |14      |3      |100  |654321     |20190722|webA|
|4        |15      |2      |100  |3          |20190722|webA|
|9        |123456  |8      |100  |7          |20190722|webA|
|7        |17      |7      |100  |654321     |20190722|webA|
|0        |18      |9      |     |7          |20190722|webA|
|5        |123456  |8      |100  |4          |20190722|webA|
|4        |20      |1      |100  |654321     |20190722|webA|
|4        |123456  |5      |100  |1          |20190722|webA|
|0        |22      |3      |100  |9          |20190722|webA|
|1        |123456  |8      |100  |0          |20190722|webA|
|4        |24      |0      |100  |5          |20190722|webA|
|9        |123456  |9      |100  |0          |20190722|webA|
+---------+--------+-------+-----+-----------+--------+----+
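The map-based `na.fill` also accepts values of different types in one call, so numeric and string columns can be filled together. A minimal sketch, assuming a local SparkSession; the DataFrame `courses` and the placeholder value "unknown" are hypothetical, while the column names and numeric fill values mirror the ones above.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("na-fill-map-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: two nullable integer columns and one nullable string column.
val courses = Seq[(Option[Int], Option[Int], Option[String])](
  (Some(14), None, Some("webA")),
  (None, Some(3), None)
).toDF("courseid", "pointlistid", "dn")

// Each key names a column; each value is the replacement for nulls in that column.
val filled = courses.na.fill(Map(
  "courseid" -> 123456,
  "pointlistid" -> 654321,
  "dn" -> "unknown"
))

val courseFilled: Long = filled.filter("courseid = 123456").count()
val pointFilled: Long = filled.filter("pointlistid = 654321").count()
val dnFilled: Long = filled.filter("dn = 'unknown'").count()
// After filling, no row contains a null any more.
val rowsWithoutNulls: Long = filled.na.drop().count()
```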
// Select rows where the column is null (SQL expression string)
data1.filter("gender is null").select("gender").limit(10).show
+------+
|gender|
+------+
|  null|
|  null|
|  null|
|  null|
|  null|
+------+
// Select rows where the column is not null
data1.filter("gender is not null").select("gender").limit(10).show
+------+
|gender|
+------+
|  male|
|female|
|  male|
|female|
|  male|
|  male|
|  male|
|  male|
|female|
|female|
+------+
// The same null check written with the Column API
data1.filter(data1("gender").isNull).select("gender").limit(10).show
+------+
|gender|
+------+
|  null|
|  null|
|  null|
|  null|
|  null|
+------+
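// Keep rows whose value is not an empty string; rows with null are excluded too,
// because null <> '' evaluates to null rather than true
data1.filter("gender <> ''").select("gender").limit(10).show
+------+
|gender|
+------+
|  male|
|female|
|  male|
|female|
|  male|
|  male|
|  male|
|  male|
|female|
|female|
+------+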
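The filters above check for null and for empty strings, but NaN needs its own test. The sketch below puts the three checks next to each other, assuming a local SparkSession and a hypothetical DataFrame (`rows`) with a string column and a floating-point column; `isnan` and the null-safe equality operator `<=>` are standard Spark SQL functions.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, isnan}

val spark = SparkSession.builder().master("local[*]").appName("null-nan-filter-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: one regular row, one empty string with NaN, one null.
val rows = Seq[(Option[String], Double)](
  (Some("male"), 1.0),
  (Some(""), Double.NaN),
  (None, 2.0)
).toDF("gender", "score")

val nullCount: Long = rows.filter(col("gender").isNull).count()
val notNullCount: Long = rows.filter(col("gender").isNotNull).count()

// null <> '' evaluates to null, so the null row is filtered out along with the empty one.
val nonEmptyCount: Long = rows.filter("gender <> ''").count()

// Null-safe comparison: null <=> '' is false, so negating it keeps the null row.
val notEmptyNullSafe: Long = rows.filter(!(col("gender") <=> "")).count()

// NaN is not null; it needs isnan.
val nanCount: Long = rows.filter(isnan(col("score"))).count()
val scoreNullCount: Long = rows.filter(col("score").isNull).count()
```

Which comparison is appropriate depends on whether null rows should count as "not empty" for the task at hand.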