Fixing the error TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
2022-07-31 10:13:00 【山顶夕景】
1. Problem description
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

@udf(returnType=StringType())
def bad_funify(s):
    # Fails when s is None: None + str raises TypeError
    return s + " is fun!"

countries2 = spark.createDataFrame([("Thailand", 3), (None, 4)], ["country", "id"])
countries2.withColumn("fun_country", bad_funify("country")).show()
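The UDF is meant to add a new column, fun_country, to a DataFrame that has two fields, country and id. The new column is a string of the form "<country> is fun!". Some rows, however, have no value in the country field. Such a value reaches the Python UDF as None (Spark only displays it as null), so the UDF fails with the following error: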
PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 619, in main
process()
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 611, in process
serializer.dump_stream(out_iter, outfile)
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 211, in dump_stream
self.serializer.dump_stream(self._batched(iterator), stream)
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 132, in dump_stream
for obj in iterator:
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/serializers.py", line 200, in _batched
for item in iterator:
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 452, in mapper
result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 452, in <genexpr>
result = tuple(f(*[a[o] for o in arg_offsets]) for (arg_offsets, f) in udfs)
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/worker.py", line 87, in <lambda>
return lambda *a: f(*a)
File "/usr/lib/spark-current/python/lib/pyspark.zip/pyspark/util.py", line 74, in wrapper
return f(*args, **kwargs)
File "<ipython-input-1051-5a6c51e7c332>", line 5, in bad_funify
TypeError: unsupported operand type(s) for +: 'NoneType' and 'str'
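The last line of the traceback points at the body of the UDF, not at Spark itself. As a minimal sketch, the same failure can be reproduced in plain Python without a Spark session. Here bad_funify is an undecorated copy of the function body, and the list rows is a stand-in (an assumption for illustration) for the country column:

```python
# Plain-Python stand-in for the UDF body; no Spark session is needed here.
def bad_funify(s):
    return s + " is fun!"

# The same two country values as in countries2 (illustrative stand-in).
rows = ["Thailand", None]

results = []
for value in rows:
    try:
        results.append(bad_funify(value))
    except TypeError as exc:
        # None does not support "+" with a str, so Python raises TypeError.
        results.append(f"TypeError: {exc}")

print(results)
```

The first value is concatenated normally, while the None value triggers the TypeError seen in the worker log.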
2. Solution
The fix is simple. When country is empty, fun_country should be empty as well, so the UDF only needs one extra check. The revised UDF, good_funify, looks like this:
@udf(returnType=StringType())
def good_funify(s):
    # Pass None through unchanged instead of concatenating
    return None if s is None else s + " is fun!"

countries2.withColumn("fun_country", good_funify("country")).show()
+--------+---+----------------+
| country| id|     fun_country|
+--------+---+----------------+
|Thailand|  3|Thailand is fun!|
|    null|  4|            null|
+--------+---+----------------+
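If several UDFs need the same None check, the check could be factored out once. The sketch below is one possible way to do that in plain Python. The decorator name null_safe and the function funify are hypothetical and do not come from the original post or from PySpark:

```python
import functools

def null_safe(func):
    """Hypothetical helper: return None without calling func if any argument is None."""
    @functools.wraps(func)
    def wrapper(*args):
        if any(a is None for a in args):
            return None
        return func(*args)
    return wrapper

@null_safe
def funify(s):
    # No explicit None check needed here; the decorator handles it.
    return s + " is fun!"

outputs = [funify("Thailand"), funify(None)]
print(outputs)
```

Presumably the decorated function could then be registered as a UDF in the same way as good_funify (for example by placing @udf(returnType=StringType()) above @null_safe), but that combination is an assumption and is not exercised by this sketch.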