List of Spark DataFrame operations
2022-07-29 04:56:00 【Alex_ 81D】
Action operations
1、 collect() Returns an Array[Row] containing all rows of the DataFrame.
2、 collectAsList() Returns a java.util.List[Row] containing all rows of the DataFrame.
3、 count() Returns a Long: the number of rows in the DataFrame.
4、 describe(cols: String*) Returns a DataFrame of summary statistics (count, mean, stddev, min, and max) for the given columns. Several column names can be passed, separated by commas. Null values are left out of the calculation, and the statistics are intended for numeric columns. For example: df.describe("age", "height").show()
5、 first() Returns the first row, as a Row.
6、 head() Returns the first row, as a Row (same as first()).
7、 head(n: Int) Returns the first n rows, as an Array[Row].
8、 show() Prints the first 20 rows of the DataFrame in tabular form. The return type is Unit.
9、 show(n: Int) Prints the first n rows. The return type is Unit.
10、 take(n: Int) Returns the first n rows, as an Array[Row] (same as head(n)).
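The actions listed above can be tried out with a small script. This is only a minimal sketch, assuming Spark 2.x or later (where the entry point is SparkSession) and a script-style environment such as spark-shell; the sample data, the app name and the local[*] master are made up for illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Local session for experimentation; master and app name are arbitrary choices.
val spark = SparkSession.builder().master("local[*]").appName("actions-demo").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (name, age, height).
val df = Seq(("Ann", 31, 165.0), ("Bob", 25, 180.0), ("Cid", 40, 172.0))
  .toDF("name", "age", "height")

val allRows: Array[Row] = df.collect()                    // every row, pulled to the driver
val asJavaList: java.util.List[Row] = df.collectAsList()  // same rows as a Java list
val rowCount: Long = df.count()                           // number of rows
val firstRow: Row = df.first()                            // same as df.head()
val firstTwo: Array[Row] = df.take(2)                     // same as df.head(2)

df.describe("age", "height").show()                       // count, mean, stddev, min, max
df.show(2)                                                // prints 2 rows, returns Unit
```

Note that collect(), collectAsList(), head(n) and take(n) bring rows to the driver, so on large data it is usually safer to use show(n) or take(n) with a small n.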
Basic DataFrame operations
1、 cache() Caches the DataFrame's data in memory.
2、 columns Returns an Array[String] containing the names of all columns.
3、 dtypes Returns an Array[(String, String)] containing the name and type of every column.
4、 explain() Prints the physical execution plan.
5、 explain(extended: Boolean) Takes true or false and returns Unit. The default is false; if true is passed, both the logical and the physical plans are printed.
6、 isLocal Returns a Boolean: true if the collect and take methods can be run locally (without any Spark executors), otherwise false.
7、 persist(newLevel: StorageLevel) Persists the DataFrame with the given storage level and returns DataFrame.this.type.
8、 printSchema() Prints the field names and types as a tree.
9、 registerTempTable(tableName: String) Returns Unit. Registers the DataFrame as a temporary table under the given name. The table's lifetime is tied to the SQLContext that created the DataFrame.
10、 schema Returns a StructType describing the field names and types.
11、 toDF() Returns a new DataFrame.
12、 toDF(colNames: String*) Returns a new DataFrame whose columns are renamed to the given names.
13、 unpersist() Returns DataFrame.this.type. Removes the DataFrame's data from the cache without blocking.
14、 unpersist(blocking: Boolean) Returns DataFrame.this.type. With true, the call waits until all cached blocks have been removed; with false it behaves like unpersist().
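The basic operations above fit together roughly as follows. This is a sketch under assumptions rather than a reference: it targets Spark 2.x or later, so it uses createOrReplaceTempView in place of registerTempTable (which is deprecated there), and the sample data and the view name "people" are invented for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

val spark = SparkSession.builder().master("local[*]").appName("basics-demo").getOrCreate()
import spark.implicits._

// Hypothetical sample data: (name, age).
val df = Seq(("Ann", 31), ("Bob", 25)).toDF("name", "age")

val names: Array[String] = df.columns            // column names
val types: Array[(String, String)] = df.dtypes   // (column name, type name) pairs
df.printSchema()                                 // schema as a tree
df.explain(true)                                 // logical and physical plans

df.persist(StorageLevel.MEMORY_AND_DISK)         // mark for caching; filled by the next action
df.count()
df.unpersist(blocking = true)                    // wait until the cached blocks are removed

// Spark 2.x+ replacement for registerTempTable.
df.createOrReplaceTempView("people")
val adults = spark.sql("SELECT name FROM people WHERE age > 30")

// toDF with names renames the columns positionally.
val renamed = df.toDF("full_name", "years")
```

cache() is shorthand for persist() with the default storage level, so an explicit StorageLevel is only needed when the default is not what you want.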
Integrated queries:
1、 agg(exprs: Column*) Returns a DataFrame. Aggregates over the whole DataFrame without grouping; df.agg(...) is shorthand for df.groupBy().agg(...).
df.agg(max("age"), avg("salary"))
df.groupBy().agg(max("age"), avg("salary"))
2、 agg(exprs: Map[String, String]) Returns a DataFrame. Same aggregation, expressed as a Map from column name to aggregate function name.
df.agg(Map("age" -> "max", "salary" -> "avg"))
df.groupBy().agg(Map("age" -> "max", "salary" -> "avg"))
3、 agg(aggExpr: (String, String), aggExprs: (String, String)*) Returns a DataFrame. Same aggregation, expressed as (column name, aggregate function name) pairs.
df.agg("age" -> "max", "salary" -> "avg")
df.groupBy().agg("age" -> "max", "salary" -> "avg")
4、 apply(colName: String) Returns a Column: selects the column with the given name.
5、 as(alias: String) Returns a new DataFrame with the given alias set.
6、 col(colName: String) Returns a Column: selects the column with the given name.
7、 cube(col1: String, cols: String*) Returns a GroupedData object: builds a multi-dimensional cube over the given columns so that aggregations can be run on it.
8、 distinct Returns a DataFrame with duplicate rows removed.
9、 drop(col: Column) Returns a DataFrame with the given column removed.
10、 dropDuplicates(colNames: Array[String]) Returns a DataFrame with duplicate rows removed, comparing only the given columns.
11、 except(other: DataFrame) Returns a DataFrame containing the rows that exist in this DataFrame but not in the other one.
12、 explode[A, B](inputColumn: String, outputColumn: String)(f: (A) ⇒ TraversableOnce[B])(implicit arg0: scala.reflect.api.JavaUniverse.TypeTag[B]) Returns a DataFrame. Applies f to each value of the input column and emits one row per returned element, so a single field is split into several rows.
df.explode("name", "names") { name: String => name.split(" ") }.show()
This splits the name field on spaces and puts each resulting piece into the names column.
13、 filter(conditionExpr: String) Filters rows by a condition and returns a DataFrame. df.filter("age>10").show(), df.filter(df("age")>10).show() and df.where(df("age")>10).show() all work.
14、 groupBy(col1: String, cols: String*) Groups rows by the given columns and returns a GroupedData object on which aggregations can be run. df.groupBy("age").agg(Map("age" -> "count")).show() and df.groupBy("age").avg().show() both work.
15、 intersect(other: DataFrame) Returns a DataFrame containing the rows that exist in both DataFrames.
16、 join(right: DataFrame, joinExprs: Column, joinType: String)
The first argument is the DataFrame to join with, the second is the join condition, and the third is the join type: inner, outer, left_outer, right_outer, or leftsemi.
df.join(ds, df("name") === ds("name") and df("age") === ds("age"), "outer").show()
17、 limit(n: Int) Returns a DataFrame containing the first n rows.
18、 na Returns a DataFrameNaFunctions object for handling missing data. For example, df.na.drop().show() drops the rows that contain null values.
19、 orderBy(sortExprs: Column*) Sorts by the given expressions; an alias of sort.
20、 select(cols: Column*) Selects columns or column expressions. df.select($"colA", $"colB" + 1)
21、 selectExpr(exprs: String*) Selects columns using SQL expressions. df.selectExpr("name", "name as names", "upper(name)", "age+1").show()
22、 sort(sortExprs: Column*) Sorts by the given expressions. df.sort(df("age").desc).show() The default order is ascending.
23、 unionAll(other: DataFrame) Appends the rows of the other DataFrame. df.unionAll(ds).show()
24、 withColumnRenamed(existingName: String, newName: String) Renames a column. df.withColumnRenamed("name", "names").show()
25、 withColumn(colName: String, col: Column) Adds a column. df.withColumn("aa", df("name")).show()
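Several of the query operations above can be combined in one short script. The following is a minimal sketch, assuming Spark 2.x or later; the two sample DataFrames (df and ds) and all column values are hypothetical. Because the DataFrame-level explode method is deprecated since Spark 2.0, the sketch swaps in functions.explode together with split to get the same "one row per word" effect.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, explode, max, split}

val spark = SparkSession.builder().master("local[*]").appName("query-demo").getOrCreate()
import spark.implicits._

// Hypothetical sample data.
val df = Seq(("Ann Lee", 31, 5000.0), ("Bob Ray", 25, 4000.0), ("Cid Orr", 40, 6500.0))
  .toDF("name", "age", "salary")
val ds = Seq(("Ann Lee", 31, "Sales"), ("Dee Fox", 52, "Ops"))
  .toDF("name", "age", "dept")

// The three agg overloads compute the same aggregates.
val viaColumns = df.agg(max("age"), avg("salary"))
val viaMap     = df.agg(Map("age" -> "max", "salary" -> "avg"))
val viaPairs   = df.agg("age" -> "max", "salary" -> "avg")

// filter and where are interchangeable.
val over30    = df.filter("age > 30")
val over30Col = df.where(df("age") > 30)

// Number of rows per age value.
val perAge = df.groupBy("age").count()

// Join on two columns; "inner" keeps only rows that match on both sides.
val joined = df.join(ds, df("name") === ds("name") && df("age") === ds("age"), "inner")

// One output row per space-separated word of name (replacement for df.explode).
val words = df.select($"name", explode(split($"name", " ")).as("names"))

// Add a column, rename a column, sort descending, keep the top two rows.
val reshaped = df
  .withColumn("age_next_year", df("age") + 1)
  .withColumnRenamed("name", "full_name")
  .sort($"age".desc)
  .limit(2)
```

Each of these calls only builds a new DataFrame; nothing is computed until an action such as show() or collect() is called on the result.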