Spark SQL Wife-Chasing Series (A First Look)
2022-07-06 20:36:00 【Several storehouses of cabbage white】
Today is another day, another day of liking you.
Small talk
I haven't written an article in a while. It has only been four or five days, but my hands were itching to write, and today's article is an easy one. It starts a series on Spark SQL. I haven't given up on the earlier wife-chasing series on RDD programming and will keep writing it. The basic action and transformation operators are already covered; accumulators, partitioning, and the other data types are still to come.
Today's update is Spark SQL. Why not keep writing about RDDs? Simple: being good at SQL helps when looking for a job.
What is Spark SQL?
If nothing goes wrong, Spark SQL will be showing up on this blog for a long time.
Let's first introduce what Spark SQL is.
The predecessor of Spark SQL was Shark. At that time Hadoop had Hive, which let you write Hive SQL (HQL) instead of MapReduce programs to do data analysis. That was very convenient and greatly lowered the difficulty of development. The Spark ecosystem had nothing comparable, so Shark was created. Later Shark was no longer maintained, and its functionality was reimplemented on top of Spark itself. That is the Spark SQL we have today.
Although Spark has its own ecosystem, it is most often run on top of HDFS. As I said before, MapReduce is out of date, but HDFS as a storage system is not; Spark simply does its work on top of HDFS.
What Spark SQL can do
I'm a SQL boy after all, so I'll explain what Spark SQL can do from an ETL point of view.
- Extract: Spark SQL can read data from file systems (HDFS, the local file system), relational databases, and non-relational databases. Supported file formats include CSV, JSON, XML, Parquet, ORC, and Avro (some of these, such as XML and Avro, may need an additional package depending on the Spark version).
- Transform: this is what is usually called data cleaning, for example dropping bad records or filling in missing values.
- Load: the processed data can be written to different data sources.
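The three ETL steps above can be sketched roughly as follows. This is only a minimal illustration, not a production pipeline: the object name `EtlSketch`, the sample rows, the temp-directory paths, and the "keep adults only" cleaning rule are all assumptions made up for the example.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object EtlSketch {
  // Transform step (hypothetical rule): drop rows without an age, keep adults only.
  def clean(raw: DataFrame): DataFrame =
    raw.na.drop(Seq("age")).filter(col("age") >= 18)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("EtlSketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical scratch locations under a temp directory.
    val workDir = Files.createTempDirectory("etl_sketch").toString
    val inputPath = s"$workDir/people_csv"
    val outputPath = s"$workDir/people_parquet"

    // Prepare a tiny CSV input so the sketch is self-contained.
    Seq[(String, Option[Int])](("Alice", Some(30)), ("Bob", None), ("Carol", Some(17)))
      .toDF("name", "age")
      .write.option("header", "true").csv(inputPath)

    // Extract: read the CSV files and let Spark infer the column types.
    val raw = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(inputPath)

    // Transform: apply the cleaning rule.
    val cleaned = clean(raw)

    // Load: write the result to a different format (Parquet).
    cleaned.write.mode("overwrite").parquet(outputPath)

    spark.read.parquet(outputPath).show()
    spark.stop()
  }
}
```

In a real job the extract and load ends would point at HDFS paths or a database instead of a temp directory; only the `read` and `write` calls would change.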
Spark SQL is mainly used to process structured data. What is structured data?
Structured data is a data set whose records carry clear structural information, where every record conforms to that structure. Logically it is expressed as a two-dimensional table. For example, think of the fields, attributes, and types of a table in a relational database.
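To make "fields and types" concrete, here is a small sketch that declares a structure explicitly and builds a DataFrame that conforms to it. The `personSchema` fields and the sample rows are hypothetical; `StructType` and `StructField` are the classes Spark SQL uses to describe a schema.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object SchemaSketch {
  // Hypothetical "person" structure: every record has these named, typed fields.
  val personSchema: StructType = StructType(Seq(
    StructField("name", StringType, nullable = false),
    StructField("age", IntegerType, nullable = true)
  ))

  // Build a DataFrame whose rows follow personSchema.
  def people(spark: SparkSession): DataFrame = {
    val rows = spark.sparkContext.parallelize(Seq(Row("Alice", 30), Row("Bob", null)))
    spark.createDataFrame(rows, personSchema)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("SchemaSketch")
      .getOrCreate()

    val df = people(spark)
    df.printSchema() // prints the field names and types declared above
    df.show()
    spark.stop()
  }
}
```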
Key points of Spark SQL
DataFrame. You can analyze data through the DataFrame API, which is called the DSL (domain-specific language).
You can also register a DataFrame as a table and then use SQL or Hive SQL to do data analysis. People who are already fluent in SQL can get started more quickly this way.
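As a quick preview, the two styles might look like this side by side. It is a minimal sketch under assumptions: the `people` data, the view name `people`, and the "adults only" query are invented for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object DslVsSqlSketch {
  // DSL style: the query is expressed through DataFrame API calls.
  def adultsDsl(people: DataFrame): DataFrame =
    people.filter(col("age") >= 18).select("name")

  // SQL style: register the DataFrame as a temporary view, then query it with SQL.
  def adultsSql(spark: SparkSession, people: DataFrame): DataFrame = {
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age >= 18")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("DslVsSqlSketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val people = Seq(("Alice", 30), ("Bob", 15)).toDF("name", "age")

    adultsDsl(people).show()
    adultsSql(spark, people).show()
    spark.stop()
  }
}
```

Both functions describe the same query; which one to use is mostly a matter of taste and of how comfortable you are with SQL.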
Now that DataFrame has come up, let's introduce it.
When we learned about RDDs earlier, an RDD was a collection of data, but Spark did not know what was inside each piece of data. A DataFrame is different: it clearly states that each record consists of several named fields. A picture makes the comparison easier to see.

An RDD only knows that it stores objects of type Person; a DataFrame stores the information inside each Person object as named columns. A DataFrame is swimming naked, while an RDD is fully clothed: one glance at it and you can't see anything.
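The difference can be seen in a few lines of code. This is a sketch under assumptions: the `Person` case class and its sample values are hypothetical stand-ins for the objects in the comparison above.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical record type used for the comparison.
case class Person(name: String, age: Int)

object RddVsDataFrameSketch {
  // Convert a collection of Person objects into a DataFrame with named columns.
  def toFrame(spark: SparkSession, people: Seq[Person]): DataFrame = {
    import spark.implicits._
    spark.sparkContext.parallelize(people).toDF()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("RddVsDataFrameSketch")
      .getOrCreate()

    val sample = Seq(Person("Alice", 30), Person("Bob", 25))

    // RDD: Spark only sees opaque Person objects. To reach a field you run
    // ordinary Scala code on each object.
    val rdd: RDD[Person] = spark.sparkContext.parallelize(sample)
    rdd.map(_.name).collect().foreach(println)

    // DataFrame: the fields are visible to Spark as columns with names and types.
    val df = toFrame(spark, sample)
    df.printSchema()
    df.select("name").show()

    spark.stop()
  }
}
```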
SparkSession
The starting point of Spark SQL programming is SparkSession. When we first learned Spark Core, the entry point was the SparkContext, created from a SparkConf. For Spark SQL the entry point has changed.
A SparkSession can create DataFrame objects, read external files, and run query analysis through SQL.

SparkConf: holds the configuration used to create the context object
SparkSession.builder(): creates the SparkSession object
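The relationship between the old and new entry points can be sketched like this. The object name `EntryPointSketch` and the app name are assumptions for the example; the point is that a SparkSession wraps a SparkContext, so RDD code stays reachable from it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SparkSession

object EntryPointSketch {
  // Build (or reuse) a SparkSession from an existing SparkConf.
  def build(conf: SparkConf): SparkSession =
    SparkSession.builder().config(conf).getOrCreate()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("EntryPointSketch")
    val spark = build(conf)

    // The Spark Core entry point is still available through the session.
    val sc: SparkContext = spark.sparkContext
    println(sc.parallelize(1 to 5).sum()) // RDD API via SparkContext

    // The Spark SQL side hangs off the session itself.
    println(spark.range(5).count())       // Dataset API via SparkSession

    spark.stop()
  }
}
```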
Creating a DataFrame
We create a DataFrame object by reading a JSON file. Let's look at the JSON file first.

Let's run it
Read the people.json file to create a DataFrame object:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Run locally, using all available cores
val sparkConf = new SparkConf().setMaster("local[*]").setAppName("Spark_Sql")
val sparkSession = SparkSession.builder().config(sparkConf).getOrCreate()
import sparkSession.implicits._
// Read the JSON file into a DataFrame
val frame = sparkSession.read.json("date/people.json")
How do we display the data in this file?
frame.show()

As you can see, the output looks just like the result of a table query.
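One detail worth knowing when reading JSON: by default `spark.read.json` expects one JSON object per line, and it infers the schema from the data. The sketch below writes its own sample file so it can run anywhere; the file contents and the object name `ReadJsonSketch` are hypothetical and are not the people.json used above.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import org.apache.spark.sql.{DataFrame, SparkSession}

object ReadJsonSketch {
  // Write hypothetical sample data: one JSON object per line.
  def writeSample(): Path = {
    val file = Files.createTempFile("people", ".json")
    val lines = Seq(
      """{"name":"Alice","age":30}""",
      """{"name":"Bob"}"""
    ).mkString("\n")
    Files.write(file, lines.getBytes(StandardCharsets.UTF_8))
    file
  }

  // Read the file into a DataFrame; column names and types are inferred.
  def load(spark: SparkSession, path: Path): DataFrame =
    spark.read.json(path.toString)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ReadJsonSketch")
      .getOrCreate()

    val df = load(spark, writeSample())
    df.printSchema() // shows the inferred fields
    df.show()        // Bob has no "age" key, so that cell is null

    spark.stop()
  }
}
```

If a file holds a single pretty-printed JSON document spread over several lines, the reader's `multiLine` option can be set to `true` instead.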
Summary
Tomorrow I'll cover the DSL and SQL statements in Spark SQL.
That's all for today's update.
I hope I can book my Subject 4 test successfully.