Animation scoring data analysis and visualization, and IT industry recruitment data analysis and visualization
2022-07-05 05:24:00 【Data ape from zero】
Data visualization course design
1, Animation scoring data analysis and visualization
2, IT industry recruitment data analysis and visualization
1, Animation scoring data analysis and visualization
1.1 Data capture
Upload the captured files to the ${HIVE_HOME}/mydata directory.
1.2 Hive table creation and data import
1.2.1 Create the cartoon_info table and import the data
CREATE EXTERNAL TABLE Json( data string );
Load the data into the Json staging table
load data local inpath 'mydata/infos_total.json' overwrite into table Json;
Create the cartoon_info table
drop table if exists cartoon_info;
CREATE EXTERNAL TABLE cartoon_info(
  `ssid` string,
  `cartoon` string,
  `views` bigint,
  `coins` int,
  `follow` int,
  `series_follow` int,
  `danmakus` int,
  `likes` int,
  `favorite` int,
  `favorites` int,
  `reply` int,
  `share` int,
  `cover` string,
  `url` string,
  `episodes` int,
  `count` int,
  `is_finish` int,
  `pub_time` TIMESTAMP,
  `media_tags` string,
  `voice_actor` string,
  `score` float
)
stored as parquet
location '/warehouse/cartoon_info';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
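To make the parsing step easier to follow, here is a minimal Python sketch of what the Hive expression does to one line of the staging table: strip the square brackets, mark each object boundary with a temporary delimiter, split on that delimiter, then read the fields from each piece. The function name `split_json_array` and the sample line are made up for illustration; only the field names come from the table definition.

```python
import json
import re

def split_json_array(data, delimiter=";"):
    """Mimic regexp_replace + regexp_replace + split from the Hive query."""
    # First regexp_replace: drop every '[' and ']' (including any inside values).
    no_brackets = re.sub(r"\[|\]", "", data)
    # Second regexp_replace: turn each '}, {' boundary into '}' + delimiter + '{'.
    marked = no_brackets.replace("}, {", "}" + delimiter + "{")
    # split(...): one JSON object per piece; explode() would emit one row per piece.
    return marked.split(delimiter)

# Made-up sample line using three of the cartoon_info field names.
sample = '[{"ssid": "1", "cartoon": "A", "score": 9.5}, {"ssid": "2", "cartoon": "B", "score": 8.1}]'

rows = [json.loads(piece) for piece in split_json_array(sample)]
for row in rows:
    # json_tuple picks the named keys from each object; a dict lookup plays that role here.
    print(row["ssid"], row["cartoon"], row["score"])
```

Because the first step removes all brackets, a field whose value itself contains `[` or `]` would lose them, both in this sketch and in the Hive expression.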
insert overwrite table cartoon_info
select json_tuple(json,'ssid' ,'cartoon' ,'views' ,'coins' ,'follow' ,'series_follow' ,'danmakus' ,'likes' ,'favorite' ,'favorites' ,'reply' ,'share' ,'cover' ,'url','episodes' ,'count' ,'is_finish' ,'pub_time','media_tags','voice_actor','score') from (
select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\{' ) ,'\\;')) as json from Json
)a;
1.2.2 Create the cartoon_comments table
CREATE EXTERNAL TABLE Json2( data string );
Load the data into the Json2 staging table
load data local inpath 'mydata/comments_total.json' overwrite into table Json2;
Create the cartoon_comments table and import the data
drop table if exists cartoon_comments;
CREATE EXTERNAL TABLE cartoon_comments(
  `mid` string,
  `uname` string,
  `ssid` string,
  `message` string,
  `like` int,
  `dt` timestamp
)
stored as parquet
location '/warehouse/cartoon_comments';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
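The statement for the comments table uses `;;;` rather than `;` as the temporary delimiter. The post does not say why. One plausible reason, and it is only an assumption, is that free-text messages can themselves contain semicolons. The sketch below shows the effect on a made-up comment line; `split_objects` is a hypothetical helper that mirrors the Hive expression.

```python
import re

def split_objects(data, delimiter):
    """Strip brackets, mark '}, {' boundaries with the delimiter, then split."""
    no_brackets = re.sub(r"\[|\]", "", data)
    return no_brackets.replace("}, {", "}" + delimiter + "{").split(delimiter)

# Made-up sample: the first message contains a semicolon.
sample = '[{"uname": "u1", "message": "good; very good"}, {"uname": "u2", "message": "ok"}]'

single = split_objects(sample, ";")    # the message is cut in two
triple = split_objects(sample, ";;;")  # each object stays whole

print(len(single), len(triple))
```

A message that contains `;;;` itself would still break the split, so the longer delimiter lowers the risk without removing it.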
insert overwrite table cartoon_comments
select json_tuple(json,'mid' ,'uname' ,'ssid' ,'message' ,'like' ,'dt' ) from (
select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\;\\;\\{' ) ,'\\;\\;\\;')) as json from Json2
)a;
2, IT industry recruitment data analysis and visualization
2.1 Data capture
1, You need to log in to Lagou (拉勾网). Replace the Cookie with your own and make sure it contains no Chinese characters, otherwise an error is raised. If the Cookie does not take effect, open another Lagou page and copy the Cookie from there.
2, If an error is reported, open Lagou and check whether it is asking for verification.
Upload the captured file to the ${HIVE_HOME}/mydata directory.
2.2 Hive table creation and data import
CREATE EXTERNAL TABLE Json3( data string );
Load the data into the Json3 staging table
load data local inpath 'mydata/jobsInfo.json' overwrite into table Json3;
2.2.1 Create the jobs_info table and import the data
drop table if exists jobs_info;
CREATE EXTERNAL TABLE jobs_info(
  `job` string,
  `keyword` string,
  `place` string,
  `requirement` string,
  `salary` string,
  `tags` string,
  `welfare` string,
  `pubtime` date
)
stored as parquet
location '/warehouse/jobs_info';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
insert overwrite table jobs_info
select json_tuple(json,'job' ,'keyword' ,'place' ,'requirement' ,'salary' ,'tags' ,'welfare' ,'pubtime') from (
select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\{' ) ,'\\;')) as json from Json3
)a;
3, Data analysis and visualization
3.1 PyHive tutorial for connecting to Hive:
Installing sasl, thrift, and thrift-sasl for Python and connecting with PyHive
Connection code: Pyhive
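The linked connection code is not reproduced here. As a rough sketch of what connecting usually looks like, the snippet below opens a HiveServer2 connection with PyHive and reads a query result through the standard DB-API cursor. The host, port, username, and database values are placeholders, and `connect_hive` and `fetch_rows` are hypothetical helper names, not the author's code.

```python
def connect_hive(host="localhost", port=10000, username="hive", database="default"):
    """Open a HiveServer2 connection with PyHive (all defaults are placeholders)."""
    # Imported lazily so fetch_rows can be used and tested without PyHive installed.
    from pyhive import hive
    return hive.Connection(host=host, port=port, username=username, database=database)

def fetch_rows(conn, sql):
    """Run a query on any DB-API connection and return (column_names, rows)."""
    cursor = conn.cursor()
    try:
        cursor.execute(sql)
        columns = [col[0] for col in cursor.description]
        return columns, cursor.fetchall()
    finally:
        cursor.close()

# Example use against the tables created above (requires a running HiveServer2):
# conn = connect_hive()
# columns, rows = fetch_rows(conn, "select cartoon, score from cartoon_info limit 5")
# conn.close()
```

The column names and rows returned by `fetch_rows` can be passed straight to `pandas.DataFrame(rows, columns=columns)` for the analysis step.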
3.2 Data analysis and Visualization
Install the required packages
pip install pandas==0.23.4
pip install pyecharts==1.9.1
pip install matplotlib==3.5.1
pip install numpy==1.18.5
pip install jieba==0.42.1
pip install squarify==0.4.3
1, Animation scoring data analysis and visualization. Data analysis code: bilibili
The code produces 10 chart types ["rose chart", "word cloud", "radar chart", "scatter plot", "funnel chart", "donut chart", "bar chart", "treemap", "stem plot", "subplots"]: 4 matplotlib figures and 6 pyecharts charts, each with a brief analysis.
2, IT industry recruitment data analysis and visualization. Data analysis code: IT
The code produces 10 chart types ["rose chart", "word cloud", "pictorial bar chart", "scatter plot", "funnel chart", "donut chart", "bar chart", "treemap", "stem plot", "subplots"]: 4 matplotlib figures and 6 pyecharts charts, each with a brief analysis.
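The linked analysis code is not shown in this post. To give a sense of one of the listed chart types, here is a minimal sketch of a pyecharts rose chart built from score counts. It assumes scores lie on a 0 to 10 scale, uses made-up sample values, and the helper names `score_band`, `rose_data`, and `render_rose` are hypothetical.

```python
from collections import Counter

def score_band(score):
    """Map a score on an assumed 0-10 scale to a band label, e.g. 9.5 -> '9-10'."""
    low = min(int(score), 9)  # a score of exactly 10 falls into the top band
    return "{}-{}".format(low, low + 1)

def rose_data(scores):
    """Count scores per band and return (label, count) pairs sorted by label."""
    counts = Counter(score_band(s) for s in scores)
    return sorted(counts.items())

def render_rose(data_pair, path="rose.html"):
    """Render the pairs as a rose chart; pyecharts is imported only when called."""
    from pyecharts import options as opts
    from pyecharts.charts import Pie
    chart = (
        Pie()
        .add("", data_pair, radius=["20%", "70%"], rosetype="radius")
        .set_global_opts(title_opts=opts.TitleOpts(title="Scores by band"))
    )
    return chart.render(path)

# Made-up scores standing in for the score column of cartoon_info.
scores = [9.5, 9.1, 8.7, 8.2, 8.9, 7.4, 10.0]
pairs = rose_data(scores)
print(pairs)
```

Calling `render_rose(pairs)` writes an HTML file that can be opened in a browser. In pyecharts, a rose chart is a `Pie` with the `rosetype` argument set.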