Animation scoring data analysis and visualization, and IT industry recruitment data analysis and visualization
2022-07-05 05:24:00 【Data ape from zero】
Data visualization course project
1, Animation scoring data analysis and visualization
2, IT industry recruitment data analysis and visualization
1, Animation scoring data analysis and visualization
1.1 Data capture
Upload the captured file to the ${HIVE_HOME}/mydata directory.
1.2 Hive table creation and data import
1.2.1 Create the cartoon_info table and import the data
CREATE EXTERNAL TABLE Json( data string );
Load the data into the Json table for later use
load data local inpath 'mydata/infos_total.json' overwrite into table Json;
Create the cartoon_info table
drop table if exists cartoon_info;
CREATE EXTERNAL TABLE cartoon_info(
  `ssid` string,
  `cartoon` string,
  `views` bigint,
  `coins` int,
  `follow` int,
  `series_follow` int,
  `danmakus` int,
  `likes` int,
  `favorite` int,
  `favorites` int,
  `reply` int,
  `share` int,
  `cover` string,
  `url` string,
  `episodes` int,
  `count` int,
  `is_finish` int,
  `pub_time` TIMESTAMP,
  `media_tags` string,
  `voice_actor` string,
  `score` float
)
stored as parquet
location '/warehouse/cartoon_info';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
insert overwrite table cartoon_info
select json_tuple(json, 'ssid', 'cartoon', 'views', 'coins', 'follow', 'series_follow', 'danmakus', 'likes', 'favorite', 'favorites', 'reply', 'share', 'cover', 'url', 'episodes', 'count', 'is_finish', 'pub_time', 'media_tags', 'voice_actor', 'score')
from (
  select explode(split(regexp_replace(regexp_replace(data, '\\[|\\]', ''), '\\}\\, \\{', '\\}\\;\\{'), '\\;')) as json
  from Json
) a;
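To see what the nested regexp_replace/split expression does before json_tuple runs, the following Python sketch mirrors it on a tiny hand-written sample. The function name split_json_array and the sample records are hypothetical and only for illustration; the real infos_total.json has more fields.

```python
import json
import re
from typing import List


def split_json_array(data: str, sep: str = ";") -> List[str]:
    """Mirror the Hive expression: drop brackets, mark object boundaries, split."""
    # regexp_replace(data, '\\[|\\]', ''): remove every '[' and ']'.
    no_brackets = re.sub(r"\[|\]", "", data)
    # regexp_replace(..., '\\}\\, \\{', '\\}\\;\\{'): turn '}, {' into '};{'.
    marked = no_brackets.replace("}, {", "}" + sep + "{")
    # split(..., '\\;'): one string per JSON object.
    return marked.split(sep)


# Hypothetical sample with two records (not taken from the real file).
sample = (
    '[{"ssid": "1", "cartoon": "A", "score": "9.8"}, '
    '{"ssid": "2", "cartoon": "B", "score": "9.1"}]'
)
rows = [json.loads(obj) for obj in split_json_array(sample)]
print(rows)
```

Note that the first step removes all square brackets, not only the outer pair, so an array-valued field loses its brackets as well, and the split relies on objects being separated by exactly `}, {` and on the data containing no other `;`.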
1.2.2 Create the cartoon_comments table
CREATE EXTERNAL TABLE Json2( data string );
Load the data into the Json2 table for later use
load data local inpath 'mydata/comments_total.json' overwrite into table Json2;
Create the cartoon_comments table and import the data
drop table if exists cartoon_comments;
CREATE EXTERNAL TABLE cartoon_comments(
  `mid` string,
  `uname` string,
  `ssid` string,
  `message` string,
  `like` int,
  `dt` timestamp
)
stored as parquet
location '/warehouse/cartoon_comments';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
insert overwrite table cartoon_comments
select json_tuple(json, 'mid', 'uname', 'ssid', 'message', 'like', 'dt')
from (
  select explode(split(regexp_replace(regexp_replace(data, '\\[|\\]', ''), '\\}\\, \\{', '\\}\\;\\;\\;\\{'), '\\;\\;\\;')) as json
  from Json2
) a;
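The comments import uses `;;;` rather than `;` as the temporary separator. The source does not say why; a likely reason, stated here only as an assumption, is that comment text can itself contain `;`. The sketch below uses a hypothetical helper (split_objects) and made-up comments to show how a single `;` would cut such a record in two.

```python
import json


def split_objects(flat: str, sep: str) -> list:
    """Split a bracket-free '{...}, {...}' string on a temporary separator."""
    return flat.replace("}, {", "}" + sep + "{").split(sep)


# Made-up comments; the first message contains a semicolon.
flat = (
    '{"uname": "u1", "message": "great; must watch"}, '
    '{"uname": "u2", "message": "ok"}'
)

with_single = split_objects(flat, ";")
with_triple = split_objects(flat, ";;;")

print(len(with_single))  # 3 pieces: the first record was cut inside its message
print(len(with_triple))  # 2 pieces: one per comment
print([json.loads(piece)["uname"] for piece in with_triple])
```

A longer separator only lowers the risk: a message that happens to contain `;;;` would still be split incorrectly.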
2, IT industry recruitment data analysis and visualization
2.1 Data capture
1, You need to log in to Lagou first. Replace the Cookie with your own and make sure it contains no Chinese characters, otherwise an error will be raised. If the Cookie does not take effect, open another Lagou page and get the Cookie from there.
2, If an error is reported, open Lagou and check whether it is asking for verification.
Upload the captured file to the ${HIVE_HOME}/mydata directory.
2.2 Hive table creation and data import
CREATE EXTERNAL TABLE Json3( data string );
Load the data into the Json3 table for later use
load data local inpath 'mydata/jobsInfo.json' overwrite into table Json3;
2.2.1 Create the jobs_info table and import the data
drop table if exists jobs_info;
CREATE EXTERNAL TABLE jobs_info(
  `job` string,
  `keyword` string,
  `place` string,
  `requirement` string,
  `salary` string,
  `tags` string,
  `welfare` string,
  `pubtime` date
)
stored as parquet
location '/warehouse/jobs_info';
Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)
insert overwrite table jobs_info
select json_tuple(json, 'job', 'keyword', 'place', 'requirement', 'salary', 'tags', 'welfare', 'pubtime')
from (
  select explode(split(regexp_replace(regexp_replace(data, '\\[|\\]', ''), '\\}\\, \\{', '\\}\\;\\{'), '\\;')) as json
  from Json3
) a;
3, Data analysis and visualization
3.1 Connecting to Hive with PyHive
Tutorial: Installing sasl, thrift, and thrift-sasl in Python and connecting with PyHive
Connection code: PyHive
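Since the connection code is only linked, here is a minimal sketch of what a PyHive query typically looks like. The host, port, username, and database defaults are placeholders, not values from this post, and the helper names connect_hive and rows_as_dicts are hypothetical. pyhive is imported inside the function so that rows_as_dicts can be tried with any DB-API cursor.

```python
def connect_hive(host="localhost", port=10000, username="hive", database="default"):
    """Open a HiveServer2 connection (all defaults are placeholder assumptions)."""
    from pyhive import hive  # needs sasl, thrift and thrift-sasl installed

    return hive.Connection(host=host, port=port, username=username, database=database)


def rows_as_dicts(cursor, sql):
    """Run a query on a DB-API cursor and return each row as a dict."""
    cursor.execute(sql)
    # Hive may report columns as 'table.column'; keep only the column part.
    columns = [desc[0].split(".")[-1] for desc in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

With HiveServer2 running, a call such as `rows_as_dicts(connect_hive().cursor(), "select cartoon, score from cartoon_info limit 10")` would return one dict per row, ready to pass to pandas or a chart.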
3.2 Data analysis and visualization
Install the required packages
pip install pandas==0.23.4
pip install pyecharts==1.9.1
pip install matplotlib==3.5.1
pip install numpy==1.18.5
pip install jieba==0.42.1
pip install squarify==0.4.3
1, Animation scoring data analysis and visualization
Data analysis code: bilibili
The code covers 10 chart types in total: rose chart, word cloud, radar chart, scatter plot, funnel chart, ring chart, bar chart, treemap, matchstick (lollipop) chart, and subplots. It consists of 4 matplotlib charts and 6 pyecharts charts, each with a brief analysis.
2, IT industry recruitment data analysis and visualization
Data analysis code: IT
The code covers 10 chart types in total: rose chart, word cloud, pictogram, scatter plot, funnel chart, ring chart, bar chart, treemap, matchstick (lollipop) chart, and subplots. It consists of 4 matplotlib charts and 6 pyecharts charts, each with a brief analysis.
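As an illustration of one of the listed chart types, the sketch below prepares counts for a rose chart and builds it with pyecharts. It is not the linked analysis code: the helper names top_counts and build_rose_chart are hypothetical, the city values are made up to stand in for the `place` column of jobs_info, and using pyecharts (rather than matplotlib) for the rose chart is an assumption.

```python
from collections import Counter


def top_counts(values, n=5):
    """Return the n most frequent non-empty values as (value, count) pairs."""
    return Counter(v for v in values if v).most_common(n)


def build_rose_chart(pairs, title):
    """Build a pyecharts rose (Nightingale) chart from (label, count) pairs."""
    from pyecharts import options as opts
    from pyecharts.charts import Pie

    return (
        Pie()
        .add("", [list(p) for p in pairs], radius=["30%", "75%"], rosetype="radius")
        .set_global_opts(title_opts=opts.TitleOpts(title=title))
    )


# Made-up values standing in for the `place` column of jobs_info.
places = ["Beijing", "Shanghai", "Beijing", "Shenzhen", "Beijing", "Shanghai", ""]
pairs = top_counts(places, n=3)
print(pairs)
```

Calling `build_rose_chart(pairs, "Jobs by city").render("rose.html")` would then write the chart to an HTML file.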