
Animation scoring data analysis and visualization, and IT industry recruitment data analysis and visualization

2022-07-05 05:24:00 Data ape from zero

Data visualization course design

1, Animation scoring data analysis and visualization

Visualization preview link

2, IT industry recruitment data analysis and visualization

Visualization preview link

1, Animation scoring data analysis and visualization

1.1 Data capture

BilibiliSpider

Upload the scraped files to the ${HIVE_HOME}/mydata directory.

1.2 Hive table creation and import

Hive table field information

1.2.1 Create the cartoon_info table and import the data

CREATE EXTERNAL TABLE Json(
 data string
);

Load the data into the Json staging table

load data local inpath 'mydata/infos_total.json' overwrite into table Json;

Create the cartoon_info table

drop table if exists cartoon_info;
CREATE EXTERNAL TABLE cartoon_info(
`ssid` string,
`cartoon` string,
`views` bigint,
`coins` int,
`follow` int,
`series_follow` int,
`danmakus` int,
`likes` int,
`favorite` int,
`favorites` int,
`reply` int,
`share` int,
`cover` string,
`url` string,
`episodes` int,
`count` int,
`is_finish` int,
`pub_time` TIMESTAMP,
`media_tags` string,
`voice_actor` string,
`score` float
)
stored as parquet
location '/warehouse/cartoon_info';

Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)

insert overwrite table cartoon_info

select json_tuple(json,'ssid' ,'cartoon' ,'views' ,'coins' ,'follow' ,'series_follow' ,'danmakus' ,'likes' ,'favorite' ,'favorites' ,'reply' ,'share' ,'cover' ,'url','episodes' ,'count' ,'is_finish' ,'pub_time','media_tags','voice_actor','score') from (
select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\{' )  ,'\\;'))  as json from Json
)a;
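The nested regexp_replace / split / explode expression above is dense, so here is a minimal Python sketch of the same string handling. It assumes each staging row holds a whole JSON array whose objects are separated by "}, {". The sample string and the helper names (split_json_array, json_tuple) are hypothetical and only mirror what the Hive functions do; this is not the author's code.

```python
import json
import re


def split_json_array(data, delimiter=";"):
    """Mimic the Hive expression: strip brackets, mark object boundaries, split."""
    # regexp_replace(data, '\\[|\\]', ''): drop every '[' and ']' character.
    no_brackets = re.sub(r"\[|\]", "", data)
    # regexp_replace(..., '\\}\\, \\{', '\\}\\;\\{'): turn '}, {' into '};{'.
    marked = re.sub(r"\}, \{", lambda m: "}" + delimiter + "{", no_brackets)
    # split(..., '\\;') followed by explode(): one JSON object per element.
    return marked.split(delimiter)


def json_tuple(json_text, *keys):
    """Mimic Hive's json_tuple: return the requested top-level keys as strings."""
    obj = json.loads(json_text)
    return tuple(None if obj.get(k) is None else str(obj[k]) for k in keys)


# Hypothetical sample in the assumed file layout (not real scraped data).
sample = '[{"ssid": "1", "cartoon": "A", "score": 9.5}, {"ssid": "2", "cartoon": "B", "score": 8.0}]'

rows = [json_tuple(part, "ssid", "cartoon", "score") for part in split_json_array(sample)]
print(rows)
```

As in the Hive expression, every "[" and "]" is removed, including any that appear inside field values, and a ";" inside a value would cut a record in two.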

1.2.2 Create the cartoon_comments table

CREATE EXTERNAL TABLE Json2(
 data string
);

Load the data into the Json2 staging table

load data local inpath 'mydata/comments_total.json' overwrite into table Json2;

Create the cartoon_comments table and import the data

drop table if exists cartoon_comments;
CREATE EXTERNAL TABLE cartoon_comments(
`mid` string,
`uname` string,
`ssid` string,
`message` string,
`like` int,
`dt` timestamp
)
stored as parquet
location '/warehouse/cartoon_comments';

Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)

insert overwrite table cartoon_comments

select json_tuple(json,'mid' ,'uname' ,'ssid' ,'message' ,'like' ,'dt' ) from (select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\;\\;\\{' )  ,'\\;\\;\\;')) as json from Json2)a;
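The comments query uses ';;;' as the temporary separator where the cartoon_info query uses a single ';'. The source does not say why; a plausible reason is that free-text messages can contain ';' themselves. The sketch below illustrates that effect in Python under this assumption. The helper split_objects and the sample comments are hypothetical.

```python
import json
import re


def split_objects(data, delimiter):
    """Strip brackets, replace '}, {' with '}<delimiter>{', then split."""
    no_brackets = re.sub(r"\[|\]", "", data)
    marked = no_brackets.replace("}, {", "}" + delimiter + "{")
    return marked.split(delimiter)


# Made-up comments; the first message contains a semicolon.
sample = '[{"uname": "u1", "message": "good; really"}, {"uname": "u2", "message": "ok"}]'

single = split_objects(sample, ";")
triple = split_objects(sample, ";;;")

print(len(single))  # the first message is cut at its own ';'
print(len(triple))  # one piece per comment

# Only the ';;;' pieces are still valid JSON objects.
print([json.loads(piece)["uname"] for piece in triple])
```

A longer separator only makes collisions less likely; a message that itself contains ';;;' would still be split.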

2, IT industry recruitment data analysis and visualization

1.1 Data capture

ITJobSpider

1, You need to log in to Lagou (拉勾网). Replace the Cookie with your own, and make sure it contains no Chinese characters, otherwise an error will be raised. If the Cookie does not take effect, open another Lagou page to obtain a new one.

2, If an error is reported, open Lagou in a browser and check whether it is asking for verification.

Upload the scraped files to the ${HIVE_HOME}/mydata directory.

2.1 Hive table creation and import

Hive table field information

CREATE EXTERNAL TABLE Json3(
 data string
);

Load the data into the Json3 staging table

load data local inpath 'mydata/jobsInfo.json' overwrite into table Json3;

2.1.1 Create the jobs_info table and import the data

drop table if exists jobs_info;
CREATE EXTERNAL TABLE jobs_info(
`job` string,
`keyword` string,
`place` string,
`requirement` string,
`salary` string,
`tags` string,
`welfare` string,
`pubtime` date
)
stored as parquet
location '/warehouse/jobs_info';

Parse the JSON and insert the data. For details, see: Hive and JSON parsing (plain JSON and JSON arrays)

insert overwrite table jobs_info

select json_tuple(json,'job' ,'keyword' ,'place' ,'requirement' ,'salary' ,'tags' ,'welfare' ,'pubtime') from (
select explode(split(regexp_replace(regexp_replace(data,'\\[|\\]',''),'\\}\\, \\{','\\}\\;\\{' )  ,'\\;'))  as json from Json3
)a;

3, Data analysis and visualization

3.1 Tutorial: connecting to Hive with PyHive

Installing sasl, thrift, and thrift-sasl for Python and connecting with PyHive

Connection code: PyHive
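The linked connection code is not reproduced here, so the following is only a minimal sketch of what a PyHive connection might look like. The host, port, username, and database values are placeholders, and the helper names connect_hive and fetch_rows are hypothetical.

```python
def connect_hive(host="localhost", port=10000, username="hive", database="default"):
    """Open a PyHive connection. All default values here are placeholders."""
    # Imported lazily so fetch_rows() can be used without PyHive installed.
    from pyhive import hive

    return hive.Connection(host=host, port=port, username=username, database=database)


def fetch_rows(cursor, sql):
    """Run a query on a DB-API cursor and return each row as a dict."""
    cursor.execute(sql)
    # Hive may report columns as 'table.column'; keep only the column part.
    columns = [col[0].split(".")[-1] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

With a reachable HiveServer2, a call such as fetch_rows(connect_hive().cursor(), "SELECT cartoon, score FROM cartoon_info LIMIT 5") would return a list of dicts that can be passed to pandas.DataFrame.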

3.2 Data analysis and Visualization

Install the necessary packages

pip install pandas==0.23.4
pip install pyecharts==1.9.1
pip install matplotlib==3.5.1
pip install numpy==1.18.5
pip install jieba==0.42.1
pip install squarify==0.4.3

1, Animation scoring data analysis and visualization. Data analysis code: bilibili

The code covers 10 common chart types ["Rose chart", "Word cloud", "Radar chart", "Scatter plot", "Funnel chart", "Donut chart", "Bar chart", "Treemap", "Lollipop chart", "Subplots"]: 4 matplotlib figures and 6 pyecharts charts, each with a brief analysis.
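The linked analysis code is not shown in this post. As one illustration of the chart types listed, here is a minimal sketch of a rose chart built with pyecharts, assuming the score column has already been fetched into a Python list. The banding scheme, the function names, and the sample scores are all hypothetical.

```python
def score_bands(scores):
    """Count how many scores fall in each one-point band, e.g. '9.0-9.9'."""
    counts = {}
    for score in scores:
        low = int(score)  # 9.7 -> 9
        label = "{}.0-{}.9".format(low, low)
        counts[label] = counts.get(label, 0) + 1
    # pyecharts takes a sequence of (name, value) pairs; sort for a stable order.
    return sorted(counts.items())


def render_rose(data_pair, path="score_rose.html"):
    """Render the pairs as a rose (Nightingale) chart with pyecharts."""
    # Imported lazily so score_bands() works without pyecharts installed.
    from pyecharts import options as opts
    from pyecharts.charts import Pie

    chart = (
        Pie()
        .add("", data_pair, radius=["30%", "75%"], rosetype="radius")
        .set_global_opts(title_opts=opts.TitleOpts(title="Score distribution"))
    )
    return chart.render(path)


# Made-up scores for illustration only, not results from the scraped data.
pairs = score_bands([9.7, 9.1, 8.4, 7.9, 9.9])
print(pairs)
```

Calling render_rose(pairs) writes an HTML file that can be opened in a browser; the rosetype argument is what turns an ordinary pie into a rose chart.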

2, IT industry recruitment data analysis and visualization. Data analysis code: IT

The code covers 10 common chart types ["Rose chart", "Word cloud", "Pictogram", "Scatter plot", "Funnel chart", "Donut chart", "Bar chart", "Treemap", "Lollipop chart", "Subplots"]: 4 matplotlib figures and 6 pyecharts charts, each with a brief analysis.
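The salary column in jobs_info is stored as a string, so it has to be converted to numbers before it can be charted. The post does not show sample values; the sketch below assumes a range format such as "15k-25k" and returns None for anything else. The function names and the examples are hypothetical.

```python
import re


def parse_salary(text):
    """Return (low, high) in thousands from a string such as '15k-25k', else None."""
    match = re.match(r"^\s*(\d+)[kK]\s*-\s*(\d+)[kK]", text)
    if match is None:
        return None  # unexpected format; let the caller decide what to do
    return int(match.group(1)), int(match.group(2))


def midpoint(text):
    """Midpoint of the salary range in thousands, or None if unparseable."""
    bounds = parse_salary(text)
    if bounds is None:
        return None
    return (bounds[0] + bounds[1]) / 2


# Made-up examples in the assumed format.
print(parse_salary("15k-25k"))
print(midpoint("15k-25k"))
print(parse_salary("negotiable"))
```

Returning None instead of raising keeps unparseable rows visible, so they can be counted or dropped explicitly before plotting.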

Original site

Copyright notice
This article was written by [Data ape from zero]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/186/202207050519334686.html