Tianchi - student test score forecast
2022-06-11 04:47:00 【Panbohhhhh】
These are personal study notes. My skill level is limited, so treat them as a reference only. I am sharing them in case they help.
1: Purpose
Practice calling libraries, and get familiar with data cleaning, data processing, and Python programming.
The general process is as follows (worth memorizing; it helps both with mastering the material and with interviews):
Make sure the dataset itself is usable, including but not limited to:
a) Check whether the data is balanced, and handle it if not
b) Check the data for missing values, and handle them
c) Check the data for obvious outliers, and handle them as the situation requires
Examine the nature of the dataset to determine a suitable machine learning model:
a) Supervised model vs. unsupervised model
b) Regression model vs. classification model
Through data visualization, build intuition about and understanding of the dataset
Through data visualization, get a rough idea of the relationship between the features and the result, to further narrow down a suitable machine learning model
Anticipate, and later verify, what the model will output
Do a preliminary screening of the features to use in the model
Prepare for the feature engineering step
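The dataset checks in the first step can be sketched roughly as follows. This is only an illustration on a small made-up DataFrame: the columns `score` and `label` and the helper `basic_checks` are hypothetical, and the 1.5 × IQR rule for outliers is one common convention among several, not something the course prescribes.

```python
import pandas as pd

def basic_checks(df, target, numeric_col):
    """Return class balance, missing-value counts, and IQR outliers."""
    # Class balance: share of each target value (missing labels are ignored)
    balance = df[target].value_counts(normalize=True)
    # Missing values per column
    missing = df.isnull().sum()
    # Outliers by the 1.5 * IQR rule (an assumed convention)
    q1 = df[numeric_col].quantile(0.25)
    q3 = df[numeric_col].quantile(0.75)
    iqr = q3 - q1
    mask = (df[numeric_col] < q1 - 1.5 * iqr) | (df[numeric_col] > q3 + 1.5 * iqr)
    return balance, missing, df[mask]

# Made-up data: one extreme score and one missing label
toy = pd.DataFrame({
    "score": [10, 11, 12, 13, 12, 11, 10, 95],
    "label": ["pass", "pass", "fail", "pass", "pass", "fail", "pass", None],
})
balance, missing, outliers = basic_checks(toy, "label", "score")
print(balance)
print(missing)
print(outliers)
```

On the real data, the same helper could be pointed at a grade column and whichever target is chosen for modelling.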
2: Data
The data is a CSV file. I will upload it to the download section for anyone who needs it.

3: Start
# Import libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.read_csv('student-por.csv')
After importing the libraries, do a preliminary inspection of the data:
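# First 10 rows
print(df.head(10))
# Number of rows and columns
print(df.shape)
# Missing values per column
print(df.isnull().sum())
# Summary statistics for all columns, including non-numeric ones
print(df.describe(include = 'all'))
# Column types and non-null counts (info() prints its own output)
df.info()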
The key point is that this dataset is relatively clean: it has no missing values.

The fields in the dataset are described below:
| Field name | Meaning | Type | Description |
|---|---|---|---|
| sex | Gender | string | F = female, M = male |
| address | Home address | string | U = urban, R = rural |
| famsize | Family size | string | LE3 = three or fewer people, GT3 = more than three people |
| pstatus | Whether the student lives with their parents | string | T = together, A = apart |
| medu | Mother's education level | string | 0 to 4, increasing |
| fedu | Father's education level | string | 0 to 4, increasing |
| mjob | Mother's job | string | Categories include teaching, health care, and services |
| fjob | Father's job | string | Categories include teaching, health care, and services |
| guardian | The student's guardian | string | mother, father or other |
| traveltime | Travel time from home to school | double | In minutes |
| studytime | Weekly study time | double | In hours |
| failures | Number of failed courses | double | Number of failed courses |
| schoolsup | Whether the student receives extra learning support | string | yes or no |
| famsup | Whether the student has a tutor | string | yes or no |
| paid | Whether the student gets extra help in the exam subject | string | yes or no |
| activities | Whether the student attends extracurricular activities | string | yes or no |
| higher | Whether the student wants to pursue higher education | string | yes or no |
| internet | Whether the home has Internet access | string | yes or no |
| famrel | Quality of family relationships | double | 1 to 5, from bad to good |
| freetime | Amount of free time | double | 1 to 5, from little to much |
| goout | How often the student goes out with friends | double | 1 to 5, from rarely to often |
| dalc | Daily alcohol consumption | double | 1 to 5, from low to high |
| walc | Weekly alcohol consumption | double | 1 to 5, from low to high |
| health | Health status | double | 1 to 5, from bad to good |
| absences | Number of absences | double | 0 to 93 |
| G1,G2,G3 | Grades (G3 is the final grade) | double | Out of 20 |
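Because the table mixes text fields and numeric fields, it can help to separate the two groups before encoding anything. A small sketch, assuming a toy DataFrame with three of the columns above and made-up values; `text_cols` and `num_cols` are illustrative names.

```python
import pandas as pd

# Made-up rows using three of the fields from the table
toy = pd.DataFrame({
    "sex": ["F", "M", "F"],
    "schoolsup": ["yes", "no", "no"],
    "absences": [4, 0, 2],
})

# Text columns need encoding before most models can use them
text_cols = toy.select_dtypes(exclude="number").columns.tolist()
# Numeric columns can be used as they are
num_cols = toy.select_dtypes(include="number").columns.tolist()

print(text_cols)
print(num_cols)
```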
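1: Encode gender
# Plot the counts first, while the column still holds 'M' and 'F'
sns.countplot(x = 'sex', order = ['M','F'], data = df )
# The result must be assigned back, otherwise the column stays unchanged
df['sex'] = df['sex'].map({'M': 0, 'F': 1})
This converts M (male) to 0 and F (female) to 1.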
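One detail worth checking here: `Series.replace` returns a new Series instead of changing the column, so its result has to be assigned back. A minimal sketch on made-up values, using `map` with a dictionary as one alternative (the names `toy`, `unchanged` and `sex_code` are illustrative):

```python
import pandas as pd

toy = pd.DataFrame({"sex": ["M", "F", "F", "M"]})

# Without assignment, the column is unchanged
toy["sex"].replace("M", "0")
unchanged = toy["sex"].tolist()

# map with a dict encodes both values in one step
toy["sex_code"] = toy["sex"].map({"M": 0, "F": 1})

print(unchanged)
print(toy["sex_code"].tolist())
```

Note that `map` turns any value missing from the dictionary into NaN, which is a convenient way to spot unexpected categories.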

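2: Encode address
# Plot the counts first, while the column still holds 'U' and 'R'
sns.countplot(x = 'address', order = ['U','R'], data = df )
# U (urban) becomes 1, R (rural) becomes 0; assign the result back
df['address'] = df['address'].map({'U': 1, 'R': 0})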
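The same idea extends to the yes/no fields in the table. A sketch, assuming a toy DataFrame with made-up values and a hypothetical helper `encode_yes_no`; columns that contain anything other than yes/no are left untouched.

```python
import pandas as pd

def encode_yes_no(df):
    """Return a copy with every pure yes/no column mapped to 1/0."""
    out = df.copy()
    for col in out.columns:
        values = set(out[col].dropna().unique())
        # Only touch columns whose values are all "yes" or "no"
        if values and values <= {"yes", "no"}:
            out[col] = out[col].map({"yes": 1, "no": 0})
    return out

# Made-up rows: two yes/no fields and one field that should not change
toy = pd.DataFrame({
    "higher": ["yes", "yes", "no"],
    "internet": ["no", "yes", "yes"],
    "address": ["U", "R", "U"],
})
encoded = encode_yes_no(toy)
print(encoded)
```

Working on a copy keeps the original DataFrame available for plotting with the original labels.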