Hands-on data analysis: data modeling and model evaluation
2022-06-25 01:22:00 【includeSteven】
Data modeling and evaluation
Introduction
After data processing and preliminary visual analysis, we can use the data to extract the information we want. The first step is to build a model; after that, we need to evaluate whether the model is reliable.
Data modeling
The modeling library used here is sklearn, which contains many machine learning algorithms. A path for choosing a suitable model algorithm is shown in the following figure:
[Figure: model algorithm selection path]
Splitting the data set
First, the data set should be split into a training set and a test set. Here we use the sklearn.model_selection.train_test_split function. In Jupyter, you can run train_test_split? to view its documentation.
Note that train_test_split shuffles the data and splits it randomly by default. Whether a random split is appropriate has to be judged from the actual situation.
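The split described above can be sketched as follows. This is a minimal example on made-up data, not the data set used in this post: the arrays X and y, the 20% test size, and the random_state value are all assumptions for illustration. It uses the stratify parameter, which is one way to keep the class ratio the same in both parts when a purely random split is not suitable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 100 samples, 4 features, imbalanced binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.array([0] * 70 + [1] * 30)

# stratify=y keeps the class ratio the same in the training and test sets;
# random_state makes the random split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

print(X_train.shape, X_test.shape)
print(y_train.mean(), y_test.mean())
```

For data with a time order, a random split may leak future information into the training set; in that case passing shuffle=False is worth considering.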
Model creation
In sklearn, all estimators inherit from a common estimator base class (sklearn.base.BaseEstimator). A model is built with the fit method, and predict is used to predict results.
For classification, you can use logistic regression or a random forest, which correspond to the following two classes:
- sklearn.linear_model.LogisticRegression
- sklearn.ensemble.RandomForestClassifier
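A minimal sketch of building both models is shown below. The data comes from make_classification, which is only a stand-in for the real data set, and the hyperparameters (max_iter, n_estimators, random_state) are illustrative assumptions rather than tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Both classes share the same estimator interface: fit() builds the model.
lr = LogisticRegression(max_iter=1000).fit(X_train, y_train)
rfc = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# score() returns the mean accuracy on the given data.
lr_score = lr.score(X_test, y_test)
rfc_score = rfc.score(X_test, y_test)
print("logistic regression accuracy:", lr_score)
print("random forest accuracy:", rfc_score)
```

Because the two classes expose the same fit and score methods, swapping one model for the other only changes the line that creates it.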
Model prediction
After building the model, call its predict method with the feature values X to get the predicted label y for each sample.
You can also use predict_proba to get the predicted probability of each label.
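The difference between the two methods can be seen in a small sketch. The synthetic data and the choice of LogisticRegression are assumptions made for illustration; any sklearn classifier that implements predict_proba behaves the same way.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

# predict returns one label per input row.
labels = model.predict(X[:5])

# predict_proba returns one row per sample and one column per class;
# the column order follows model.classes_.
proba = model.predict_proba(X[:5])

print(model.classes_)
print(labels)
print(np.round(proba, 3))
```

Each row of the probability array sums to 1, and the label returned by predict is the class with the highest probability in that row.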
Evaluation of the model
Cross validation
sklearn.model_selection.cross_val_score(estimator, X_train, y_train, cv=10): returns an array containing the score of each of the 10 cross-validation folds.
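A minimal sketch of the call above follows. The training data is synthetic and the estimator is an assumed LogisticRegression; for a classifier, cross_val_score reports accuracy by default.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic training data (assumption, for illustration only).
X_train, y_train = make_classification(n_samples=200, n_features=6, random_state=0)

# cv=10 splits the training data into 10 folds; each fold is used once
# for validation while the model is fitted on the other 9.
scores = cross_val_score(LogisticRegression(max_iter=1000), X_train, y_train, cv=10)

print(scores)         # one score per fold
print(scores.mean())  # average score over the 10 folds
```

The mean of the fold scores is usually a steadier estimate of model quality than the score from a single train/test split.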
Confusion matrix and derived metrics (precision, recall, f1-score)
- sklearn.metrics.confusion_matrix
- sklearn.metrics.classification_report
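The two functions above can be tried on a small hand-made example. The label lists y_true and y_pred are invented for illustration; in practice y_pred would come from a model's predict method.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hypothetical true labels and predictions for 8 samples.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred)
print(cm)

# For binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(precision, recall)

# classification_report returns precision, recall and f1-score per class as text.
report = classification_report(y_true, y_pred)
print(report)
```

Here 3 of the 4 positive predictions are correct and 3 of the 4 positive samples are found, so precision and recall for class 1 are both 0.75.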
Drawing the ROC curve
sklearn.metrics.roc_curve returns three arrays: the false positive rates, the true positive rates, and the thresholds at which they were computed.
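A minimal sketch of computing the points of the curve is given below. The four labels and scores are invented for illustration; with a fitted classifier, the scores would typically be the positive-class column of predict_proba. The area under the curve is computed with sklearn.metrics.auc as an extra summary number.

```python
from sklearn.metrics import auc, roc_curve

# Hypothetical true labels and predicted scores for the positive class.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

# Each threshold yields one (false positive rate, true positive rate) point.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the ROC curve, computed with the trapezoidal rule.
roc_auc = auc(fpr, tpr)

for f, t in zip(fpr, tpr):
    print(f, t)
print("AUC:", roc_auc)
```

Plotting fpr on the x-axis against tpr on the y-axis, for example with matplotlib, draws the ROC curve; the closer the curve is to the top-left corner, the better the model separates the two classes.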