Learning notes of statistical learning methods -- Chapter 1 Introduction to statistical learning methods
2022-07-05 21:25:00 【Raymond。】
- 1.1 Statistical learning
- 1.1.1 Basic steps of statistical learning
- 1.1.2 Statistical learning classification
- 1.1.3 Three elements of statistical learning method
- 1.1.4 Model evaluation and model selection
- 1.1.5 Regularization and cross-validation: preventing overfitting
- 1.1.6 Generalization ability
- 1.1.7 Generative and discriminative models
- 1.2 Supervised learning
1.1 Statistical learning
1.1.1 Basic steps of statistical learning
Steps for carrying out a statistical learning method:
- Obtain a finite set of training data
- Determine the hypothesis space containing all candidate models, that is, the set of models to learn from
- Determine the criterion for model selection, that is, the learning strategy
- Implement the algorithm for solving for the optimal model, that is, the learning algorithm
- Select the optimal model using the learning method
- Use the learned optimal model to predict and analyze new data
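The six steps can be sketched end to end on a toy problem. This is only an illustration under stated assumptions: the data values, the one-parameter model f(x) = w * x, the grid of candidate weights, and the squared loss are all made up for the example and do not come from the notes.

```python
# Step 1: a finite training set of (x, y) pairs (made-up values, roughly y = 2x).
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]

# Step 2: the hypothesis space, here the lines f(x) = w * x for w on a coarse grid.
hypothesis_space = [w / 10 for w in range(0, 41)]  # w = 0.0, 0.1, ..., 4.0


# Step 3: the strategy, here the average squared loss over the training set.
def empirical_risk(w, data):
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


# Step 4: the algorithm, here an exhaustive search over the hypothesis space.
def fit(data, candidates):
    return min(candidates, key=lambda w: empirical_risk(w, data))


# Step 5: select the optimal model.
best_w = fit(train, hypothesis_space)


# Step 6: use the selected model to predict on new data.
def predict(x):
    return best_w * x


print(best_w, predict(5.0))
```

Exhaustive search only works because the hypothesis space here is tiny; real learning methods replace step 4 with an optimization algorithm suited to the model.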
1.1.2 Statistical learning classification
Statistical learning includes supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. These notes focus on supervised learning.
1.1.3 Three elements of statistical learning method
Statistical learning method = model + strategy + algorithm
Model
The set of all possible mappings from input variables to output variables.
Strategy
The criterion for selecting the optimal model.
How to measure the quality of a model: the loss function L(Y, f(X)) (measures the quality of a single prediction) and the risk function (measures how good the model's predictions are on average).
Algorithm
The computational method for solving for the optimal model.
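The two quantities named under Strategy can be written out explicitly. The symbols R_exp, R_emp, N, and the training sample (x_i, y_i) do not appear in the notes and are introduced here for illustration. In this notation, the risk function is the expected loss under the joint distribution P(X,Y), and its counterpart on the training set is the empirical risk:

```latex
R_{\mathrm{exp}}(f) = E_P\left[ L(Y, f(X)) \right] = \int L(y, f(x)) \, P(x, y) \, dx \, dy

R_{\mathrm{emp}}(f) = \frac{1}{N} \sum_{i=1}^{N} L(y_i, f(x_i))
```

Because P(X,Y) is unknown, the expected risk cannot be computed directly, so the empirical risk on the training data is used as an estimate of it.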
1.1.4 Model evaluation and model selection
Training error and test error
Overfitting
The training error is small but the test error is large (the model has also fitted the noise in the training data).
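A small sketch of what overfitting looks like in terms of training and test error. Everything here is an assumption made for illustration: the data are made up (the true rule is "label 1 when x > 5.5", with one deliberately mislabeled training point at x = 3), and the two models, a 1-nearest-neighbor memorizer and a single-threshold classifier, are simply convenient stand-ins for a complex and a simple model.

```python
# Training data: label is 1 when x > 5.5, except the noisy point (3, 1).
train = [(1, 0), (2, 0), (3, 1), (4, 0), (5, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
# Test data drawn from the same rule, without noise.
test = [(2.6, 0), (3.4, 0), (5.2, 0), (6.4, 1)]


def error_rate(model, data):
    """Fraction of points in data that the model labels incorrectly."""
    return sum(model(x) != y for x, y in data) / len(data)


def nearest_neighbor(x):
    """Complex model: return the label of the closest training point."""
    _, label = min(train, key=lambda point: abs(point[0] - x))
    return label


def fit_threshold(data):
    """Simple model: pick the threshold t with the lowest training error."""
    def make(t):
        return lambda x: int(x >= t)

    candidates = [t + 0.5 for t in range(10)]  # 0.5, 1.5, ..., 9.5
    best = min(candidates, key=lambda t: error_rate(make(t), data))
    return best, make(best)


best_t, threshold_model = fit_threshold(train)

for name, model in [("1-NN", nearest_neighbor), ("threshold", threshold_model)]:
    print(name, "train:", error_rate(model, train), "test:", error_rate(model, test))
```

The memorizer reproduces every training label, including the noisy one, so its training error is zero while its test error is not; the threshold model accepts one training mistake and generalizes better on this toy data.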
1.1.5 Regularization and cross-validation: preventing overfitting
Regularization
Add a regularization term (penalty term) to the empirical risk; the term measures the complexity of the model.
Cross-validation
When enough data is available, divide it into a training set, a validation set (used for model selection), and a test set.
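Both ideas can be combined in a short sketch. The assumptions are loud ones: the model f(x) = w * x, the squared loss, the L2 penalty lam * w**2, the candidate values of lam, and the sample data are all chosen for illustration, and the split shown is the simple hold-out split described above rather than k-fold cross-validation.

```python
def mse(w, data):
    """Empirical risk of f(x) = w * x under squared loss."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


def fit_ridge(data, lam):
    """Minimise mse(w, data) + lam * w**2 in closed form.

    Setting the derivative with respect to w to zero gives
    w = sum(x*y) / (sum(x*x) + lam * N).
    """
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + lam * len(data))


def split(data, n_train, n_val):
    """Hold-out split into training, validation, and test sets."""
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]


def select_lambda(train, val, lambdas):
    """Model selection: keep the lam whose fitted model does best on validation."""
    return min(lambdas, key=lambda lam: mse(fit_ridge(train, lam), val))


# Made-up samples, roughly y = 2x.
samples = [(1.0, 2.3), (2.0, 4.4), (3.0, 6.5), (4.0, 8.1),
           (5.0, 9.8), (6.0, 12.1), (7.0, 13.9), (8.0, 16.2)]

train, val, test = split(samples, 4, 2)
best_lam = select_lambda(train, val, [0.0, 0.1, 1.0])
w = fit_ridge(train, best_lam)
print("lam:", best_lam, "w:", w, "test error:", mse(w, test))
```

The test set is touched only once, after lam has been chosen on the validation set, so the reported test error is not biased by the selection step.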
1.1.6 Generalization ability
1.1.7 Generative and discriminative models
- Generative models
Learn the joint distribution P(X,Y) from the data, then derive the conditional probability distribution P(Y|X).
Common generative models: naive Bayes and the hidden Markov model.
- Discriminative models
Learn the decision function f(X) or the conditional probability distribution P(Y|X) directly from the data.
Common discriminative models: k-nearest neighbors, the perceptron, decision trees, logistic regression, the maximum entropy model, support vector machines, boosting, and conditional random fields.
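The generative route, estimating P(X,Y) first and then deriving P(Y|X) = P(X,Y) / P(X), can be sketched with plain counting. The weather data and the function names below are hypothetical and exist only for this example; a real generative model such as naive Bayes adds further modeling assumptions on top of this idea.

```python
from collections import Counter
from fractions import Fraction

# Made-up (x, y) observations: x is the weather, y is the decision.
samples = [
    ("sunny", "play"), ("sunny", "play"), ("sunny", "stay"),
    ("rainy", "stay"), ("rainy", "stay"), ("rainy", "play"),
]


def joint_distribution(data):
    """Estimate P(X, Y) by relative frequency."""
    counts = Counter(data)
    n = len(data)
    return {pair: Fraction(c, n) for pair, c in counts.items()}


def conditional(joint, x):
    """Derive P(Y | X = x) from the joint: P(x, y) / P(x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}


joint = joint_distribution(samples)
print(conditional(joint, "sunny"))
```

A discriminative method would skip `joint_distribution` entirely and fit P(Y|X) or a decision function f(X) directly.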
1.2 Supervised learning
1.2.1 Basic concepts
- Input space, feature space, and output space
The input (output) space is the set of all possible input (output) values. Each specific input is an instance, usually represented by a feature vector; the space containing all feature vectors is the feature space.
- Classification of prediction problems
A prediction problem in which both the input and output variables are continuous is called a regression problem.
A prediction problem in which the output variable takes finitely many discrete values is called a classification problem.
A prediction problem in which both the input and output variables are sequences of variables is called a tagging problem.
- Hypothesis space
The purpose of supervised learning is to learn a mapping from input to output. The mapping is represented by a model, and the set of all such mappings is called the hypothesis space.
- Classification of supervised learning models
Models can be divided into probabilistic models (represented by the conditional probability distribution P(Y|X)) and non-probabilistic models (represented by the decision function Y=f(X)). Which form is used depends on the specific learning method.
1.2.2 Formalization of problems
- The process
Supervised learning consists of the learning process (carried out by the learning system) and the prediction process (carried out by the prediction system).
1.2.3 Application of supervised learning
Classification problem
The model is a classifier.
AEC Dimension
Regression problem