Chapter 6 Boosting
2022-07-28 13:18:00 【Sang zhiweiluo 0208】
1 Characteristics of random forests
Each decision tree in a random forest is built from its own independently drawn sample of the data, so the trees are relatively independent of one another.
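To make this concrete, here is a minimal sketch (my illustration, not part of the original text) of bootstrap sampling, the per-tree sampling that keeps the trees relatively independent:

```python
import numpy as np

def bootstrap_sample(X, y, rng):
    """Draw one bootstrap sample: n rows sampled with replacement."""
    n = len(X)
    idx = rng.integers(0, n, size=n)   # indices drawn with replacement
    return X[idx], y[idx]

rng = np.random.default_rng(0)
X = np.arange(10).reshape(-1, 1)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
# Each tree in the forest would be trained on its own independent draw:
X_b, y_b = bootstrap_sample(X, y, rng)
```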
Note: there are two routes from weak classifiers to a strong classifier: sample weighting and classifier weighting.
Sample weighting: after classifying the samples, increase the weights of the samples that were misclassified, so that later classifiers focus on them.
Classifier weighting: weak classifiers with a low misclassification rate receive a higher weight in the final result.
Here "weight" refers to a classifier's contribution to the final predicted value; see the sketch below.
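A minimal sketch (my illustration) of classifier weighting: each weak classifier casts a vote scaled by its weight, and the final prediction is the sign of the weighted sum. The stumps and weights below are made up for demonstration.

```python
import numpy as np

def weighted_vote(classifiers, weights, x):
    """Classifier weighting: low-error classifiers get larger weights,
    so they contribute more to the final predicted value."""
    score = sum(w * g(x) for g, w in zip(classifiers, weights))
    return np.sign(score)

# Two hypothetical stumps and made-up weights:
g1 = lambda x: 1 if x < 2.5 else -1
g2 = lambda x: 1 if x < 8.5 else -1
print(weighted_vote([g1, g2], [0.42, 0.65], 7))   # -> 1.0
```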
2 Boosting
2.1 Boosting
Boosting —— Boosting is a machine learning technique that can be used for regression and classification problems. At each step it produces a weak prediction model (such as a decision tree) and adds it, with a weight, into the overall model.
Gradient boosting —— If the weak prediction model at each step is generated along the negative gradient direction of the loss function, the method is called gradient boosting.
Theoretical significance of boosting —— If a weak classifier exists for a problem, then a strong classifier can be obtained from it.
2.2 The gradient boosting algorithm
The gradient boosting algorithm starts from a given target loss function (chosen according to the actual problem; it is independent of the boosting procedure itself), whose domain is the set of all feasible weak functions (base functions). Through iteration, the algorithm selects a weak function along the negative gradient direction, gradually approaching a local minimum. This view of gradient boosting over a function domain has had a profound influence on many areas of machine learning.
2.3 The boosting algorithm
Given a number of training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ with input vector $x$ and output variable $y$, the goal is to find an approximate function $\hat{F}(\vec{x})$ that makes the loss $L(y, F(\vec{x}))$ as small as possible.
Typical definitions of the loss function $L(y, F(\vec{x}))$ are:
$$L(y, F(\vec{x})) = \frac{1}{2}\big(y - F(\vec{x})\big)^2 \quad \text{or} \quad L(y, F(\vec{x})) = \big|y - F(\vec{x})\big|$$
Suppose the optimal function is $F^{*}(\vec{x})$, namely:
$$F^{*}(\vec{x}) = \underset{F}{\arg\min}\; E_{(x,y)}\big[L(y, F(\vec{x}))\big]$$
Assume $F(\vec{x})$ is a weighted sum of a family of base functions $f_i(\vec{x})$:
$$F(\vec{x}) = \sum_{i=1}^{M} \gamma_i f_i(\vec{x}) + \text{const}$$
Proof that the median is the optimal solution under the absolute-loss function:
Given the samples $x_1, x_2, \ldots, x_n$, compute
$$F(c) = \sum_{i=1}^{n} |x_i - c|$$
Take the partial derivative with respect to $c$ and set it equal to 0:
$$\frac{\partial F}{\partial c} = \sum_{i=1}^{n} \operatorname{sgn}(c - x_i) = 0$$
This requires that the number of samples before $c$ (the first $k$) equal the number of samples after it (the remaining $n - k$), i.e. $c$ is the median.
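A quick numeric check of this claim (my addition): scan candidate constants $c$ and confirm the absolute loss is minimized at the median.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 7.0, 50.0])   # an arbitrary sample (odd n)
cs = np.linspace(0.0, 60.0, 6001)          # candidate constants c
loss = np.abs(x[None, :] - cs[:, None]).sum(axis=1)   # F(c) = sum |x_i - c|
print(cs[np.argmin(loss)], np.median(x))   # both print 3.0
```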
Derivation of the boosting algorithm: start from the best constant model and greedily add one base function at a time:
$$F_0(\vec{x}) = \underset{\gamma}{\arg\min} \sum_{i=1}^{n} L(y_i, \gamma)$$
$$F_m(\vec{x}) = F_{m-1}(\vec{x}) + \underset{f \in H}{\arg\min} \sum_{i=1}^{n} L\big(y_i,\, F_{m-1}(\vec{x}_i) + f(\vec{x}_i)\big)$$
Gradient approximation: because choosing the best $f$ at each step is intractable in general, apply steepest descent in function space:
$$F_m(\vec{x}) = F_{m-1}(\vec{x}) - \gamma_m \sum_{i=1}^{n} \nabla_F L\big(y_i,\, F_{m-1}(\vec{x}_i)\big)$$
with the step size $\gamma_m$ chosen by line search:
$$\gamma_m = \underset{\gamma}{\arg\min} \sum_{i=1}^{n} L\Big(y_i,\, F_{m-1}(\vec{x}_i) - \gamma\, \nabla_F L\big(y_i, F_{m-1}(\vec{x}_i)\big)\Big)$$
The boosting algorithm:
1. Initialize $F_0(\vec{x}) = \underset{\gamma}{\arg\min} \sum_{i=1}^{n} L(y_i, \gamma)$.
2. For $m = 1$ to $M$:
(a) compute the pseudo-residuals $r_{im} = -\Big[\frac{\partial L(y_i, F(\vec{x}_i))}{\partial F(\vec{x}_i)}\Big]_{F = F_{m-1}}$;
(b) fit a base learner $f_m(\vec{x})$ to the pseudo-residuals;
(c) choose the step size $\gamma_m$ by line search;
(d) update $F_m(\vec{x}) = F_{m-1}(\vec{x}) + \gamma_m f_m(\vec{x})$.
3. Output $F_M(\vec{x})$.
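A compact sketch of the algorithm above (my own illustration, not the original's): it assumes squared loss, so the negative gradient is simply the residual $y - F$, uses depth-1 regression stumps as base learners, and replaces the line search with a fixed learning rate.

```python
import numpy as np

def fit_stump(x, r):
    """Fit a depth-1 regression stump to pseudo-residuals r: choose the
    threshold minimizing squared error, predict the mean of r on each side."""
    best = None
    for v in (x[:-1] + x[1:]) / 2.0:            # candidate thresholds
        left, right = r[x < v], r[x >= v]
        sse = ((left - left.mean())**2).sum() + ((right - right.mean())**2).sum()
        if best is None or sse < best[0]:
            best = (sse, v, left.mean(), right.mean())
    _, v, cl, cr = best
    return lambda t: np.where(t < v, cl, cr)

def gradient_boost(x, y, M=50, lr=0.1):
    """Gradient boosting for squared loss."""
    F = np.full_like(y, y.mean(), dtype=float)  # step 1: F0 = constant minimizer (mean)
    learners = []
    for _ in range(M):
        r = y - F                               # step 2(a): pseudo-residuals
        f = fit_stump(x, r)                     # step 2(b): fit the base learner
        F = F + lr * f(x)                       # step 2(d): update; fixed lr stands in for line search
        learners.append(f)
    return lambda t: y.mean() + lr * sum(f(t) for f in learners)

x = np.linspace(0.0, 9.0, 100)
y = np.sin(x)
model = gradient_boost(x, y)
print(np.abs(model(x) - y).mean())              # small mean training error
```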
3 Gradient boosting decision tree (GBDT)
3.1 Definition
3.2 Summary
4 Objective function
4.1 Second-order derivative information
At step $t$, the objective is
$$Obj^{(t)} = \sum_{i=1}^{n} L\big(y_i,\, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t)$$
Expanding the loss to second order (Taylor expansion) around $\hat{y}_i^{(t-1)}$:
$$Obj^{(t)} \approx \sum_{i=1}^{n} \Big[ L\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t)$$
where $g_i = \partial_{\hat{y}^{(t-1)}} L\big(y_i, \hat{y}^{(t-1)}\big)$ and $h_i = \partial^2_{\hat{y}^{(t-1)}} L\big(y_i, \hat{y}^{(t-1)}\big)$ are the first and second derivatives of the loss.
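As a small illustration (mine, not the original's), the derivative information $g_i, h_i$ for two common losses can be computed directly:

```python
import numpy as np

def squared_loss_g_h(y, y_pred):
    """L = (1/2)(y - y_pred)^2  ->  g = y_pred - y,  h = 1."""
    return y_pred - y, np.ones_like(y)

def logistic_loss_g_h(y, y_pred):
    """Binary logistic loss, labels y in {0, 1}, raw scores y_pred:
    g = sigmoid(y_pred) - y,  h = sigmoid(y_pred) * (1 - sigmoid(y_pred))."""
    p = 1.0 / (1.0 + np.exp(-y_pred))
    return p - y, p * (1.0 - p)

y = np.array([1.0, 0.0, 1.0])
y_pred = np.array([0.3, -0.2, 1.5])     # scores from the first t-1 trees
g, h = logistic_loss_g_h(y, y_pred)
# g and h are all the t-th tree needs from the loss in the second-order objective.
```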
4.2 Calculation of objective function
4.3 Simplification of objective function
5 AdaBoost
5.1 AdaBoost definition
Given a training data set
$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}, \quad y_i \in \{-1, +1\}$$
Initialize the weight distribution of the training data:
$$D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$$
5.2 The AdaBoost algorithm
For $m = 1, 2, \ldots, M$:
(a) Using the training data weighted by $D_m$, learn a basic classifier $G_m(x)$.
(b) Compute its weighted error rate: $e_m = \sum_{i=1}^{N} w_{mi}\, I\big(G_m(x_i) \neq y_i\big)$.
(c) Compute the coefficient of $G_m(x)$: $\alpha_m = \frac{1}{2} \ln \frac{1 - e_m}{e_m}$.
(d) Update the weight distribution: $w_{m+1,i} = \frac{w_{mi}}{Z_m} \exp\big(-\alpha_m y_i G_m(x_i)\big)$, where $Z_m = \sum_{i=1}^{N} w_{mi} \exp\big(-\alpha_m y_i G_m(x_i)\big)$ is a normalization factor.
Finally, build the combined classifier $f(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$ and output $G(x) = \operatorname{sign}\big(f(x)\big)$.
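The steps above as a minimal sketch in code (my own illustration): decision stumps serve as the basic classifiers $G_m$, with thresholds and both polarities searched exhaustively.

```python
import numpy as np

def best_stump(x, y, w):
    """Step (a): find the weighted-error-minimizing stump over all
    thresholds v and both polarities s (predict s for x < v, -s otherwise)."""
    best = (2.0, None, None)
    for v in np.arange(x.min() + 0.5, x.max() + 0.5):
        for s in (1, -1):
            pred = np.where(x < v, s, -s)
            e = w[pred != y].sum()            # step (b): weighted error rate
            if e < best[0]:
                best = (e, v, s)
    return best

def adaboost(x, y, M=3):
    N = len(x)
    w = np.full(N, 1.0 / N)                   # initialize D1 uniformly
    ensemble = []
    for m in range(M):
        e, v, s = best_stump(x, y, w)
        alpha = 0.5 * np.log((1 - e) / e)     # step (c): classifier coefficient
        pred = np.where(x < v, s, -s)
        w = w * np.exp(-alpha * y * pred)     # step (d): reweight the samples
        w /= w.sum()                          # normalize by Z_m
        ensemble.append((alpha, v, s))
    return ensemble

def predict(ensemble, x):
    """Final classifier G(x) = sign(sum_m alpha_m G_m(x))."""
    f = sum(a * np.where(x < v, s, -s) for a, v, s in ensemble)
    return np.sign(f)
```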
5.3 A worked example
m=1:
| Serial number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
On the training data with weight distribution $D_1$, the error rate is lowest when the threshold $v$ is 2.5, so the basic classifier is
$$G_1(x) = \begin{cases} 1, & x < 2.5 \\ -1, & x > 2.5 \end{cases}$$
The points $x = 6, 7, 8$ are misclassified, so the error rate is $e_1 = 3 \times 0.1 = 0.3$.
Plugging $e_1$ into the coefficient formula:
$$\alpha_1 = \frac{1}{2} \ln \frac{1 - e_1}{e_1} = \frac{1}{2} \ln \frac{0.7}{0.3} \approx 0.4236$$
The combined classifier so far is
$$f_1(x) = 0.4236\, G_1(x)$$
and $\operatorname{sign}\big(f_1(x)\big)$ still has 3 misclassified points on the training data set.
Update the sample weights:
$$D_2 = (0.0715,\, 0.0715,\, 0.0715,\, 0.0715,\, 0.0715,\, 0.0715,\, 0.1666,\, 0.1666,\, 0.1666,\, 0.0715)$$
The weights of the misclassified points $x = 6, 7, 8$ have increased. These weights are used for the next basic classifier, i.e. for $m = 2$.
m=2:
| x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| y | 1 | 1 | 1 | -1 | -1 | -1 | 1 | 1 | 1 | -1 |
| w | 0.0715 | 0.0715 | 0.0715 | 0.0715 | 0.0715 | 0.0715 | 0.1666 | 0.1666 | 0.1666 | 0.0715 |
On the training data with weight distribution $D_2$, the error rate is lowest when the threshold $v$ is 8.5, so the basic classifier is
$$G_2(x) = \begin{cases} 1, & x < 8.5 \\ -1, & x > 8.5 \end{cases}$$
Now the points $x = 3, 4, 5$ are misclassified, so the error rate is $e_2 = 3 \times 0.0715 \approx 0.2143$.
Plugging $e_2$ into the coefficient formula:
$$\alpha_2 = \frac{1}{2} \ln \frac{1 - e_2}{e_2} \approx 0.6496$$
The combined classifier is now
$$f_2(x) = 0.4236\, G_1(x) + 0.6496\, G_2(x)$$
and $\operatorname{sign}\big(f_2(x)\big)$ still has 3 misclassified points on the training data set.
Update the sample weights:
$$D_3 = (0.0455,\, 0.0455,\, 0.0455,\, 0.1667,\, 0.1667,\, 0.1667,\, 0.1060,\, 0.1060,\, 0.1060,\, 0.0455)$$
The weights of the misclassified points $x = 3, 4, 5$ have increased. These weights are used for the next basic classifier, i.e. for $m = 3$.
m = 3: and so on by analogy ……
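As a consistency check (assuming the `adaboost` and `predict` sketches from section 5.2 above), running on the example data reproduces these rounds:

```python
import numpy as np

x = np.arange(10)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])
ensemble = adaboost(x, y, M=3)
for m, (alpha, v, s) in enumerate(ensemble, 1):
    print(f"m={m}: v={v}, alpha={alpha:.4f}")
# m=1: v=2.5, alpha=0.4236
# m=2: v=8.5, alpha=0.6496
# m=3: v=5.5, alpha=0.7520  (exact arithmetic; the book-style rounded e3 gives 0.7514)
print((predict(ensemble, x) == y).all())   # True: all 10 points classified correctly
```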
5.4 Key points on the weights and the error rate
