当前位置：网站首页>The difference between bagging and boosting in machine learning

The difference between bagging and boosting in machine learning

2022-07-04 04:05:00 【Xiaobai learns vision】

Click on the above “ Xiaobai studies vision ”, Optional plus " Star standard " or “ Roof placement ”

 Heavy dry goods , First time delivery

Bagging and Boosting All of them combine the existing classification or regression algorithms in a certain way , Form a more powerful classifier , More precisely, it's a way to assemble classification algorithms . The method of assembling weak classifiers into strong classifiers .

First introduced Bootstraping, Self help method ： It's a sampling method with put back （ Duplicate samples may be drawn ）.

1. Bagging (bootstrap aggregating)

Bagging The bagging method , The algorithm process is as follows ：

The training set was extracted from the original sample set . Each round is used from the original sample set Bootstraping Method extraction n Training samples （ In training set , Some samples may be taken multiple times , Some samples may not be picked at all ）. Together with k Round draw , obtain k Training set .（k The training sets are independent of each other ）
One model at a time using one training set ,k A total of training sets k A model .（ notes ： There is no specific classification algorithm or regression method , We can adopt different classification or regression methods according to the specific problem , Such as the decision tree 、 Sensors etc. ）
Pair classification problem ： I'm going to go up k The classification results were obtained by voting ; Pair regression problem , The mean value of the above model is calculated as the final result .（ All models are equally important ）

2. Boosting

The main idea is to assemble the weak classifier into a strong classifier . stay PAC（ The probability approximation is correct ） Under the learning framework , Then the weak classifier can be assembled into a strong classifier .

About Boosting The two core issues of this issue ：

2.1 How to change the weight or probability distribution of training data in each round ？

By increasing the weight of the samples that were divided by the weak classifier in the previous round , Reduce the weight of the previous round of pairing samples , So that the classifier has a better effect on the misclassified data .

2.2 How to combine weak classifiers ？

The weak classifiers are combined linearly through the additive model , such as AdaBoost By a weighted majority , That is to increase the weight of the classifier with small error rate , At the same time, the weight of the classifier with high error rate is reduced .

The lifting tree gradually reduces the residual by fitting the residual , The final model is obtained by superimposing the models generated in each step .

3. Bagging,Boosting The difference between the two

Bagging and Boosting The difference between ：

1） Sample selection ：

Bagging： The training set is selected from the original set , The training sets selected from the original set are independent of each other .

Boosting： The training set of each round is the same , Only the weight of each sample in the classifier changes in the training set . And the weight is adjusted according to the last round of classification results .

2） Sample weights ：

Bagging： Use uniform sampling , The weight of each sample is equal

Boosting： Adjust the weight of the sample according to the error rate , The greater the error rate, the greater the weight .

3） Prediction function ：

Bagging： All prediction functions have equal weight .

Boosting： Each weak classifier has its own weight , For the classifier with small classification error, it will have more weight .

4） Parallel computing ：

Bagging： Each prediction function can be generated in parallel

Boosting： Each prediction function can only be generated in sequence , Because the latter model parameter needs the result of the previous model .

4. summary

These two methods are to integrate several classifiers into one classifier , It's just that the way of integration is different , In the end, we get a different effect , The application of different classification algorithms into this kind of algorithm framework will improve the classification effect of the original single classifier to a certain extent , But it also increases the amount of computation .

Here is a new algorithm that combines decision tree with these algorithm frameworks ：

Bagging + Decision tree = Random forests
AdaBoost + Decision tree = Ascension tree
Gradient Boosting + Decision tree = GBDT

download 1：OpenCV-Contrib Chinese version of extension module

stay 「 Xiaobai studies vision 」 Official account back office reply ： Extension module Chinese course , You can download the first copy of the whole network OpenCV Extension module tutorial Chinese version , cover Expansion module installation 、SFM Algorithm 、 Stereo vision 、 Target tracking 、 Biological vision 、 Super resolution processing And more than 20 chapters .

download 2：Python Visual combat project 52 speak

stay 「 Xiaobai studies vision 」 Official account back office reply ：Python Visual combat project , You can download the Image segmentation 、 Mask detection 、 Lane line detection 、 Vehicle count 、 Add Eyeliner 、 License plate recognition 、 Character recognition 、 Emotional tests 、 Text content extraction 、 face recognition etc. 31 A visual combat project , Help fast school computer vision .

download 3：OpenCV Actual project 20 speak

stay 「 Xiaobai studies vision 」 Official account back office reply ：OpenCV Actual project 20 speak , You can download the 20 Based on OpenCV Realization 20 individual Actual project , Realization OpenCV Learn advanced .

Communication group

Welcome to join the official account reader group to communicate with your colleagues , There are SLAM、 3 d visual 、 sensor 、 Autopilot 、 Computational photography 、 testing 、 Division 、 distinguish 、 Medical imaging 、GAN、 Wechat groups such as algorithm competition （ It will be subdivided gradually in the future ）, Please scan the following micro signal clustering , remarks ：” nickname + School / company + Research direction “, for example ：” Zhang San + Shanghai Jiaotong University + Vision SLAM“. Please note... According to the format , Otherwise, it will not pass . After successful addition, they will be invited to relevant wechat groups according to the research direction . Do not Send ads within the group , Or you'll be invited out , Thanks for your understanding ~