
Paper reading: Deep Forest / gcForest

2022-07-28 22:45:00 【Claire_ Shang】

I recently presented this paper at a group meeting, so here is a brief summary of my notes. Incidentally, when searching for deep forest you may also come across "Deep Forest: Towards an Alternative to Deep Neural Networks". The two papers have essentially the same content, with only minor differences in the presentation.

Below are the contents of the slides (PPT) I made. Reference article: http://t.csdn.cn/iSKfj

Main content

Deep learning models are mainly built on neural networks, that is, multiple layers of parameterized, differentiable nonlinear modules that can be trained by backpropagation. This paper explores the possibility of building deep models from non-differentiable modules and proposes such a model: gcForest (multi-Grained Cascade Forest).

Features:

(1) Few hyperparameters

(2) The model complexity is determined automatically in a data-dependent way

(3) A deep model can be built without backpropagation

Questions raised:

(1) Deep model = DNN? Must a deep model be built from differentiable modules?

(2) Is it possible to train a deep model without backpropagation?

(3) Is it possible for deep models to win on tasks where other models, such as random forests, currently do better?

Inspiration 1: Ensemble learning

To build a good ensemble, the individual learners should be both accurate and diverse.

Ways to increase model diversity:

(1) Data samples: generate different data samples for different individual learners

(2) Input features: selecting different feature subsets produces different tree models

(3) Learning parameters: use different parameter settings for different individual learners

(4) Output representation: use different output representations for different individual learners

Inspiration 2: DNN

Advantages of deep models: layer-by-layer processing (Figure 1), feature transformation inside the model, and high model complexity.

Disadvantages: many hyperparameters, a large amount of training data is needed, and the network architecture must be fixed before training.

The authors believe that layer-by-layer processing is the key to the success of DNNs. As shown in Figure 1, higher-level abstract features gradually emerge as the network gets deeper.

Cascade forest structure

Different types of forests are used to increase the diversity of the model.

Each level of the cascade forest in the figure contains two random forests (black) and two extremely random forests (blue). Each forest contains 500 trees.

The main differences between the two kinds of forest:

What each tree works with: a random feature subspace (random forest) / all of the sample data (extremely random forest)

How a node is split: the candidate with the smallest Gini index (random forest) / one picked at random (extremely random forest)

Random forests and extremely random forests: http://t.csdn.cn/c5BZw
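The difference in node splitting can be illustrated with a small pure-Python sketch. It is not the paper's implementation: the function names, the choice of sqrt(d) candidate features, and the uniformly random threshold are all assumptions made for illustration.

```python
import math
import random

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_gini(X, y, feature, threshold):
    """Weighted Gini impurity after splitting on X[i][feature] <= threshold."""
    left = [yi for xi, yi in zip(X, y) if xi[feature] <= threshold]
    right = [yi for xi, yi in zip(X, y) if xi[feature] > threshold]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def random_forest_split(X, y, rng):
    """Random-forest style: sample candidate features, keep the lowest-Gini split.

    Assumption: sqrt(d) candidate features are sampled at each node.
    """
    d = len(X[0])
    candidates = rng.sample(range(d), max(1, int(math.sqrt(d))))
    best = None
    for f in candidates:
        for t in sorted({xi[f] for xi in X}):
            score = split_gini(X, y, f, t)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best[1], best[2]

def random_split(X, y, rng):
    """Extremely-random style: pick one feature and one threshold at random.

    Assumption: the threshold is drawn uniformly between the feature's min and max.
    """
    f = rng.randrange(len(X[0]))
    values = [xi[f] for xi in X]
    return f, rng.uniform(min(values), max(values))

# Made-up toy data: one feature, two classes.
X = [[1.0], [2.0], [3.0], [4.0]]
y = [0, 0, 1, 1]
rng = random.Random(0)
print(random_forest_split(X, y, rng))  # split chosen by Gini impurity
print(random_split(X, y, rng))         # split chosen at random
```

The first function searches for the best split among its candidates, while the second does no search at all, which is where the extra diversity comes from.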

Each forest in the cascade estimates a class distribution, which forms a class vector. The class vectors are then concatenated with the original feature vector and fed into the next level.

Suppose there are three classes. Each of the four forests produces a three-dimensional class vector, so the next level receives 12 (= 3 × 4) augmented features.
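The arithmetic above can be checked with a short sketch. The class-vector values, the 5-dimensional original input, and the concatenation order (class vectors first) are made up for illustration, and the helper name `augment` is hypothetical.

```python
def augment(original_features, class_vectors):
    """Concatenate per-forest class vectors with the original feature vector."""
    augmented = []
    for vec in class_vectors:
        augmented.extend(vec)
    return augmented + list(original_features)

# Hypothetical outputs of four forests on one instance, three classes each.
class_vectors = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.8, 0.1, 0.1],
]
original = [0.3, 1.2, -0.5, 2.0, 0.0]  # made-up 5-dimensional input

next_input = augment(original, class_vectors)
n_augmented = sum(len(v) for v in class_vectors)
print(n_augmented, len(next_input))  # 12 augmented features, 17 inputs in total
```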

My understanding is that the class-vector generation figure shows what happens when a feature vector is fed into one of the forests in the cascade, which is why a three-dimensional vector appears on the right-hand side of that figure.

To reduce the risk of overfitting, the class vector produced by each forest is generated by k-fold cross-validation.

Each instance is used as training data k−1 times, giving k−1 class vectors. These are averaged to obtain the final class vector, which serves as the augmented features for the next level of the cascade.
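The k−1 count and the averaging step can be checked with an index-only sketch. No forest is trained here. The helper names, the contiguous fold layout, and the example class vectors are assumptions for illustration.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def training_counts(n_samples, k):
    """Count in how many of the k rounds each instance is part of the training split."""
    counts = [0] * n_samples
    for held_out in kfold_indices(n_samples, k):
        held = set(held_out)
        for i in range(n_samples):
            if i not in held:
                counts[i] += 1
    return counts

def average_vectors(vectors):
    """Element-wise mean of equally long class vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

counts = training_counts(n_samples=10, k=3)
print(counts)  # every entry equals k - 1 = 2

# Made-up class vectors for one instance from its k - 1 = 2 training rounds.
final_vector = average_vectors([[0.6, 0.3, 0.1], [0.8, 0.1, 0.1]])
print(final_vector)
```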

After a new level is added, the performance of the whole cascade is estimated on a validation set. If there is no significant performance gain, training stops, so the number of cascade levels is determined automatically. In other words, gcForest decides its model complexity adaptively by terminating training at the right time. This makes it applicable to training data of different scales, not only to large-scale training data.
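The stopping rule can be sketched as a simple loop. This is only an illustration under stated assumptions: `evaluate_level` is a hypothetical callback standing in for "train one more level and measure validation accuracy", the `tolerance` threshold is made up, and discarding the level that brought no gain is a design choice of this sketch, not something stated in the text.

```python
def grow_cascade(evaluate_level, max_levels=20, tolerance=0.0):
    """Add levels until validation accuracy stops improving by more than tolerance.

    evaluate_level(n) is a hypothetical callback that trains level n and
    returns the validation accuracy of the cascade with n levels.
    """
    best_acc = float("-inf")
    n_levels = 0
    for level in range(1, max_levels + 1):
        acc = evaluate_level(level)
        if acc - best_acc <= tolerance:
            break  # no significant gain: drop this level and stop growing
        best_acc = acc
        n_levels = level
    return n_levels, best_acc

# Made-up validation accuracies for cascades with 1, 2, 3, ... levels.
fake_accuracies = [0.80, 0.85, 0.87, 0.87, 0.86]
n_levels, acc = grow_cascade(lambda n: fake_accuracies[n - 1])
print(n_levels, acc)  # stops growing once the fourth level brings no gain
```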

Multi-grained scanning

Sliding windows are used to scan the raw features.

Connecting the two steps above gives the overall flow chart of gcForest.

In the following figure, suppose there are 3 classes, and windows of 100, 200 and 300 dimensions are slid over the original 400-dimensional features.
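The sizes involved in this example can be worked out with a short sketch. The stride of 1 and the use of two forests per window size are assumptions, since neither is stated in the text above, and the function names are hypothetical.

```python
def sliding_windows(features, window, stride=1):
    """Return every contiguous slice of length `window` from a 1-D feature list."""
    return [features[i:i + window]
            for i in range(0, len(features) - window + 1, stride)]

def scan_output_dims(n_features, window, n_classes, n_forests, stride=1):
    """Number of windows, and length of the concatenated class vectors they yield."""
    n_windows = (n_features - window) // stride + 1
    return n_windows, n_windows * n_classes * n_forests

raw = list(range(400))  # stand-in for a 400-dimensional raw feature vector
for window in (100, 200, 300):
    n_windows, out_dim = scan_output_dims(400, window, n_classes=3, n_forests=2)
    assert n_windows == len(sliding_windows(raw, window))
    print(window, n_windows, out_dim)
```

Under these assumptions, the 100-dimensional window gives 301 windows and a 1806-dimensional transformed feature vector, the 200-dimensional window gives 201 windows and 1206 dimensions, and the 300-dimensional window gives 101 windows and 606 dimensions.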