Decision tree and random forest
2022-06-26 17:51:00 【lmn_】

0x01 Decision tree overview
A decision tree is a supervised machine learning model that can be used for both classification and regression problems. The tree asks a sequence of questions, and the answer to each question determines which route we follow through the tree.
Once the decision tree has been built, we know which variable, and which value of that variable, is used to split the data at each node, so results can be predicted quickly.
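To make "which variable and which value" concrete, here is a minimal sketch of how one split could be chosen. It assumes Gini impurity as the quality measure, which the text above does not specify; the helper names `gini` and `best_split` and the toy data are made up for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best split found.

    A row goes left when row[feature_index] <= threshold, otherwise right.
    feature_index is None when no split improves on the unsplit data.
    """
    best = (None, None, gini(labels))
    n = len(rows)
    for feature in range(len(rows[0])):
        # Try every distinct observed value of this feature as a threshold.
        for threshold in sorted({row[feature] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must send data to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (feature, threshold, score)
    return best

# Hypothetical toy data: [height_cm, weight_kg] -> label
rows = [[150, 50], [160, 80], [170, 55], [180, 85]]
labels = ["light", "heavy", "light", "heavy"]
feature, threshold, score = best_split(rows, labels)
```

On this toy data the weight column separates the two classes perfectly, so the search settles on feature 1. A full tree builder would apply the same search recursively to the left and right subsets.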

Advantages of decision trees
- Easy to interpret and visualize
- The internal workings can be observed, which makes results reproducible
- Adapt quickly to a data set
- Can handle both numerical and categorical data
- The final model can be viewed and explained in an orderly way as a tree diagram
- Good performance on large datasets
- Extremely fast
Disadvantages of decision trees
- Building a decision tree requires an algorithm that can determine the best choice at each node
- Decision trees are prone to overfitting, especially when the tree is very deep
0x02 Random forest overview
Forests have almost the same hyperparameters as decision trees. Generally speaking, a single tree often cannot produce effective, reliable results, and this is where the random forest comes in. A random forest is an ensemble learning method for classification, regression and other tasks.

A random forest can be understood as a group of decision trees whose many decisions are aggregated into one result. It is a tree-based machine learning algorithm that constructs a large number of decision trees during training and uses their combined power to make decisions.
When building a random forest model, we have to define how many trees to grow and how many variables to consider at each node.
In 1995, Tin Kam Ho created the first random decision forest algorithm using the random subspace method, which, in Ho's formulation, is a way to implement the "stochastic discrimination" approach to classification.
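The aggregation step can be sketched as a majority vote. This is only an illustration: the three "trees" below are hypothetical hand-written rules standing in for trained decision trees, and `forest_predict` is a made-up helper name.

```python
from collections import Counter

def forest_predict(trees, sample):
    """Classify `sample` by majority vote over the predictions of all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]

# Three hypothetical one-question "trees" (decision stumps).
def tree_a(s):
    return "spam" if s["links"] > 3 else "ham"

def tree_b(s):
    return "spam" if s["caps_ratio"] > 0.5 else "ham"

def tree_c(s):
    return "spam" if s["length"] < 20 else "ham"

forest = [tree_a, tree_b, tree_c]
sample = {"links": 5, "caps_ratio": 0.2, "length": 10}
prediction = forest_predict(forest, sample)  # two of three trees vote "spam"
```

For regression, the usual counterpart is to average the trees' numeric outputs instead of counting votes. Using an odd number of trees avoids ties in a two-class vote.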
How a random forest reduces variance:
- Train each tree on a different sample of the data
- Use random feature subsets
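Both sources of randomness can be sketched in a few lines. The sketch assumes sampling with replacement (a bootstrap sample) for the data and the square-root heuristic for the number of features; neither detail is stated above, and the function names are made up for illustration.

```python
import math
import random

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples with replacement (a bootstrap sample)."""
    indices = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in indices], [labels[i] for i in indices]

def random_feature_subset(n_features, rng, k=None):
    """Pick k distinct feature indices.

    When k is not given, use round(sqrt(n_features)), a common heuristic
    for classification (an assumption, not a fixed rule).
    """
    if k is None:
        k = max(1, round(math.sqrt(n_features)))
    return sorted(rng.sample(range(n_features), k))

rng = random.Random(0)  # seeded so the sketch is repeatable
rows = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
labels = ["a", "b", "c"]

sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)
features = random_feature_subset(len(rows[0]), rng)
```

Each tree in the forest would get its own `sample_rows` and its own feature subsets, so the trees make different mistakes and their errors partly cancel out when the results are aggregated. Depending on the variant, the feature subset is drawn once per tree or again at every split.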

Random forest advantages
- Random decision forests correct the decision tree's tendency to overfit
- Random forests usually outperform single decision trees, but they are generally less accurate than gradient boosted trees
- More trees generally improve performance and make predictions more stable
Random forest disadvantages
- The random forest model is more complex, because it is a combination of decision trees
- More trees slow down the computation
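The trade-off between stability and the number of trees can be illustrated with a small calculation. It rests on a strong simplifying assumption that is not true of real forests: every tree is right with the same probability and the trees err independently. The function name and the value 0.6 are chosen purely for illustration.

```python
from math import comb

def majority_vote_accuracy(n_trees, p_correct):
    """Probability that a majority of n_trees independent voters is right.

    Assumes a two-class task, an odd n_trees (no ties), and that each
    voter is correct independently with probability p_correct.
    """
    needed = n_trees // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_trees, k) * p_correct**k * (1 - p_correct) ** (n_trees - k)
        for k in range(needed, n_trees + 1)
    )

# Accuracy of the vote for forests of different sizes, each tree 60% accurate.
accuracies = {n: majority_vote_accuracy(n, 0.6) for n in (1, 3, 11, 51)}
```

Under these assumptions the accuracy of the vote grows with the number of trees, while the cost of training and prediction grows roughly in proportion to it. Real trees are correlated, so the gain in practice flattens out sooner than this idealized calculation suggests.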
0x03 The difference between decision tree and random forest
The key difference between the random forest algorithm and a decision tree is that a decision tree is a single graph that uses branching to illustrate every possible outcome of a decision. By comparison, the output of the random forest algorithm comes from a set of decision trees whose individual outputs are combined.
Compared with a random forest, a decision tree model is easier to build. A random forest's final model is harder to visualize, and if the amount of data is very large or the data is not processed appropriately, the forest takes a long time to create.
A decision tree always leaves room for overfitting; the random forest algorithm reduces overfitting by using multiple trees.
Decision trees require little computation, which shortens implementation time but comes with lower accuracy; random forests consume more computation, and building and analyzing them is time-consuming.
Decision trees can be easily visualized; visualizing a random forest is complex.
0x04 Build
Pruning means cutting branches back further so that the tree classifies the data in a better way. It works on the same principle as trimming the excess parts off a plant.
Pruning ends when the leaf nodes are reached. It is a very important part of building a decision tree.
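One simple way to picture pruning is to cut the tree off at a maximum depth. The sketch below assumes a tree stored as nested dictionaries, where each internal node records a hypothetical `majority` label seen during training; this is only one of several pruning strategies and is not necessarily the one the text has in mind.

```python
def prune(node, max_depth, depth=0):
    """Return a copy of the tree cut off at max_depth.

    Internal nodes at the depth limit become leaves that predict the
    majority label recorded at that node. Recursion stops at leaves.
    """
    if "label" in node:        # leaf reached: nothing left to trim
        return dict(node)
    if depth >= max_depth:     # chop this branch into a single leaf
        return {"label": node["majority"]}
    pruned = dict(node)
    pruned["left"] = prune(node["left"], max_depth, depth + 1)
    pruned["right"] = prune(node["right"], max_depth, depth + 1)
    return pruned

def predict(node, row):
    """Follow the splits from the root until a leaf is reached."""
    while "label" not in node:
        branch = "left" if row[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node["label"]

def depth_of(node):
    """Number of splits on the longest path from this node to a leaf."""
    if "label" in node:
        return 0
    return 1 + max(depth_of(node["left"]), depth_of(node["right"]))

# Hypothetical two-level tree.
tree = {
    "feature": 0, "threshold": 5, "majority": "no",
    "left": {"label": "no"},
    "right": {
        "feature": 1, "threshold": 2, "majority": "yes",
        "left": {"label": "yes"},
        "right": {"label": "no"},
    },
}
small = prune(tree, max_depth=1)
```

The pruned tree asks one question instead of two, so it is less likely to memorize noise in the training data, at the price of occasionally giving a coarser answer than the full tree.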
0x05 Summary
Compared with a random forest, a decision tree is very simple. A decision tree combines a series of decisions, whereas a random forest combines several decision trees.
Decision trees are fast and easy to operate on large datasets. Random forest models require more rigorous training: the more trees in the forest, the more time it takes.
