Data mining 2021-4-27 class notes
2022-07-03 09:09:00 【weixin_ thirty-seven million six hundred and eighty-two thousan】
Data mining
Supervised learning (classification)
Unsupervised learning (clustering)
Prediction problems: classification vs. numeric prediction
The test set should be kept strictly separate from the training set. If you have already seen the exam questions in your homework, the exam cannot tell whether you have really mastered the material.
Decision Tree Induction
Intuitive and easy to understand
Attribute selection uses information gain, which is based on entropy
High entropy: high uncertainty
Low entropy: low uncertainty
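The relationship between entropy and information gain can be sketched as follows. This is a minimal illustration using the standard definitions, not code from the lecture; the function names and the toy labels in the usage note are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on an attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)  # one group of labels per attribute value
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder
```

A 50/50 mix of two classes has the maximum entropy of 1 bit, and a pure set has entropy 0. An attribute that separates the classes perfectly recovers all of that uncertainty as gain, while an attribute unrelated to the class gives a gain of 0.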
Continuous values are discretized into ranges, e.g. <20, 20~30, 30~40, >40
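A minimal sketch of this kind of binning, assuming the cut points 20, 30 and 40 from the ranges above. The notes do not say which range a boundary value belongs to, so assigning it to the upper range here is an assumption, as are the names `EDGES`, `LABELS` and `to_bin`.

```python
from bisect import bisect_right

EDGES = [20, 30, 40]                        # cut points taken from the ranges in the notes
LABELS = ["<20", "20~30", "30~40", ">40"]   # one label per resulting range


def to_bin(value):
    """Map a continuous value to its range label.

    Assumption: a value equal to a cut point goes to the upper range.
    """
    return LABELS[bisect_right(EDGES, value)]
```

After binning, the attribute can be treated like any other categorical attribute when computing information gain.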
Gain Ratio
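For reference, the usual textbook definition of gain ratio (as used in C4.5) normalizes information gain by the split information of attribute $A$, which penalizes attributes with many distinct values. The notation below is an assumption, since the notes give only the name: $D$ is the data set and $D_1, \dots, D_v$ are the partitions produced by splitting on $A$.

```latex
\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2 \frac{|D_j|}{|D|}
\qquad
\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)}
```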
Gini index (used in CART)
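A small sketch of the Gini index and the weighted Gini of a binary split, using the standard definition (1 minus the sum of squared class proportions). The function names are assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_split(left, right):
    """Weighted Gini impurity of a binary split, the quantity CART minimizes."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)
```

A pure set has Gini 0, and an even two-class mix has Gini 0.5, so a lower value means a better split.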
Overfitting and Tree Pruning
Some branches may reflect anomalies, noise, or outliers in the training data
In the figure, the black line is better than the green line
Two approaches to avoiding overfitting:
Prepruning: stop growing the tree early
The threshold is difficult to choose
Postpruning: remove branches from a fully grown tree
Prune using a separate set of data (used more often)
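One common way to postprune with a separate data set is reduced-error pruning. The sketch below is only one possible reading of the note above, not necessarily the method shown in class. The tree layout (nested dicts with hypothetical keys `feature`, `threshold`, `left`, `right`, `majority`, `label`) and the function names are assumptions.

```python
def tree_predict(node, x):
    """Walk down the tree until a leaf is reached."""
    while "label" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]


def errors(node, rows):
    """Number of (x, y) pairs in rows that the subtree misclassifies."""
    return sum(1 for x, y in rows if tree_predict(node, x) != y)


def prune(node, rows):
    """Reduced-error pruning; rows are the pruning-set examples reaching this node."""
    if "label" in node:
        return node
    f, t = node["feature"], node["threshold"]
    left_rows = [(x, y) for x, y in rows if x[f] <= t]
    right_rows = [(x, y) for x, y in rows if x[f] > t]
    # Prune the children first (bottom-up), then consider this node.
    node = dict(node, left=prune(node["left"], left_rows),
                right=prune(node["right"], right_rows))
    leaf = {"label": node["majority"]}
    # Replace the subtree with a leaf if that is no worse on the pruning set.
    return leaf if errors(leaf, rows) <= errors(node, rows) else node


# Toy tree: the right branch has an extra split that fits noise.
tree = {
    "feature": 0, "threshold": 5, "majority": "A",
    "left": {"label": "A"},
    "right": {"feature": 1, "threshold": 2.5, "majority": "B",
              "left": {"label": "B"}, "right": {"label": "A"}},
}
pruning_set = [([1, 0], "A"), ([2, 3], "A"),
               ([7, 1], "B"), ([8, 3], "B"), ([9, 4], "B")]
pruned = prune(tree, pruning_set)
```

On this made-up data the noisy split under the right branch is collapsed into a single leaf `B`, while the root split is kept because removing it would increase the error on the pruning set.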
Advantages of decision trees:
- Fast to learn
- Can be converted into simple, understandable classification rules
- Easy to implement in SQL
- Acceptable classification accuracy
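To illustrate the second and third points, a tree can be flattened into one rule per leaf, and those rules can be written out as a SQL `CASE` expression. This is a sketch under assumptions: the tree layout, the column names `age` and `income`, the thresholds, and the function names are all hypothetical.

```python
def tree_to_rules(node, path=()):
    """Return one (conditions, label) pair per leaf, following each root-to-leaf path."""
    if "label" in node:
        return [(list(path), node["label"])]
    col, t = node["column"], node["threshold"]
    return (tree_to_rules(node["left"], path + (f"{col} <= {t}",))
            + tree_to_rules(node["right"], path + (f"{col} > {t}",)))


def rules_to_sql_case(rules):
    """Render the rules as a SQL CASE expression (labels are assumed to be plain strings)."""
    lines = ["CASE"]
    for conds, label in rules:
        where = " AND ".join(conds) if conds else "1 = 1"
        lines.append(f"  WHEN {where} THEN '{label}'")
    lines.append("END")
    return "\n".join(lines)


# Hypothetical tree over two numeric columns.
tree = {
    "column": "age", "threshold": 30,
    "left": {"label": "no"},
    "right": {"column": "income", "threshold": 50000,
              "left": {"label": "no"}, "right": {"label": "yes"}},
}
rules = tree_to_rules(tree)
sql = rules_to_sql_case(rules)
```

Each `WHEN` branch corresponds to one root-to-leaf path, which is why a learned tree translates so directly into a query.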
Bayes Classification Methods
An example of a naive Bayesian classifier:
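The example from class is not reproduced in these notes. As a stand-in, here is a minimal sketch of a naive Bayes classifier for categorical attributes. The toy weather-style data is made up for illustration, and the use of Laplace (add-one) smoothing and the function names are assumptions.

```python
from collections import Counter, defaultdict


def nb_train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> distinct values seen
    for x, y in zip(rows, labels):
        for i, v in enumerate(x):
            value_counts[(i, y)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values


def nb_predict(model, x):
    """Pick the class maximizing P(class) * product of P(value | class)."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    best, best_score = None, -1.0
    for c, n in class_counts.items():
        score = n / total
        for i, v in enumerate(x):
            # Laplace smoothing so unseen values do not zero out the product.
            score *= (value_counts[(i, c)][v] + 1) / (n + len(values[i]))
        if score > best_score:
            best, best_score = c, score
    return best


# Made-up data: (outlook, windy) -> play?
rows = [("sunny", "no"), ("sunny", "yes"), ("rainy", "yes"),
        ("overcast", "no"), ("rainy", "no"), ("sunny", "yes")]
labels = ["yes", "no", "no", "yes", "yes", "no"]
model = nb_train(rows, labels)
```

The product over attributes is exactly where the independence assumption enters: each attribute contributes its own conditional probability, regardless of the others.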
Advantages of naive Bayes:
Easy to implement, with very good results
Disadvantage: it assumes the variables are independent, but in practice they often are not
Bayesian belief networks can partially solve this problem