Naive Bayes
2022-08-05 10:57:00 【Ding Jiaxiong】
12. Naive Bayes
12.1 Introduction
- A classification algorithm
12.2 Basics of Probability
12.2.1 Definition of probability
- The likelihood of an event happening
- P(X): takes a value in [0, 1]
12.2.2 Joint Probabilities
- The probability that several conditions all hold at the same time
- Denoted as: P(A,B)
12.2.3 Conditional Probabilities
- The probability that event A occurs given that event B has already occurred
- Denoted as: P(A|B)
12.2.4 Mutual Independence
- If P(A, B) = P(A)P(B), then event A and event B are said to be independent of each other
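The definitions in 12.2.1 to 12.2.4 can be checked on a small sample space. The sketch below assumes two fair six-sided dice (an example chosen here, not taken from the notes) and uses exact fractions so the independence test is not affected by rounding.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate over outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_a(o):
    return o[0] % 2 == 0   # A: the first die is even

def is_b(o):
    return o[1] > 4        # B: the second die shows 5 or 6

p_a = prob(is_a)                               # P(A)
p_b = prob(is_b)                               # P(B)
p_ab = prob(lambda o: is_a(o) and is_b(o))     # joint probability P(A, B)
p_a_given_b = p_ab / p_b                       # conditional probability P(A|B)
independent = p_ab == p_a * p_b                # independence check

print(p_a, p_b, p_ab, p_a_given_b, independent)
```

Because the two dice do not influence each other, P(A|B) equals P(A) and the product rule holds exactly.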
12.2.5 Bayesian Formula
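The formula itself does not appear in the text at this point. In the notation of 12.2.2 and 12.2.3, and assuming P(B) > 0, the standard statement of Bayes' theorem is:

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

It follows from writing the joint probability in two ways, P(A, B) = P(A|B)P(B) = P(B|A)P(A), and dividing by P(B).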
12.2.6 Naive Bayes
- Bayes' formula combined with the assumption that the features are independent of each other given the class
12.3 API
sklearn.naive_bayes.MultinomialNB(alpha=1.0)
- Multinomial naive Bayes classifier
- alpha: additive (Laplace) smoothing parameter, default 1.0
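A minimal usage sketch of this estimator, assuming scikit-learn and NumPy are installed. The word-count matrix, the vocabulary in the comment, and the class labels are hypothetical toy data made up for illustration.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count matrix: one row per document, one column per
# word in the vocabulary ["goal", "match", "vote", "election"].
X_train = np.array([
    [3, 2, 0, 0],
    [2, 1, 0, 1],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
])
y_train = np.array(["sports", "sports", "politics", "politics"])

# alpha=1.0 applies Laplace smoothing to the per-class word probabilities.
clf = MultinomialNB(alpha=1.0)
clf.fit(X_train, y_train)

X_new = np.array([
    [2, 1, 0, 0],   # mostly "goal" and "match"
    [0, 0, 1, 2],   # mostly "vote" and "election"
])
predictions = clf.predict(X_new)          # most probable class per row
probabilities = clf.predict_proba(X_new)  # posterior per class, columns follow clf.classes_

print(predictions)
print(clf.classes_, probabilities)
```

For real text, the count matrix would typically come from a vectorizer such as `sklearn.feature_extraction.text.CountVectorizer` rather than being written by hand.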
12.4 Algorithm Summary
12.4.1 Advantages
- The naive Bayes model is grounded in classical probability theory, and its classification performance is stable
- It is relatively insensitive to missing data, and the algorithm is simple; it is often used for text classification
- Classification accuracy is often high, and training and prediction are fast
12.4.2 Disadvantages
- Because it assumes that the features are independent, it performs poorly when the features are correlated
- The prior probabilities must be estimated, and the estimate often depends on the assumed model. Many such models are possible, so a poorly chosen prior model can lead to poor predictions.
12.4.3 Principles of NB
Naive Bayes is a classification method based on Bayes' theorem and the assumption that the features are conditionally independent
- For a given item x to be classified, compute the posterior probability distribution with the learned model,
- That is, the probability of each target category given that this item has been observed; the category with the largest posterior probability is taken as the category of x.
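The decision rule above can be sketched in a few lines. The prior and likelihood tables below are hypothetical numbers for an invented two-feature spam filter; in practice they would be estimated from training data. Exact fractions are used so the normalised posteriors can be inspected without rounding.

```python
from fractions import Fraction as F

# Hypothetical learned model: class priors P(Y = c).
prior = {"spam": F(2, 5), "ham": F(3, 5)}

# Hypothetical per-feature conditionals P(feature = value | Y = c).
likelihood = {
    "spam": {
        "has_link": {True: F(4, 5), False: F(1, 5)},
        "all_caps": {True: F(1, 2), False: F(1, 2)},
    },
    "ham": {
        "has_link": {True: F(1, 5), False: F(4, 5)},
        "all_caps": {True: F(1, 10), False: F(9, 10)},
    },
}

def posterior(x):
    """Return P(Y = c | X = x) for every class c under the naive Bayes model."""
    joint = {}
    for c, p in prior.items():
        # Multiply the prior by each feature's conditional probability.
        for feature, value in x.items():
            p *= likelihood[c][feature][value]
        joint[c] = p
    evidence = sum(joint.values())  # P(X = x), the normalising constant
    return {c: p / evidence for c, p in joint.items()}

x = {"has_link": True, "all_caps": True}
post = posterior(x)
predicted = max(post, key=post.get)  # class with the largest posterior

print(post, predicted)
```

Since the evidence term is the same for every class, the predicted class could also be read off the unnormalised products; the normalisation is only needed to report actual probabilities.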
12.4.4 Why "Naive"
- When computing the conditional probability distribution P(X=x∣Y=c_k), NB makes a strong conditional independence assumption: once Y is fixed, the feature components of X are independent of one another
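Written out, the assumption factorises the class-conditional distribution into one term per feature. The superscript notation x^{(j)} for the j-th component of x and the feature count n are notational assumptions introduced here; they do not appear in the notes.

```latex
P(X = x \mid Y = c_k)
  = P\left(X^{(1)} = x^{(1)}, \ldots, X^{(n)} = x^{(n)} \mid Y = c_k\right)
  = \prod_{j=1}^{n} P\left(X^{(j)} = x^{(j)} \mid Y = c_k\right)
```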
12.4.5 Why is the conditional independence assumption introduced
- To avoid the combinatorial explosion and sample sparsity that arise when the conditional probability in Bayes' theorem is estimated directly
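As a rough illustration of the combinatorial explosion, the sketch below compares how many conditional-probability table entries are needed with and without the independence assumption. The function name `table_sizes` and the example sizes (20 features with 10 values each, 2 classes) are hypothetical.

```python
from math import prod

def table_sizes(values_per_feature, num_classes):
    """Count table entries for P(X | Y) with and without the naive assumption.

    values_per_feature: number of distinct values each feature can take.
    """
    # Full joint: one entry per (class, combination of all feature values).
    full = num_classes * prod(values_per_feature)
    # Naive Bayes: one entry per (class, feature, value of that feature).
    naive = num_classes * sum(values_per_feature)
    return full, naive

full, naive = table_sizes([10] * 20, num_classes=2)
print(full, naive)
```

With a full table, almost every feature combination would never appear in any realistic training set, which is the sample sparsity problem; the per-feature tables stay small enough to estimate.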
12.4.6 What to do when an estimated conditional probability P(X∣Y) is 0
Introduce a smoothing parameter λ ≥ 0 into the estimate
- When λ=0, the estimate is the ordinary maximum likelihood estimate
- When λ=1, it is called Laplace smoothing
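The notes do not show the estimate itself. A common form, assumed here, adds λ to every count: P(X = a | Y = c) = (count(X = a, Y = c) + λ) / (count(Y = c) + S·λ), where S is the number of distinct values the feature can take. The sketch below applies it to one categorical feature; the function name and the toy weather data are hypothetical.

```python
from collections import Counter

def smoothed_conditional(feature_values, labels, lam=1.0):
    """Estimate P(X = a | Y = c) for one categorical feature with additive smoothing."""
    distinct = sorted(set(feature_values))       # the S values observed for this feature
    class_counts = Counter(labels)               # count(Y = c)
    pair_counts = Counter(zip(feature_values, labels))  # count(X = a, Y = c)
    return {
        (a, c): (pair_counts[(a, c)] + lam) / (class_counts[c] + len(distinct) * lam)
        for c in class_counts
        for a in distinct
    }

# Hypothetical toy data: "rainy" never occurs together with class "yes".
weather = ["sunny", "sunny", "cloudy", "rainy", "rainy"]
play = ["yes", "yes", "yes", "no", "no"]

mle = smoothed_conditional(weather, play, lam=0.0)      # maximum likelihood
laplace = smoothed_conditional(weather, play, lam=1.0)  # Laplace smoothing

print(mle[("rainy", "yes")], laplace[("rainy", "yes")])
```

With λ=0 the unseen pair gets probability 0, which would zero out the whole product for that class; with λ=1 it gets a small positive value and the estimates for each class still sum to 1.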
12.4.7 Differences between Naive Bayes and Logistic Regression (LR)
One
- Naive Bayes is a generative model
- LR is a discriminative model
Two
- Naive Bayes relies on a strong conditional independence assumption (given the class Y, the feature variables are independent of one another)
- LR does not require this assumption
Three
- Naive Bayes is suitable for small datasets
- LR is suitable for large datasets