9. Naive Bayes
2022-06-30 05:17:00 【CGOMG】
Introduction to naive Bayes (probabilistic classification)
Probability basics
Joint probability, conditional probability, and mutual independence
Bayes' formula
Introduction
Case study
API
Sentiment analysis of product reviews
Import dependencies
import pandas as pd
import numpy as np
import jieba
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
Load the data
data = pd.read_csv("evaluation.csv",encoding="gbk")
data
Basic data processing
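# Take out the content column for later analysis.
# Note: the column names and label values appear here as rendered in this translated copy;
# they must match the headers and values in evaluation.csv exactly.
content = data[" Content "]
content
# Convert the positive and negative ratings into numeric labels (1 = positive, 0 = negative)
data.loc[data.loc[:, " evaluation "] == " Praise ", " Comment number "] = 1
data.loc[data.loc[:, " evaluation "] == " Bad review ", " Comment number "] = 0
data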
# Load the stop words, one per line
stopwords = []
with open("stopwords.txt", "r", encoding="utf-8") as f:
    lines = f.readlines()
    for tmp in lines:
        line = tmp.strip()
        stopwords.append(line)
# Remove duplicates
stopwords = list(set(stopwords))
print("Stop words:\n", stopwords)
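# Convert the content into a standard format
comment_list = []
for tmp in content:
    # Segment each comment into words (precise mode)
    seg_list = jieba.cut(tmp, cut_all=False)
    # Join the words with commas so CountVectorizer can split them again
    seg_str = ",".join(seg_list)
    comment_list.append(seg_str)
comment_list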
# Count how often each word occurs in each comment
con = CountVectorizer(stop_words=stopwords)
X = con.fit_transform(comment_list)
X.toarray()
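The counting step can be hard to picture on real review data, so here is a minimal sketch on two made-up comments that are already segmented. The `docs` and `toy_stopwords` values are illustrative assumptions, not taken from evaluation.csv. It shows how `CountVectorizer` splits the comma-joined strings, drops the stop words, and builds one count column per remaining word.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data: each comment is already segmented and comma-joined,
# mirroring the shape of comment_list above.
docs = ["good,quality,fast,delivery", "bad,quality,slow,delivery"]
toy_stopwords = ["delivery"]

vec = CountVectorizer(stop_words=toy_stopwords)
counts = vec.fit_transform(docs)

# vocabulary_ maps each word to its column index; sorting by index gives the column order.
vocab = sorted(vec.vocabulary_, key=vec.vocabulary_.get)
print(vocab)
print(counts.toarray())
```

Each row of the array is one comment and each column is one vocabulary word. With its default `token_pattern`, `CountVectorizer` keeps only tokens of two or more characters, so single-character words produced by jieba are dropped unless the pattern is changed.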
# Prepare the training and test sets: the first 10 comments train the model, the rest test it
x_train = X.toarray()[:10, :]
y_train = data[" evaluation "][:10]
print("Training features:\n", x_train)
print("Training labels:\n", y_train)
x_test = X.toarray()[10:, :]
y_test = data[" evaluation "][10:]
print("Test features:\n", x_test)
print("Test labels:\n", y_test)
Model training
mb = MultinomialNB(alpha=1)
mb.fit(x_train,y_train)
y_pre = mb.predict(x_test)
Model evaluation
print("Predicted values:", y_pre)
print("True values:", y_test)
mb.score(x_test,y_test)
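On a classifier, `score` reports mean accuracy: the fraction of test samples whose predicted label equals the true label. The sketch below uses a small made-up count matrix (an assumption for illustration, not the review data) to show that it agrees with `accuracy_score` and with a manual calculation.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word-count features: 3 vocabulary words, 4 training comments.
X_train = np.array([[2, 0, 1], [1, 0, 2], [0, 2, 0], [0, 3, 1]])
y_train = np.array(["pos", "pos", "neg", "neg"])
X_test = np.array([[1, 0, 1], [0, 2, 0]])
y_test = np.array(["pos", "neg"])

clf = MultinomialNB(alpha=1).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Three equivalent ways to obtain the accuracy on the test set.
score_value = clf.score(X_test, y_test)
metric_value = accuracy_score(y_test, y_pred)
manual_value = float(np.mean(y_pred == y_test))
print(score_value, metric_value, manual_value)
```

With a test set as small as the one in this article, a single misclassified comment shifts the accuracy considerably, so the number is best read as a sanity check rather than a reliable estimate.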
Advantages and disadvantages of naive Bayes
Naive Bayes summary
How NB works
What is "naive" about naive Bayes
Why introduce the conditional independence assumption
How to handle an estimated conditional probability P(X|Y) of 0
Why the attribute independence assumption rarely holds in practice, yet naive Bayes still performs well
Differences between naive Bayes and LR (logistic regression)
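One common answer to the zero-probability question listed above is Laplace (additive) smoothing, which is what the `alpha=1` argument in `MultinomialNB(alpha=1)` controls. The estimate becomes P(w|c) = (N_wc + alpha) / (N_c + alpha * m), where N_wc is the count of word w in class c, N_c is the total word count in class c, and m is the vocabulary size. The following is a minimal sketch on a made-up count matrix; the data and the helper name `smoothed_word_probs` are assumptions for illustration.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical counts: 4 comments, 3 vocabulary words; label 1 = positive, 0 = negative.
X = np.array([[2, 0, 1], [1, 0, 2], [0, 2, 0], [0, 3, 1]])
y = np.array([1, 1, 0, 0])
alpha = 1.0

def smoothed_word_probs(X, y, label, alpha):
    """Estimate P(word | label) with additive smoothing."""
    counts = X[y == label].sum(axis=0)  # N_wc for every word w
    return (counts + alpha) / (counts.sum() + alpha * X.shape[1])

# Without smoothing, the second word never occurs in positive comments,
# so its estimated probability is exactly 0 and wipes out the whole product.
unsmoothed_pos = X[y == 1].sum(axis=0) / X[y == 1].sum()

# With smoothing, every word receives a small non-zero probability.
smoothed_pos = smoothed_word_probs(X, y, 1, alpha)

# The fitted model stores the same estimates as log-probabilities.
clf = MultinomialNB(alpha=alpha).fit(X, y)
model_pos = np.exp(clf.feature_log_prob_[1])
print(unsmoothed_pos, smoothed_pos, model_pos)
```

Larger `alpha` values pull the estimates closer to a uniform distribution over the vocabulary, while values near 0 approach the unsmoothed counts.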