fastText text classification
2022-07-02 04:56:00 【MasonYyp】
1 Install fastText
Official repository (Facebook Research)
https://github.com/facebookresearch/fastText
Prebuilt fastText wheels for Windows
https://www.lfd.uci.edu/~gohlke/pythonlibs/#fasttext
Installing from the tar source archive is troublesome, so installing from a whl file is recommended
pip install fasttext-0.9.2-cp38-cp38-win_amd64.whl
Documentation
# Python module documentation
https://fasttext.cc/docs/en/python-module.html
# JavaScript (WebAssembly) module documentation
https://fasttext.cc/docs/en/webassembly-module.html
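After installing the whl file, it can be useful to confirm that Python can actually find the package before running any training code. The following is a minimal sketch using only the standard library; the helper name fasttext_available is hypothetical and is not part of fastText.

```python
import importlib.util


def fasttext_available() -> bool:
    """Return True if the import system can locate a module named 'fasttext'.

    Hypothetical helper for illustration; it only looks the module up
    and does not import it.
    """
    return importlib.util.find_spec("fasttext") is not None


if __name__ == "__main__":
    if fasttext_available():
        print("fasttext is installed")
    else:
        print("fasttext is not installed; install the whl file first")
```

If the check fails inside a virtual environment, the wheel was probably installed with a different interpreter than the one running the script.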
2 Source code
import fasttext

# Suppress the warning printed by load_model:
# "`load_model` does not return WordVectorModel or SupervisedModel any more, but a `FastText` object which is very similar"
fasttext.FastText.eprint = lambda x: None

# File name of the classification model
classifier_model_name = "model_classify.bin"

# Train the model
def train_model():
    # Typical parameter ranges: lr in [0.1, 1.0], epoch in [5, 50], wordNgrams in [1, 5]
    # The loss parameter:
    #   for multi-label classification, use loss='ova' (one-vs-all)
    #   for large datasets, in particular with many labels, use loss='hs' (hierarchical softmax) to speed up training
    model = fasttext.train_supervised("train.txt", lr=0.1, epoch=25, wordNgrams=4, loss='softmax', label_prefix='__label__')
    # Save the model
    model.save_model(classifier_model_name)

# Test the model
def test_model():
    # Load the model
    classifier = fasttext.load_model(classifier_model_name)
    # Evaluate on the test data; test() returns (number of samples, precision, recall)
    res_test = classifier.test("test.txt")
    print("Number of samples:", res_test[0])
    print("Precision:", res_test[1])
    print("Recall:", res_test[2])
    # Write the predicted label next to each original line
    with open('test.txt', encoding='utf-8') as fp, open('predict.txt', 'w', encoding='utf-8') as predict_file:
        # Format of each line: label + text, where the label is '__label__' + category
        for line in fp:
            line = line.strip()
            # Skip blank lines, for which predict() returns no label
            if not line:
                continue
            # Predicted label, original line
            predict_file.write(classifier.predict(line)[0][0] + ',\t' + line + '\n')

# Predict with the model
def predict_text():
    classifier = fasttext.load_model(classifier_model_name)
    # text is a list of texts to classify; k is the number of labels to return, -1 returns all labels
    res_predict = classifier.predict(text=["novel coronavirus pneumonia", "Intelligent development"], k=-1)
    print("Probability list:", res_predict)
    # text is a single string; by default only the most probable label and its probability are returned
    res_predict = classifier.predict(text="epidemic situation")
    print("Probability:", res_predict)

if __name__ == '__main__':
    train_model()
    test_model()
    predict_text()
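The values returned by predict() in the script above are easier to work with as plain (category, probability) pairs. The following is a minimal sketch, assuming predict() returns a pair of parallel sequences (labels, probabilities) for a single string, and a pair of lists with one entry per text when given a list. The helper names strip_prefix, pair_labels and pair_batch are hypothetical and not part of fastText, and the sample values are written by hand rather than produced by a model.

```python
from typing import List, Sequence, Tuple

LABEL_PREFIX = "__label__"  # same prefix that is passed as label_prefix during training


def strip_prefix(label: str, prefix: str = LABEL_PREFIX) -> str:
    """Remove the label prefix, e.g. '__label__1' becomes '1'."""
    return label[len(prefix):] if label.startswith(prefix) else label


def pair_labels(labels: Sequence[str], probs: Sequence[float]) -> List[Tuple[str, float]]:
    """Pair each category with its probability for ONE text.

    Assumes labels and probs are parallel sequences, which is the assumed
    shape of predict() output for a single string.
    """
    return [(strip_prefix(label), float(prob)) for label, prob in zip(labels, probs)]


def pair_batch(all_labels, all_probs) -> List[List[Tuple[str, float]]]:
    """Apply pair_labels to the list form of predict() output (one entry per text)."""
    return [pair_labels(labels, probs) for labels, probs in zip(all_labels, all_probs)]


if __name__ == "__main__":
    # Hand-written values that mimic the assumed shapes; not real model output.
    single = (("__label__1",), [0.98])
    batch = (
        [["__label__1", "__label__0"], ["__label__0", "__label__1"]],
        [[0.97, 0.03], [0.91, 0.09]],
    )
    print(pair_labels(*single))
    print(pair_batch(*batch))
```

With a loaded model, the same helpers could be applied as pair_labels(*classifier.predict(text)) for a string, or pair_batch(*classifier.predict(texts, k=-1)) for a list.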
3 Data format
Training samples
__label__0 Intelligent scientific development, Artificial intelligence science
__label__1 Novel coronavirus pneumonia
__label__0 Deep learning techniques, Machine learning technology
__label__1 The epidemic situation has been effectively controlled
Test samples
__label__0 China has made some progress in the field of artificial intelligence
__label__1 China has effectively controlled the epidemic
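Lines in the format shown above can be generated from (category, text) pairs and parsed back again. The following is a minimal sketch using only the standard library; the helper names format_sample, parse_sample and write_samples are hypothetical, and the sketch assumes exactly one label at the start of each line.

```python
from typing import Iterable, Tuple

LABEL_PREFIX = "__label__"  # prefix used by the samples above


def format_sample(category: str, text: str, prefix: str = LABEL_PREFIX) -> str:
    """Build one line: prefix + category, a space, then the text on a single line."""
    single_line = " ".join(text.split())  # collapse newlines, tabs and repeated spaces
    return f"{prefix}{category} {single_line}"


def parse_sample(line: str, prefix: str = LABEL_PREFIX) -> Tuple[str, str]:
    """Split one line back into (category, text); assumes one leading label."""
    label, _, text = line.strip().partition(" ")
    if not label.startswith(prefix):
        raise ValueError(f"line does not start with {prefix!r}: {line!r}")
    return label[len(prefix):], text


def write_samples(path: str, samples: Iterable[Tuple[str, str]]) -> None:
    """Write (category, text) pairs to a UTF-8 file, one sample per line."""
    with open(path, "w", encoding="utf-8") as fp:
        for category, text in samples:
            fp.write(format_sample(category, text) + "\n")


if __name__ == "__main__":
    samples = [
        ("0", "Deep learning techniques, Machine learning technology"),
        ("1", "The epidemic situation has been effectively controlled"),
    ]
    for category, text in samples:
        print(format_sample(category, text))
```

Collapsing whitespace in format_sample keeps every sample on one line, which matters because each line of the training and test files is read as a separate sample.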