NLP model BERT: from introduction to mastery (2)
2020-11-06 01:22:00 【Elementary school students in IT field】
Named entity recognition
First, install the corresponding BERT package:
pip install bert-base==0.0.9 -i https://pypi.python.org/simple
You can also follow the installation instructions on the official website.
The package currently supports:
1. Training named entity recognition models
2. A client/server (C/S) service for named entity recognition
3. All BERT services inherited from the excellent open-source project bert_as_service (hanxiao)
4. A text classification service
More features will be added over time.
Training a named entity recognition model from the command line:
After bert-base is installed, two command-line tools become available. One of them, bert-base-ner-train, trains a named entity recognition model; you only need to specify the directory of the training data and the paths to the relevant BERT files. The tool also provides a help option that lists its parameters.
An example training command:
bert-base-ner-train \
    -data_dir {your dataset dir} \
    -output_dir {training output dir} \
    -init_checkpoint {Google BERT model dir} \
    -bert_config_file {bert_config.json under the Google BERT model dir} \
    -vocab_file {vocab.txt under the Google BERT model dir}
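If you would rather launch training from a script, the same command can be assembled in Python. This is only a minimal sketch: the function name build_train_command and the example paths are hypothetical, and it assumes bert_config.json and vocab.txt sit directly inside the Google BERT model directory, as the placeholders above suggest. The init_checkpoint value is passed through unchanged.

```python
import shlex
from pathlib import Path


def build_train_command(data_dir, output_dir, bert_model_dir, init_checkpoint):
    """Return the bert-base-ner-train argument list for the given paths.

    Only the five flags shown in the article are used. The helper itself
    is hypothetical and not part of the bert-base package.
    """
    model_dir = Path(bert_model_dir)
    return [
        "bert-base-ner-train",
        "-data_dir", str(data_dir),
        "-output_dir", str(output_dir),
        "-init_checkpoint", str(init_checkpoint),
        "-bert_config_file", str(model_dir / "bert_config.json"),
        "-vocab_file", str(model_dir / "vocab.txt"),
    ]


if __name__ == "__main__":
    # Example paths are placeholders; replace them with your own.
    cmd = build_train_command("data/ner", "output/ner", "models/bert", "models/bert")
    print(shlex.join(cmd))
    # To actually start training, pass the list to subprocess.run(cmd, check=True).
```

Building the command as a list rather than a single string avoids quoting problems when a path contains spaces.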
Parameter description
data_dir is the directory that contains your data. The training, validation, and test files must be named train.txt, dev.txt, and test.txt respectively; otherwise an error is reported.
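Because a wrongly named file leads to an error, it can be worth checking the directory before training starts. The sketch below is one possible pre-flight check; the function name missing_data_files is hypothetical and the check is not part of bert-base itself.

```python
from pathlib import Path

# File names required by the training tool, as stated in the article.
REQUIRED_FILES = ("train.txt", "dev.txt", "test.txt")


def missing_data_files(data_dir):
    """Return the required file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files("data/ner")  # placeholder path
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All required data files are present.")
```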
The format of the training data is as follows. The example is a Chinese sentence tagged one character per line, in which 厦门 (Xiamen) and 金门 (Kinmen) are labeled as locations:
海 O
钓 O
比 O
赛 O
地 O
点 O
在 O
厦 B-LOC
门 I-LOC
与 O
金 B-LOC
门 I-LOC
之 O
间 O
的 O
海 O
域 O
。 O
On each line, the first item is the token and the second is its label, separated by a single space ' '. Be sure to use a space. Separate sentences with a blank line. The program reads your data automatically.
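The format rules above can be checked with a small reader before you start training. This is a sketch under the rules just described (one token and one label per line, a single space between them, a blank line between sentences); the function name read_ner_sentences and the sample lines are hypothetical, and this is not the loader that bert-base uses internally.

```python
def read_ner_sentences(lines):
    """Parse 'token label' lines into a list of (tokens, labels) pairs.

    A blank line ends the current sentence. A line that does not consist
    of exactly two space-separated fields raises ValueError.
    """
    sentences = []
    tokens, labels = [], []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            # Blank line: close the current sentence, if any.
            if tokens:
                sentences.append((tokens, labels))
                tokens, labels = [], []
            continue
        parts = line.split(" ")
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"line {number}: expected 'token label', got {line!r}")
        tokens.append(parts[0])
        labels.append(parts[1])
    if tokens:
        sentences.append((tokens, labels))
    return sentences


if __name__ == "__main__":
    sample = ["Paris B-LOC\n", "is O\n", "big O\n", "\n", "Hello O\n"]
    for sentence_tokens, sentence_labels in read_ner_sentences(sample):
        print(sentence_tokens, sentence_labels)
```

To validate a real file, pass an open file object, for example read_ner_sentences(open("train.txt", encoding="utf-8")).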
output_dir: the output path for the trained model. Model checkpoints and some label mapping tables are stored here. When you later run the service, this path can be passed as -ner_model_dir.
init_checkpoint: the downloaded Google BERT model
bert_config_file: bert_config.json under the Google BERT model directory
vocab_file: vocab.txt under the Google BERT model directory
After training, you can find the results in the output_dir you specified.
Further usage:
https://blog.csdn.net/macanv/article/details/85684284
Another wrapper around BERT models:
https://www.jianshu.com/p/1d6689851622
https://cloud.tencent.com/developer/article/1470051
https://www.h3399.cn/201908/714454.html
Copyright notice
This article was written by [Elementary school students in IT field]. Please include a link to the original when reposting. Thank you.