NLP model BERT: from introduction to mastery (2)
2020-11-06 01:22:00 【Elementary school students in IT field】
Named entity recognition
First, install the corresponding BERT package:
pip install bert-base==0.0.9 -i https://pypi.python.org/simple
You can also follow the installation instructions on the official site.

What the package currently supports:
1. Training of named entity recognition models
2. A client/server (C/S) service for named entity recognition
3. All BERT services inherited from the excellent open-source project bert_as_service (hanxiao)
4. A text classification service
More features will be added over time.
Training a named entity recognition model from the command line:
After bert-base is installed, two command-line tools are generated. One of them, bert-base-ner-train, trains a named entity recognition model; you only need to specify the directory of the training data and the directory of the BERT parameters. The tool's help output lists the available options.

An example training command is as follows:
bert-base-ner-train \
-data_dir {your dataset dir} \
-output_dir {training output dir} \
-init_checkpoint {Google BERT model dir} \
-bert_config_file {bert_config.json under the Google BERT model dir} \
-vocab_file {vocab.txt under the Google BERT model dir}
Parameter description
data_dir is the directory that holds your data. The training, validation, and test files must be named train.txt, dev.txt, and test.txt respectively; any other naming causes an error.
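Because a wrongly named file only shows up as an error once training starts, it can help to check the directory first. The following is a minimal sketch, not part of bert-base; the helper name `missing_data_files` is hypothetical, and only the three file names come from the text above.

```python
from pathlib import Path

# File names required by the naming rule above.
REQUIRED_FILES = ("train.txt", "dev.txt", "test.txt")


def missing_data_files(data_dir):
    """Return the required file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means all three files are present; otherwise the list names the files that still need to be created or renamed.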
The format of the training data is as follows:
海 O
钓 O
比 O
赛 O
地 O
点 O
在 O
厦 B-LOC
门 I-LOC
与 O
金 B-LOC
门 I-LOC
之 O
间 O
的 O
海 O
域 O
。 O
The first item on each line is the character, and the second is its label. They are separated by a single space ' '; make sure to use a space. Sentences are separated by blank lines. The program reads your data automatically.
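To make the format concrete, here is a minimal sketch of how such a file could be read and how B-/I- labels group into entities. It is an illustration only, not the reader bert-base uses; the function names and the sample string are assumptions, and tokens are assumed to be single Chinese characters that join without a separator.

```python
def parse_ner_text(text):
    """Split 'token label' lines into sentences; a blank line ends a sentence."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        # Exactly one space between token and label; anything else raises ValueError.
        token, label = line.split(" ")
        current.append((token, label))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence):
    """Collect (text, type) pairs from the B-/I- labels of one sentence."""
    entities, chars, etype = [], [], None
    for token, label in sentence:
        if label.startswith("B-"):
            if chars:
                entities.append(("".join(chars), etype))
            chars, etype = [token], label[2:]
        elif label.startswith("I-") and chars and label[2:] == etype:
            chars.append(token)
        else:
            if chars:
                entities.append(("".join(chars), etype))
            chars, etype = [], None
    if chars:
        entities.append(("".join(chars), etype))
    return entities


# Illustrative sample: two sentences separated by a blank line.
sample = "厦 B-LOC\n门 I-LOC\n与 O\n金 B-LOC\n门 I-LOC\n\n海 O\n域 O\n"
sentences = parse_ner_text(sample)
print(len(sentences))                  # 2
print(extract_entities(sentences[0]))  # [('厦门', 'LOC'), ('金门', 'LOC')]
```

A line with a tab or more than one space fails at the `split(" ")` unpacking, which matches the requirement to use a single space.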
output_dir: the output path for the trained model. Model checkpoints and some label mapping tables are stored here. When the model is later served, this path can be specified as -ner_model_dir.
init_checkpoint: the downloaded Google BERT model.
bert_config_file: bert_config.json under the Google BERT model directory.
vocab_file: vocab.txt under the Google BERT model directory.
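If you prefer to launch training from a script, the five parameters described above can be assembled into an argument list. This is a minimal sketch under stated assumptions: the helper `build_train_command` and the example paths are hypothetical, and only the tool name and the flag names come from the command shown earlier.

```python
import shlex


def build_train_command(data_dir, output_dir, init_checkpoint,
                        bert_config_file, vocab_file):
    """Return the bert-base-ner-train argument list for the five flags above."""
    return [
        "bert-base-ner-train",
        "-data_dir", str(data_dir),
        "-output_dir", str(output_dir),
        "-init_checkpoint", str(init_checkpoint),
        "-bert_config_file", str(bert_config_file),
        "-vocab_file", str(vocab_file),
    ]


# Hypothetical paths, for illustration only.
cmd = build_train_command(
    "data/ner",
    "output/ner",
    "bert_model",
    "bert_model/bert_config.json",
    "bert_model/vocab.txt",
)
print(shlex.join(cmd))  # the equivalent single-line shell command
```

Passing the list to `subprocess.run(cmd, check=True)` would start training. A list avoids shell quoting problems when a path contains spaces.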
After training, you can view the results in the output_dir you specified.
For more operations, see:
https://blog.csdn.net/macanv/article/details/85684284
Other wrappers around BERT models:
https://www.jianshu.com/p/1d6689851622
https://cloud.tencent.com/developer/article/1470051
https://www.h3399.cn/201908/714454.html
Copyright notice
This article was written by [Elementary school students in IT field]. Please include a link to the original when reposting. Thanks.