WeNet: an industry-oriented end-to-end (E2E) speech recognition toolkit
2022-07-05 04:29:00 【Wang Xiaoxi WW】
One. Building a speech recognition platform with WeNet

1. Reference material

2. Quickly building the WeNet platform

Refer to the WeNet Chinese documentation.
Download the official pre-trained model, start the Docker service, load the model, and expose a speech recognition service over the WebSocket protocol.
wget https://wenet-1256283475.cos.ap-shanghai.myqcloud.com/models/aishell2/20210618_u2pp_conformer_libtorch.tar.gz
tar -xf 20210618_u2pp_conformer_libtorch.tar.gz
model_dir=$PWD/20210618_u2pp_conformer_libtorch
docker run --rm -it -p 10086:10086 -v $model_dir:/home/wenet/model wenetorg/wenet-mini:latest bash /home/run.sh
Note: here $PWD = "/home/wenet/model". Make sure the pre-trained model files are stored in the right place, i.e. extract the archive under $PWD, and then run model_dir=$PWD/20210618_u2pp_conformer_libtorch to assign the variable; otherwise an error will be reported.
Real time recognition
Open the file using the browser **index.html, stay WebSocket URL
Fill in the ws://127.0.0.1:10086
( If in windows Pass through wsl2** function docker, Then use ws://localhost:10086
) , Allow browser pop-up requests to use microphones , Real time speech recognition through microphone .
Use here wsl2 Under the docker demonstrate : If you are close to the microphone , The false detection rate is relatively low .
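If you want to exercise the service without the browser page, a scripted client can stream audio to the same port. The sketch below is only an illustration: the JSON "start"/"end" signals and the raw 16 kHz 16-bit mono PCM framing are assumptions about the runtime's WebSocket message schema (based on what the web demo appears to send), so check the WeNet runtime source before relying on it.

    # Minimal sketch of a scripted client for the WebSocket service started above.
    # Assumptions (verify against the WeNet runtime source): the server accepts a JSON
    # "start" signal, then binary 16 kHz 16-bit mono PCM chunks, then an "end" signal,
    # and replies with JSON results.
    import asyncio
    import json
    import wave

    import websockets  # pip install websockets

    WS_URL = "ws://127.0.0.1:10086"
    WAV_PATH = "test.wav"  # hypothetical 16 kHz, 16-bit, mono recording

    async def recognize(wav_path: str) -> None:
        async with websockets.connect(WS_URL) as ws:
            # Assumed start signal; field names follow the wenet runtime web demo.
            await ws.send(json.dumps({"signal": "start", "nbest": 1,
                                      "continuous_decoding": True}))
            with wave.open(wav_path, "rb") as wav:
                chunk_frames = 1600  # 0.1 s of audio at 16 kHz
                while True:
                    data = wav.readframes(chunk_frames)
                    if not data:
                        break
                    await ws.send(data)  # raw PCM bytes as a binary frame
            await ws.send(json.dumps({"signal": "end"}))
            # Print whatever the server sends back until it closes the connection.
            try:
                while True:
                    print(await asyncio.wait_for(ws.recv(), timeout=5))
            except (asyncio.TimeoutError, websockets.ConnectionClosed):
                pass

    asyncio.run(recognize(WAV_PATH))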
Two. Inference with WeNet (onnx CPU inference cannot be used for now)
Note:

If you only use wenet/bin/recognize.py, i.e. inference with the libTorch model, the environment can be built on Windows; see the WeNet official site for the detailed setup steps.

If you want to run inference with wenet/bin/recognize_onnx.py, you first need to download ctc_encoder. Note that the ctc_encoder on PyPI is only the 2020 release (WeNet 1.0) and does not match the current WeNet 3.0, so it has to be downloaded and compiled from https://github.com/Slyne/ctc_decoder. Compiling swig_encoder requires bash commands, so it is best done on a Linux system; here WSL + Ubuntu is used as the solution.

Actually, installing Git on Windows does let you run bash commands, but in this case, even after installing wget.exe and swig.exe and git-cloning the corresponding packages (kenlm, ThreadPool), the downloaded openfst-1.6.3 could not be compiled successfully, even though its .h files are complete under VC.
1. Building the WeNet environment

Since the goal is to try inference with the onnx model, WSL + Ubuntu is used here as the solution.

For WSL + Docker Desktop usage, refer to "WSL Ubuntu + Docker Desktop build python environment".

After WSL and Docker Desktop are installed, the WeNet environment is configured as follows:
Instantiate the anaconda container:
docker run -it --name="anaconda" -p 8888:8888 continuumio/anaconda3 /bin/bash
If you have exited, the anaconda container can be restarted:
# restart the container
docker start anaconda
docker exec -it anaconda /bin/bash
Configure the wenet environment in the base environment (do not create a virtual environment, so that the container is easier to package into an image later for PyCharm to use).
Copy the wenet project from WSL into the docker container (suppose the wenet project is under /home/usr in WSL):
docker cp /home/usr/wenet/requirements.txt 9cf7b3c196f3:/home/   # 9cf7b3c196f3 is the anaconda container id
Enter the anaconda container and install all the packages with pip under /home/ (for switching the conda source, refer to "ubuntu replace conda source"):
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
conda install pytorch=1.10.0 torchvision torchaudio=0.10.0 cudatoolkit=11.1 -c pytorch -c conda-forge
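A quick sanity check inside the container confirms that the installed versions match the commands above (nothing WeNet-specific here):

    # Verify the interpreter sees the packages installed above.
    import torch
    import torchaudio

    print(torch.__version__)          # expected: 1.10.0
    print(torchaudio.__version__)     # expected: 0.10.0
    print(torch.cuda.is_available())  # likely False unless the container was started with GPU support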
Download the ctc_encoder project (so that the conformer can use the beam_search method for speech recognition).

The ctc_encoder repository is at https://github.com/Slyne/ctc_decoder.

Because git clone may not work well under Ubuntu, fetch the dependencies on Windows instead by running the commands in swig/setup.sh:

#!/usr/bin/env bash

if [ ! -d kenlm ]; then
  git clone https://github.com/kpu/kenlm.git
  echo -e "\n"
fi

if [ ! -d openfst-1.6.3 ]; then
  echo "Download and extract openfst ..."
  wget http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.3.tar.gz --no-check-certificate
  tar -xzvf openfst-1.6.3.tar.gz
  echo -e "\n"
fi

if [ ! -d ThreadPool ]; then
  git clone https://github.com/progschj/ThreadPool.git
  echo -e "\n"
fi

echo "Install decoders ..."
# python3 setup.py install --num_processes 10
python3 setup.py install --user --num_processes 10
After the necessary packages have been fetched (run the setup.sh commands in git bash; for wget and swig, just install the .exe directly), the overall file structure looks as follows (there are four more files). Then copy the complete ctc_encoder into the anaconda container and compile it there.
Compile ctc_encoder:

Assume the ctc_encoder project is now inside the anaconda container under the /home directory. Enter the swig folder and run bash setup.sh to compile (you need to apt install gcc and apt install g++ first).
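After the compilation finishes, a one-line import is enough to confirm the extension is visible to the interpreter. The module name swig_decoders is an assumption taken from what recognize_onnx.py imports; adjust it if your build installs a different name.

    # Check that the compiled ctc decoder extension is importable.
    # Assumption: the package is installed as `swig_decoders` (the name used by recognize_onnx.py).
    import swig_decoders
    print(swig_decoders.__file__)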
Configure the onnx and onnxruntime environment:
pip install onnx==1.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install onnxruntime==1.9.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
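After installing, it is worth checking which execution providers onnxruntime actually offers; with the CPU-only wheel above, only CPUExecutionProvider should be listed (this matters again in section 5):

    # Confirm the onnx / onnxruntime versions and the available execution providers.
    import onnx
    import onnxruntime

    print(onnx.__version__)                       # expected: 1.9.0
    print(onnxruntime.__version__)                # expected: 1.9.0
    print(onnxruntime.get_available_providers())  # ['CPUExecutionProvider'] for the CPU-only wheel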
Package the running Docker container as an image

Package the runtime environment of the anaconda container into an image so that PyCharm Professional can use it; refer to "Pycharm uses the docker container environment for development".

# OPTIONS:
#   -a : author of the committed image
#   -c : apply Dockerfile instructions when creating the image
#   -m : commit message
#   -p : pause the container during commit
# 2b1ad7022d19 is the id of the running anaconda container
docker commit -a "wangxiaoxi" -m "wenet_env" 2b1ad7022d19 wenet_env:v1
2. Model training

Refer to the Tutorial on AIShell.
3. Model inference based on libTorch

Download the aishell2 sample dataset for wenet model inference; the official site is AISHELL (希尔贝壳).

Download the WeNet pre-trained model (the Checkpoint Model - Conformer).

Put the test dataset and the pre-trained model under the project path, for example:
Modify the location of cmvn_file in train.yaml (if you use the python environment inside the docker container, a relative path is recommended):
cmvn_file: ../../test/aishell2/global_cmvn # Use relative path here
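If you would rather not edit the file by hand, the same change can be made with a few lines of PyYAML. This is only a sketch: it assumes cmvn_file is a top-level key (as in the snippet above), and re-dumping the file drops comments and reorders keys.

    # Patch the cmvn_file entry in train.yaml programmatically.
    import yaml  # PyYAML, which wenet itself uses for its configs

    cfg_path = "../../test/aishell2/train.yaml"
    with open(cfg_path, "r", encoding="utf-8") as f:
        cfg = yaml.safe_load(f)

    cfg["cmvn_file"] = "../../test/aishell2/global_cmvn"  # relative path, as recommended above

    with open(cfg_path, "w", encoding="utf-8") as f:
        yaml.safe_dump(cfg, f, default_flow_style=False, allow_unicode=True)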
Convert the aishell2 dataset into the wenet data format:
{"key": "D4_753", "wav": "../../test/aishell2/test_data/D4_750.wav", "txt": ""}
{"key": "D4_754", "wav": "../../test/aishell2/test_data/D4_751.wav", "txt": ""}
{"key": "D4_755", "wav": "../../test/aishell2/test_data/D4_752.wav", "txt": ""}
{"key": "D4_753", "wav": "../../test/aishell2/test_data/D4_753.wav", "txt": ""}
{"key": "D4_754", "wav": "../../test/aishell2/test_data/D4_754.wav", "txt": ""}
{"key": "D4_755", "wav": "../../test/aishell2/test_data/D4_755.wav", "txt": ""}
{"key": "D4_756", "wav": "../../test/aishell2/test_data/D4_756.wav", "txt": ""}
Use wenet/bin/recognize.py and run the following command:

python recognize.py \
--config=../../test/aishell2/train.yaml \
--dict=../../test/aishell2/units.txt \
--checkpoint=../../test/aishell2/final.pt \
--result_file=../../test/aishell2/att_res_result.txt \
--test_data=../../test/aishell2/test_data/data.list
The output is as follows:
Namespace(batch_size=16, beam_size=10, bpe_model=None, checkpoint='../../test/aishell2/final.pt', config='../../test/aishell2/train.yaml', connect_symbol='', ctc_weight=0.0, data_type='raw', decoding_chunk_size=-1, dict='../../test/aishell2/units.txt', gpu=-1, mode='attention', non_lang_syms=None, num_decoding_left_chunks=-1, override_config=[], penalty=0.0, result_file='../../test/aishell2/att_res_result.txt', reverse_weight=0.0, simulate_streaming=False, test_data='../../test/aishell2/test_data/data1.list')
2022-07-04 15:54:22,441 INFO Checkpoint: loading from checkpoint ../../test/aishell2/final.pt for CPU
F:\ASR\wenet\wenet\transformer\asr_model.py:266: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
best_hyps_index = best_k_index // beam_size
2022-07-04 15:54:27,189 INFO D4_753 Minning Marketing Service Department of people's insurance group of China
2022-07-04 15:54:27,189 INFO D4_755 Chinatelecom minning town cooperation business office
2022-07-04 15:54:27,189 INFO D4_754 Minning Town Health Center
2022-07-04 15:54:27,189 INFO D4_756 Minning town passenger station
2022-07-04 15:54:27,189 INFO D4_753 Episode 61
2022-07-04 15:54:27,189 INFO D4_755 Episode 63
2022-07-04 15:54:27,189 INFO D4_754 Episode 62
4、WeNet export onnx Model
Reference resources ONNX backend on WeNet
Download here first WeNet Pre training model of ( download Checkpoint model - conformer), Then use wenet/bin/export_onnx_cpu.py
, Set the following parameters , Can be libtorch Of pt File conversion to onnx file
python export_onnx_cpu.py \
--config F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/train.yaml \
--checkpoint F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/final.pt \
--chunk_size 16 \
--output_dir F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/onnx_dir \
--num_decoding_left_chunks -1
If the onnx export succeeds, the following 3 files will be generated in the output folder: encoder.onnx, ctc.onnx, decoder.onnx.
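A quick way to confirm the three files are well-formed is to run them through the onnx checker (the directory is the --output_dir used above):

    # Validate the exported onnx graphs.
    import os
    import onnx

    onnx_dir = "F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/onnx_dir"
    for name in ("encoder.onnx", "ctc.onnx", "decoder.onnx"):
        model = onnx.load(os.path.join(onnx_dir, name))
        onnx.checker.check_model(model)  # raises if the graph is malformed
        print(name, "ok")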
5. Inference with recognize_onnx (unresolved)

Refer to https://github.com/wenet-e2e/wenet/pull/761.
Download the weight file of the conformer model (checkpoint model) from https://wenet.org.cn/wenet/pretrained_models.html.

After decompressing the weight file, the folder layout is as follows:
Modify the location of cmvn_file in train.yaml:
#cmvn_file: F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/global_cmvn
cmvn_file: ../../test/aishell2/global_cmvn # Use relative path here
Convert to the wenet json data format: suppose there is an audio file D4_750.wav; it is converted into the following json format (refer to https://wenet.org.cn/wenet/tutorial_librispeech.html?highlight=test_data#stage-0-prepare-training-data):

{"key": "D4_753", "wav": "../../test/aishell2/test_data/D4_750.wav", "txt": "The purchase restriction that has the greatest inhibitory effect on real-estate market transactions"}
Then run:

python3 wenet/bin/recognize_onnx.py \
--config=20210618_u2pp_conformer_exp/train.yaml \
--test_data=raw_wav/test/data.list \
--gpu=0 \
--dict=20210618_u2pp_conformer_exp/words.txt \
--mode=attention_rescoring \
--reverse_weight=0.4 \
--ctc_weight=0.1 \
--result_file=./att_res_result.txt \
--encoder_onnx=onnx_model/encoder.onnx \
--decoder_onnx=onnx_model/decoder.onnx
Note: it is best to use relative paths here, because the python environment inside docker is being used; a Windows absolute path causes the following error. For possible solutions see https://github.com/microsoft/onnxruntime/issues/8735 (I could not resolve it that way anyway).

{FileNotFoundError}[Errno 2] No such file or directory: 'F:/ASR/model/20210618_u2pp_conformer_libtorch_aishell2/train.yaml'
Using the onnx model exported by export_onnx_cpu for inference with recognize_onnx:

import numpy as np
import onnxruntime

# encoder_outpath is the path to encoder.onnx; feats / feats_lengths are the fbank
# features and their lengths produced by the data pipeline in recognize_onnx.py
encoder_ort_session = onnxruntime.InferenceSession(encoder_outpath, providers=['CPUExecutionProvider'])
ort_inputs = {
    encoder_ort_session.get_inputs()[0].name: feats.astype('float32'),
    encoder_ort_session.get_inputs()[1].name: feats_lengths.astype('int64'),
    encoder_ort_session.get_inputs()[2].name: np.zeros((12, 4, 0, 128)).astype('float32'),
    encoder_ort_session.get_inputs()[3].name: np.zeros((12, 1, 256, 7)).astype('float32')
}
encoder_ort_session.run(None, ort_inputs)
This call throws an error:

{Fail}[ONNXRuntimeError] : 1 : FAIL : Non-zero status code returned while running Slice node. Name:'Slice_49' Status Message: slice.cc:153 FillVectorsFromInput Starts must be a 1-D array

It is presumably caused by a mismatch between the cuda and onnxruntime versions; see "OnnxRunTime encounter FAIL : Non-zero status code returned while running BatchNormalization node".
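Before digging into version mismatches, it also helps to print what the CPU-exported encoder actually expects, and to compare those names and shapes against the ort_inputs built above (a small diagnostic, reusing the session created earlier):

    # Print the input/output signature of the exported encoder for comparison with ort_inputs.
    for inp in encoder_ort_session.get_inputs():
        print("input :", inp.name, inp.shape, inp.type)
    for out in encoder_ort_session.get_outputs():
        print("output:", out.name, out.shape, out.type)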
It turns out that recognize_onnx performs inference on the model exported by export_onnx_gpu.py, not by export_onnx_cpu.py. To use export_onnx_gpu.py you still have to install nvidia_docker and onnxruntime_gpu, otherwise an error will be reported:
/opt/conda/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:53: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'CPUExecutionProvider'
warnings.warn("Specified provider '{}' is not in available provider names."
Traceback (most recent call last):
File "/opt/project/wenet/bin/export_onnx_gpu.py", line 574, in <module>
onnx_config = export_enc_func(model, configs, args, logger, encoder_onnx_path)
File "/opt/project/wenet/bin/export_onnx_gpu.py", line 334, in export_offline_encoder
test(to_numpy([o0, o1, o2, o3, o4]), ort_outs)
NameError: name 'test' is not defined
I will not spend more effort on this here; I will wait until the wenet project matures.