Speech recognition (ASR) paper selection: TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

2022-07-06 19:49:00 My name is Yongqiang

Statement: I regularly read papers, take notes, and share them. The notes inevitably contain mistakes, so please bear with me. I have also collected some material for easy lookup and study: http://yqli.tech/page/speech.html. For a list of papers in the field of speech synthesis, see http://yqli.tech/page/tts_paper.html; for statistics on papers in the field of speech recognition, see http://yqli.tech/page/asr_paper.html. For how to find speech resources, see the article https://mp.weixin.qq.com/s/eJcpsfs3OuhrccJ7_BvKOg. If you repost this, please credit the source. You are welcome to follow the WeChat official account: Keep a low profile.

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

This paper was updated by TAL on 2022.06.27. It mainly open-sources the largest Mandarin-English mixed training corpus, as a contribution to code-switching research in speech recognition.


(Statistics on open-source datasets can be found at http://yqli.tech/page/data.html)

Because the main contribution of this paper is open-sourcing the world's largest Mandarin-English mixed dataset, we skip the background and go straight to the data. The dataset consists of audio from TAL English classes and includes mixed Mandarin and English speech. Each audio file has only one speaker, and the dataset has more than 100 speakers (file size 63.36G). The data includes both intra-sentence and inter-sentence mixing; Figure 1 shows examples. The ratio of Chinese characters to English words in the data is 13:1, and the top 20 are shown in Figure 2. Table 1 shows how the corpus is split into training and test sets, and Table 2 shows the results of experiments with this dataset on ESPnet and WeNet.
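
The 13:1 ratio of Chinese characters to English words can be made concrete with a small counting script. The following is a minimal Python sketch, not the authors' method: the regular expressions, the function names (count_tokens, is_code_switched, han_to_english_ratio), and the sample sentences are all assumptions made for illustration, and the paper may count tokens differently.

```python
import re

# Assumption: the basic CJK Unified Ideographs block covers the Mandarin text.
HAN = re.compile(r"[\u4e00-\u9fff]")
# Assumption: an English word is a run of ASCII letters, allowing
# internal apostrophes such as "don't".
ENGLISH = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def count_tokens(text):
    """Return (number of Chinese characters, number of English words)."""
    return len(HAN.findall(text)), len(ENGLISH.findall(text))


def is_code_switched(text):
    """True when one transcript line contains both languages."""
    han, eng = count_tokens(text)
    return han > 0 and eng > 0


def han_to_english_ratio(lines):
    """Chinese characters per English word over a collection of lines."""
    han_total = 0
    eng_total = 0
    for line in lines:
        han, eng = count_tokens(line)
        han_total += han
        eng_total += eng
    return han_total / eng_total if eng_total else float("inf")


# Made-up sample transcripts, NOT taken from TALCS.
samples = [
    "我们今天学习 apple 这个单词",  # both languages in one sentence
    "very good",                      # English only
    "请大家跟我读",                    # Mandarin only
]

if __name__ == "__main__":
    for line in samples:
        print(line, count_tokens(line), is_code_switched(line))
    print("ratio:", han_to_english_ratio(samples))
```

Running the same functions over the real .txt transcriptions would give a corpus-level figure to compare against the reported 13:1, provided the tokenization assumptions above match the ones used in the paper.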

Data scale: 587 hours of audio
Sampling rate: 16 kHz
Bit depth: 16 bit
Recording device: ordinary microphone
Speakers: 200+
Recording time: 2019
Data format: audio: .wav; annotations: .txt
Audio length: 1~60 s
Data type: audio of English teachers
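
The audio properties listed above (16 kHz, 16 bit, 1~60 s) are easy to check on a downloaded file. The following is a hedged sketch using only Python's standard wave module. The function names check_wav and make_silence and the inclusive duration bounds are assumptions for illustration; nothing here comes from the TALCS release itself.

```python
import io
import wave


def check_wav(source, rate=16000, sample_bytes=2, min_s=1.0, max_s=60.0):
    """Compare a WAV file with the listed specs.

    `source` is a file path or a binary file-like object.
    Returns a list of problem descriptions; an empty list means a match.
    16 bit audio corresponds to 2 bytes per sample.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(f"sampling rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != sample_bytes:
            problems.append(f"sample width is {wav.getsampwidth()} bytes")
        duration = wav.getnframes() / wav.getframerate()
        # Assumption: the 1~60 s range is inclusive at both ends.
        if not (min_s <= duration <= max_s):
            problems.append(f"duration is {duration:.2f} s")
    return problems


def make_silence(rate, seconds, sample_bytes=2):
    """Build an in-memory mono WAV of silence, used only for the demo."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(sample_bytes)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (int(rate * seconds) * sample_bytes))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(check_wav(make_silence(16000, 2)))    # matches the specs
    print(check_wav(make_silence(8000, 2)))     # wrong sampling rate
    print(check_wav(make_silence(16000, 0.5)))  # too short
```

For real data, a file path can be passed to check_wav in place of the in-memory buffer.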

Copyright notice
This article was written by [My name is Yongqiang]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207061148030573.html