Speech recognition (ASR) paper selection: TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

2022-07-06 19:49:00 【My name is Yongqiang】

Statement: I regularly read papers, take notes, and share them. Mistakes are inevitable in these posts, so I ask for your understanding. I have also collected some resources for easy reference and study: http://yqli.tech/page/speech.html. For a list of papers in the field of speech synthesis, see http://yqli.tech/page/tts_paper.html; for statistics on papers in the field of speech recognition, see http://yqli.tech/page/asr_paper.html. For how to find speech resources, see the article https://mp.weixin.qq.com/s/eJcpsfs3OuhrccJ7_BvKOg. If you repost this, please cite the source. You are welcome to follow the WeChat official account: Keep a low profile.

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

This paper was updated by TAL on 2022.06.27. Its main contribution is open-sourcing the largest Mandarin-English mixed training corpus, to support research on code-switching in speech recognition.

(Statistics on open-source datasets can be found at http://yqli.tech/page/data.html)

Because the paper's main contribution is open-sourcing the world's largest Mandarin-English mixed dataset, we skip the background and go straight to the data. The dataset consists of audio from TAL English classes and includes mixed Mandarin and English speech. Each audio clip has a single speaker, and the dataset has more than 100 speakers (file size: 63.36 GB). Figure 1 shows examples of intra-sentence and inter-sentence code-switching in the data. The ratio of Chinese characters to English words in the data is 13:1, and the top 20 are shown in Figure 2. Table 1 shows how the corpus is split into training and test sets, and Table 2 shows the results of experiments with this dataset on ESPnet and WeNet.
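As a rough illustration of how a character-to-word ratio like the 13:1 figure could be recomputed from transcripts, here is a minimal Python sketch. It is not the paper's counting script. The definition of a "Chinese character" (the basic CJK Unified Ideographs block) and of an "English word" (a run of ASCII letters, allowing internal apostrophes) are assumptions, as are the function names.

```python
import re

# Assumption: a "Chinese character" is any code point in the basic
# CJK Unified Ideographs block (U+4E00 to U+9FFF).
HAN_RE = re.compile(r"[\u4e00-\u9fff]")
# Assumption: an "English word" is a run of ASCII letters, optionally
# with internal apostrophes (so "don't" counts as one word).
EN_WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def count_units(text):
    """Return (number of Chinese characters, number of English words)."""
    return len(HAN_RE.findall(text)), len(EN_WORD_RE.findall(text))


def han_to_english_ratio(lines):
    """Ratio of Chinese characters to English words over all lines.

    Returns None when the lines contain no English words.
    """
    han_total = 0
    en_total = 0
    for line in lines:
        han, en = count_units(line)
        han_total += han
        en_total += en
    if en_total == 0:
        return None
    return han_total / en_total


# Made-up example transcripts, not taken from the corpus.
sample = ["我们今天学习 apple 这个单词", "don't worry 没关系"]
print(han_to_english_ratio(sample))
```

The same `count_units` helper can also flag whether an utterance is monolingual or mixed, by checking which of the two counts is zero.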

Data size	587 hours of audio
Sampling rate	16 kHz
Bit depth	16 bit
Recording device	Ordinary microphone
Speakers	200+
Recording time	2019
Data format	Audio: .wav; transcripts: .txt
Audio length	1-60 s
Data type	Audio of English teachers
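After downloading, it can be useful to check that a file matches the specification listed above (16 kHz, 16 bit, 1 to 60 s). The following is a minimal sketch using only Python's standard `wave` module. The function names `check_wav` and `silence_wav` are hypothetical, and `silence_wav` exists only so the example runs without the dataset. The channel count is not checked, because the table does not state it.

```python
import io
import wave


def check_wav(source, rate=16000, sample_width=2, min_s=1.0, max_s=60.0):
    """Return a list of mismatches between a WAV file and the expected spec.

    `source` may be a file path or a binary file-like object.
    `sample_width` is in bytes, so 2 means 16-bit samples.
    """
    problems = []
    with wave.open(source, "rb") as wf:
        if wf.getframerate() != rate:
            problems.append(f"sample rate is {wf.getframerate()} Hz, expected {rate} Hz")
        if wf.getsampwidth() != sample_width:
            problems.append(f"sample width is {wf.getsampwidth()} bytes, expected {sample_width}")
        duration = wf.getnframes() / wf.getframerate()
        if not (min_s <= duration <= max_s):
            problems.append(f"duration is {duration:.2f} s, expected {min_s} to {max_s} s")
    return problems


def silence_wav(seconds, rate=16000, sample_width=2):
    """Demo helper: build an in-memory mono WAV of zero-valued samples."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(1)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(b"\x00" * int(seconds * rate) * sample_width)
    return io.BytesIO(buf.getvalue())


print(check_wav(silence_wav(2.0)))             # matches the spec
print(check_wav(silence_wav(0.5, rate=8000)))  # wrong rate and too short
```

For real files, pass the path of a `.wav` file from the corpus instead of the in-memory buffer.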