About the VCTK dataset
2022-07-01 01:19:00 【Wsyoneself】
- Download the VCTK dataset (download link: https://datashare.ed.ac.uk/download/DS_10283_3443.zip)
- Understanding the VCTK dataset:
  - The CSTR VCTK corpus contains speech data from 110 English speakers with various accents. Each speaker reads about 400 sentences, selected from a newspaper, the Rainbow Passage, and an elicitation paragraph used for the Speech Accent Archive.
  - The newspaper texts were selected with a greedy algorithm that increases contextual and phonetic coverage.
  - All speech data were recorded with the same setup: an omnidirectional microphone (DPA 4035) and a small-diaphragm condenser microphone with very wide bandwidth (Sennheiser MKH 800), at a 96 kHz sampling frequency and 24 bits, in a hemi-anechoic chamber at the University of Edinburgh.
  - All recordings were converted to 16 bits and downsampled to 48 kHz.
  - The corpus was originally intended for HMM-based text-to-speech synthesis systems, especially speaker-adaptive HMM-based speech synthesis, which uses an average voice model trained on multiple speakers together with speaker adaptation techniques. The corpus is also suitable for DNN-based multi-speaker text-to-speech synthesis systems and waveform modeling. **This idea is similar to using PCA to extract facial features and an average face to synthesize a given face.**
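The greedy text selection mentioned above can be illustrated with a minimal sketch. This is not the procedure actually used to build VCTK, only a generic greedy coverage selection: at each step it picks the candidate sentence that adds the most units not yet covered. The function name `greedy_select`, the `budget` parameter, and the toy unit labels are all hypothetical; in a real setting the units would be phonetic contexts rather than single letters.

```python
def greedy_select(candidates, budget):
    """Greedy coverage selection (illustrative sketch, not the VCTK tooling).

    candidates: dict mapping a sentence id to the set of units it contains.
    budget: maximum number of sentences to pick.
    Returns the chosen ids in pick order and the set of units they cover.
    """
    covered = set()
    chosen = []
    remaining = dict(candidates)
    while remaining and len(chosen) < budget:
        # Pick the sentence that contributes the most uncovered units.
        best = max(remaining, key=lambda sid: len(remaining[sid] - covered))
        gain = remaining[best] - covered
        if not gain:
            break  # nothing left adds new coverage
        chosen.append(best)
        covered |= gain
        del remaining[best]
    return chosen, covered


# Toy data: letters stand in for phonetic units (hypothetical).
toy = {
    "s1": {"a", "b", "c"},
    "s2": {"c", "d"},
    "s3": {"a", "b"},
    "s4": {"d", "e", "f", "g"},
}
chosen, covered = greedy_select(toy, budget=2)
print(chosen)           # ['s4', 's1']
print(sorted(covered))  # ['a', 'b', 'c', 'd', 'e', 'f', 'g']
```

With a budget of two sentences, the sketch covers all seven toy units, while `s2` and `s3` would add nothing new and are skipped even if the budget were larger.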
- Several corpora are derived from VCTK:
  - Speech enhancement:
    - Noisy speech database for training speech enhancement algorithms and TTS models, in which various types of noise were artificially added to VCTK audio: http://dx.doi.org/10.7488/ds/2117
    - Reverberant speech database for training speech dereverberation algorithms and TTS models, in which various types of reverberation were artificially added to VCTK: http://dx.doi.org/10.7488/ds/1425
    - Noisy reverberant speech database for training speech enhancement algorithms and TTS models: http://dx.doi.org/10.7488/ds/2139
    - Device Recorded VCTK, in which speech signals from the VCTK corpus were played back and re-recorded in office environments using relatively inexpensive consumer devices: http://dx.doi.org/10.7488/ds/2316
    - Microsoft Scalable Noisy Speech Dataset (MS-SNSD): https://github.com/microsoft/MS-SNSD
  - ASV and anti-spoofing:
    - Spoofing and Anti-Spoofing (SAS) corpus, a collection of synthetic speech signals produced by nine techniques, two of which are speech synthesis and seven of which are voice conversion. All of them were built using the VCTK corpus: http://dx.doi.org/10.7488/ds/252
    - Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) database. It consists of synthetic speech signals generated with ten techniques and was used in the first Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015): http://dx.doi.org/10.7488/ds/298
    - ASVspoof 2019: the database used for the third Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2019): https://doi.org/10.7488/ds/2555
- When using the corpus, cite it as follows:
Christophe Veaux, Junichi Yamagishi, Kirsten MacDonald, "CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit", The Centre for Speech Technology Research (CSTR), University of Edinbur
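
Once the archive from the download link above is unpacked, the recordings from the two microphones usually need to be grouped by speaker. The sketch below assumes a file-name pattern of the form `<speaker>_<utterance>_<mic>.flac` (for example `p225_001_mic1.flac`); this pattern is an assumption about the archive layout and should be checked against the files you actually have. The helper names `parse_name` and `group_by_speaker` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <speaker>_<utterance>_<mic>.flac, e.g. p225_001_mic1.flac.
# Verify this against the unpacked archive before relying on it.
NAME_RE = re.compile(r"^(?P<speaker>[a-z]\d+)_(?P<utt>\d+)_(?P<mic>mic\d)\.flac$")


def parse_name(filename):
    """Return (speaker, utterance, mic) or None if the name does not match."""
    match = NAME_RE.match(filename)
    if match is None:
        return None
    return match.group("speaker"), match.group("utt"), match.group("mic")


def group_by_speaker(filenames):
    """Map each speaker id to a list of (utterance, mic) pairs, skipping other files."""
    groups = defaultdict(list)
    for name in filenames:
        parsed = parse_name(name)
        if parsed is not None:
            speaker, utt, mic = parsed
            groups[speaker].append((utt, mic))
    return dict(groups)


files = ["p225_001_mic1.flac", "p225_001_mic2.flac", "p226_002_mic1.flac", "README.txt"]
print(group_by_speaker(files))
```

Files that do not match the assumed pattern, such as documentation or licence files, are ignored rather than raising an error.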