About the VCTK dataset
2022-07-01 01:19:00 【Wsyoneself】
- Download the VCTK dataset (download link: https://datashare.ed.ac.uk/download/DS_10283_3443.zip)
- Understanding the VCTK dataset:
- The CSTR VCTK corpus contains speech data from 110 English speakers with various accents. Each speaker reads about 400 sentences, taken from a newspaper, the Rainbow Passage, and an elicitation paragraph used for the Speech Accent Archive.
- The text was selected with a greedy algorithm that increases contextual and phonetic coverage.
- All speech data was recorded with the same setup: an omnidirectional microphone (DPA 4035) and a small-diaphragm condenser microphone with very wide bandwidth (Sennheiser MKH 800), at a 96 kHz sampling rate and 24 bits, in a hemi-anechoic chamber at the University of Edinburgh.
- All recordings were converted to 16 bits and downsampled to 48 kHz.
- The corpus was originally intended for HMM-based text-to-speech synthesis systems, especially speaker-adaptive HMM-based speech synthesis, which uses average voice models trained on multiple speakers together with speaker adaptation techniques. It is also suitable for DNN-based multi-speaker text-to-speech synthesis systems and waveform modeling. **The idea is similar to the PCA approach of extracting facial features and using an average face to synthesize a given face.**
- There are several variants of the VCTK corpus:
- Speech enhancement:
- Noisy speech database for training speech enhancement algorithms and TTS models, in which various types of noise were artificially added to VCTK: http://dx.doi.org/10.7488/ds/2117
- Reverberant speech database for training speech dereverberation algorithms and TTS models, in which various types of reverberation were artificially added to VCTK: http://dx.doi.org/10.7488/ds/1425
- Noisy reverberant speech database for training speech enhancement algorithms and TTS models: http://dx.doi.org/10.7488/ds/2139
- Device Recorded VCTK, in which the speech signals of the VCTK corpus were played back and re-recorded in office environments using relatively inexpensive consumer devices: http://dx.doi.org/10.7488/ds/2316
- The Microsoft Scalable Noisy Speech Dataset (MS-SNSD): https://github.com/microsoft/MS-SNSD
- ASV and anti-spoofing:
- Spoofing and Anti-Spoofing (SAS) corpus, a collection of synthetic speech signals produced by nine techniques, two of which are speech synthesis and seven of which are voice conversion. All of them were built using the VCTK corpus: http://dx.doi.org/10.7488/ds/252
- Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015) database. It consists of synthetic speech signals produced by ten techniques and was used in the first Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2015): http://dx.doi.org/10.7488/ds/298
- ASVspoof 2019: the database of the third Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof 2019): https://doi.org/10.7488/ds/2555
- If you use the corpus, cite it as follows:
Christophe Veaux, Junichi Yamagishi, Kirsten MacDonald, "CSTR VCTK Corpus: English Multi-speaker Corpus for CSTR Voice Cloning Toolkit", The Centre for Speech Technology Research (CSTR), University of Edinbur
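The download step listed at the top can be scripted. The following is a minimal sketch using only the Python standard library; the URL is the one given in this post, while the function names `download` and `extract` and the local paths in the usage note are hypothetical choices made for illustration.

```python
import shutil
import urllib.request
import zipfile
from pathlib import Path

# Download link quoted in the post above.
VCTK_URL = "https://datashare.ed.ac.uk/download/DS_10283_3443.zip"


def download(url: str, dest: Path) -> Path:
    """Stream `url` to `dest`; skip the transfer if `dest` already exists."""
    if dest.exists():
        return dest
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary ".part" file so an interrupted download
    # is never mistaken for a complete archive.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(dest)
    return dest


def extract(archive: Path, target_dir: Path) -> list[str]:
    """Unpack a zip archive into `target_dir` and return its member names."""
    target_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target_dir)
        return zf.namelist()
```

A typical call would be `extract(download(VCTK_URL, Path("data/DS_10283_3443.zip")), Path("data/vctk"))`. The archive is large, so it is worth checking free disk space first.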
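To illustrate the greedy text selection mentioned in the dataset description, here is a rough sketch of coverage-driven selection. It is not the procedure actually used to build VCTK: character bigrams stand in for phonetic and contextual units purely as an assumption, and `units` and `greedy_select` are hypothetical names.

```python
def units(sentence: str) -> set[str]:
    """Character bigrams, used here as a crude stand-in for phonetic units."""
    text = "".join(ch for ch in sentence.lower() if ch.isalpha() or ch == " ")
    return {text[i:i + 2] for i in range(len(text) - 1)}


def greedy_select(sentences: list[str], k: int) -> list[str]:
    """Pick up to k sentences, each time taking the one that adds the most new units."""
    covered: set[str] = set()
    remaining = list(sentences)
    chosen: list[str] = []
    while remaining and len(chosen) < k:
        # The candidate that contributes the most units not yet covered.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        if not units(best) - covered:
            break  # no remaining sentence improves coverage
        chosen.append(best)
        covered |= units(best)
        remaining.remove(best)
    return chosen
```

Each round picks the sentence with the largest marginal gain, so sentences that repeat already covered units are passed over in favor of ones that contribute new material.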