Speech Synthesis Model Cheat Sheet (1)
2022-08-03 01:32:00 【Andy Dennis】
Foreword
Speech is an increasingly popular field. Given a piece of text, we want it read aloud, which requires speech synthesis, also known as Text-to-Speech (TTS). Below are some interesting models I have come across.
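One-stage speech synthesis is generally called end-to-end.
Two-stage speech synthesis splits the work into two steps. Stage 1 is usually:
Text -(FFT)-> Spectrogram -(filtering)-> mel spectrogram / linear spectrogram
Stage 2: generate the waveform (audio) from the mel/linear spectrogram.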
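To make the spectrogram terms above concrete, here is a minimal NumPy sketch of how a linear spectrogram and a mel spectrogram can be computed from audio. It reads "FFT" as a short-time Fourier transform and "filtering" as a triangular mel filterbank, which is my interpretation rather than something the post spells out. In a real two-stage TTS system, stage 1 is a learned model that predicts these features from text; the code only shows what the target features look like. The function names, the 16 kHz sample rate, `n_fft=512`, `hop=128`, and the 80 mel bands are all illustrative assumptions.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale (one common convention, assumed here).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def linear_spectrogram(wave, n_fft=512, hop=128):
    """Magnitude STFT: cut the signal into frames, window each, take its FFT."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack(
        [wave[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Shape: (n_fft // 2 + 1 frequency bins, n_frames)
    return np.abs(np.fft.rfft(frames, axis=1)).T

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

# Toy input: one second of a 440 Hz sine wave at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440.0 * t)

linear = linear_spectrogram(wave)              # linear spectrogram
mel = mel_filterbank(sr, 512, 80) @ linear     # mel spectrogram
log_mel = np.log(np.clip(mel, 1e-5, None))     # log compression, common in TTS

print(linear.shape, mel.shape)  # (257, 122) (80, 122)
```

Stage 2 (the vocoder) would then map `log_mel` back to a waveform; that part is a learned model and is not sketched here.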
Papers
VITS
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
ICML 2021
Paper: https://arxiv.org/abs/2106.06103
Code: https://github.com/jaywalnut310/vits

Conditional VAE + flow + GAN
For background on flows, see the two papers v-flow and Flow++.
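As a rough guide to how the three pieces fit together, the training objective can be summarized as below. This is a simplified sketch based on my reading of the VITS paper, with the conditioning (text and alignment) collapsed into a single symbol $c$; check the paper for the exact definitions. Here $x$ is the target spectrogram, $z$ the latent variable, $q_\phi$ the posterior encoder, $f_\theta$ the normalizing flow applied to the prior, and $G$ the generator.

```latex
% Conditional VAE: maximize the evidence lower bound (ELBO)
\log p_\theta(x \mid c) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[
    \log p_\theta(x \mid z) - \log \frac{q_\phi(z \mid x)}{p_\theta(z \mid c)}
  \right]

% Flow: a more expressive prior via the change-of-variables formula
p_\theta(z \mid c) =
  \mathcal{N}\!\big(f_\theta(z);\, \mu_\theta(c),\, \sigma_\theta(c)\big)
  \left| \det \frac{\partial f_\theta(z)}{\partial z} \right|

% GAN: adversarial and feature-matching terms added to the VAE losses
L_{\text{vae}} = L_{\text{recon}} + L_{\text{kl}} + L_{\text{dur}}
  + L_{\text{adv}}(G) + L_{\text{fm}}(G)
```

The first two terms of the final loss come from the ELBO (reconstruction and KL divergence), $L_{\text{dur}}$ trains the duration predictor, and the last two terms are the GAN part.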
I found two sets of paper notes on Zhihu:
More detailed: "Read the Classics: VITS, a Conditional Variational Autoencoder with Adversarial Learning for Speech Synthesis"
Shorter: "[Paper Notes] VITS" by OlaWod