Speech Synthesis Model Cheat Sheet (1)
2022-08-03 01:32:00 【Andy Dennis】
Foreword
Speech is an increasingly popular field. Given a piece of text, we want it read aloud, which requires speech synthesis technology, known as Text-to-Speech, or TTS for short. Here are some interesting models I have come across.
One-stage speech synthesis is generally called end-to-end.
Two-stage speech synthesis usually proceeds as follows. Stage 1:
Text -(FFT)-> spectrogram -(filtering)-> mel spectrogram / linear spectrogram
Stage 2: generate the waveform (audio) from the mel spectrogram / linear spectrogram.
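To make the "FFT" and "filtering" steps more concrete, here is a minimal sketch of how a linear spectrogram and a mel spectrogram can be computed from a waveform, for example when preparing training targets. It assumes NumPy is available; the function names and the parameter values (22050 Hz sample rate, 1024-point FFT, hop of 256, 80 mel bands) are illustrative choices of mine, not something prescribed by the models in this post.

```python
import numpy as np

def hz_to_mel(hz):
    # Common HTK-style mel scale formula.
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_mels, n_fft // 2 + 1).
    """
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i:i + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mel_spectrogram(wave, sr=22050, n_fft=1024, hop=256, n_mels=80):
    """Return (linear, mel) magnitude spectrograms, one row per frame."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack(
        [wave[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    # "FFT" step: short-time Fourier transform magnitudes (linear spectrogram).
    linear = np.abs(np.fft.rfft(frames, axis=1))
    # "Filtering" step: project onto the mel filterbank (mel spectrogram).
    mel = linear @ mel_filterbank(sr, n_fft, n_mels).T
    return linear, mel

# Usage: one second of a 440 Hz sine wave.
t = np.arange(22050) / 22050.0
wave = np.sin(2 * np.pi * 440.0 * t)
linear, mel = mel_spectrogram(wave)
print(linear.shape, mel.shape)
```

In practice a library such as librosa or torchaudio would normally handle this, usually followed by a log compression of the mel magnitudes. Stage 2 (the vocoder) then goes in the opposite direction, from spectrogram back to waveform.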
Papers
VITS
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
ICML 2021
Paper: https://arxiv.org/abs/2106.06103
Code: https://github.com/jaywalnut310/vits

Conditional VAE + flow + GAN
For background on flows, see the two papers v-flow and Flow++.
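As a rough reading aid for the "conditional VAE + flow" part, the two standard ingredients can be written as follows. The symbols are my own choice for illustration: x is the target speech representation, c the text condition, z the latent variable, q_phi the posterior encoder, p_theta the decoder and prior, and f_theta an invertible flow.

```latex
% Conditional VAE: evidence lower bound on the conditional log-likelihood
\log p_\theta(x \mid c) \ge
  \mathbb{E}_{q_\phi(z \mid x)}\left[
    \log p_\theta(x \mid z) - \log \frac{q_\phi(z \mid x)}{p_\theta(z \mid c)}
  \right]

% Flow: an invertible map f_\theta makes the conditional prior more expressive
% (change-of-variables formula)
p_\theta(z \mid c) =
  \mathcal{N}\left( f_\theta(z);\ \mu_\theta(c),\ \sigma_\theta(c) \right)
  \left| \det \frac{\partial f_\theta(z)}{\partial z} \right|
```

The GAN part then adds a discriminator that judges the decoder's generated audio against real audio, contributing an adversarial loss on top of the terms above.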
I found two sets of paper notes on Zhihu:
More detailed: "Read the classics: VITS, a conditional variational autoencoder with adversarial learning for speech synthesis"
Shorter: [Paper Notes] VITS_OlaWod