Latest progress and development trends of intelligent speech technology in 2022
2022-07-02 09:45:00 【kaiyuan_ sjtu】
Driven by deep learning, big data, and large-scale computing power, intelligent speech technology, represented by speech enhancement, recognition, and synthesis, has been deployed in many applications. I've compiled some cutting-edge talks for you, all free to access.
No.1
New progress and development trends in intelligent speech technology
Speaker : Xie Lei
Professor at Northwestern Polytechnical University
Head of the university's Audio, Speech and Language Processing Laboratory
Abstract :
This talk combines recent research results from the audio, speech and language processing group at Northwestern Polytechnical University with the current state of intelligent speech technology, focusing on recent advances in speech enhancement, recognition, and synthesis. It also discusses the challenges that intelligent speech technology faces as scenarios and applications keep expanding, and the outlook for future development.
No.2
Research progress of end-to-end sound source separation
Speaker : Luo Yi
PhD student at the Neural Acoustic Processing Lab (Naplab), Columbia University
Abstract :
Recent progress in deep learning methods for source separation has significantly advanced the state of the art. Among recent proposals, end-to-end systems that take waveforms as input and directly generate waveforms have shown advantages in both performance and flexibility. In this talk, I will briefly go through some of the recent advances in end-to-end neural source separation. I will start with the general problem definition of source separation, then introduce several single-channel and multi-channel approaches, and conclude with the challenges and future work in this area.
Scan the code to get all the reports for free
No.3
Multi-speaker segmentation and clustering based on deep learning
Speaker : juck
University of Cambridge Research Associate
JD technical advisor
Abstract :
This open class first introduces the University of Cambridge's traditional multi-speaker segmentation and clustering system, which won the speaker segmentation and clustering task of the ASRU 2015 MGB Challenge. It then presents the team's recent work on applying deep neural networks to different parts of the segmentation and clustering system. It closes with a discussion of some open questions in multi-speaker segmentation and clustering research, including how to build a fully end-to-end (trainable) neural network system and how to integrate segmentation and clustering with speech separation and recognition.
No.4
Sound event detection with weak labels
Speaker : Wang Yun
Research scientist, Facebook applied AI research group
PhD, Language Technologies Institute (LTI), Carnegie Mellon University (CMU)
Abstract :
Sound event detection means detecting events such as gunshots and dog barks in audio and marking their start and end times. Because manually annotating start and end times for training data is laborious, real training data is often only weakly labeled: each recording is tagged with the event types it contains, but not with when they start and end. This lecture discusses how to use multiple instance learning to train a sound event detection system on weakly labeled data. The key is the choice of pooling function, which must balance false alarms against missed detections. The lessons from this lecture can also be applied to other multiple instance learning tasks.
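To make the role of the pooling function concrete, here is a minimal sketch in Python. It assumes a model has already produced per-frame event probabilities for one clip, and it aggregates them into a single clip-level probability that can be compared against the weak label. The three pooling functions shown (max, average, linear softmax) are common choices in the multiple instance learning literature; the abstract does not say which ones the lecture covers, and all function names here are hypothetical.

```python
import math


def max_pool(frame_probs):
    """Clip probability = the single most confident frame."""
    return max(frame_probs)


def average_pool(frame_probs):
    """Clip probability = mean over all frames."""
    return sum(frame_probs) / len(frame_probs)


def linear_softmax_pool(frame_probs):
    """Weighted mean where each frame is weighted by its own probability."""
    total = sum(frame_probs)
    if total == 0:
        return 0.0
    return sum(p * p for p in frame_probs) / total


def clip_loss(clip_prob, label, eps=1e-7):
    """Binary cross-entropy between the pooled probability and the weak label."""
    p = min(max(clip_prob, eps), 1 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))


if __name__ == "__main__":
    # Illustrative frame-level probabilities for one event class in one clip.
    probs = [0.1, 0.9, 0.8, 0.1]
    for pool in (max_pool, average_pool, linear_softmax_pool):
        print(pool.__name__, round(pool(probs), 3))
```

Max pooling lets one confident frame decide the clip-level result, while average pooling requires the event to fill most of the clip before the clip-level probability gets high. Linear softmax sits between the two, which is one way of trading off false alarms against missed detections.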
No.5
Current state of intelligent speech development and an introduction to datasets
Speaker : Chen Guoguo
Co-founder, SEASALT.AI
PhD, Johns Hopkins University
Abstract :
This talk shares and discusses current problems in the speech field, for example, which factors need to be considered when deploying intelligent speech on embedded devices compared with the server side. Drawing on personal research and entrepreneurial experience, the speaker also offers practical advice to fellow researchers and students to help them avoid detours.
No.6
Research progress of accent and dialect speech recognition
Speaker : Tang Zhiyuan
Joint PhD, Chinese Academy of Sciences and Tsinghua University
Postdoctoral researcher, Tsinghua University
Abstract :
Speech recognition technology is widely used in daily life, but its performance and user experience on accented or dialectal speech are still unsatisfactory. This talk gives a quick review of research progress in accent and dialect speech recognition in recent years, then introduces the related datasets, benchmarks, and competitions, along with some feasible research directions.