Latest progress and development trends in intelligent speech technology, 2022
2022-07-02 09:45:00 【kaiyuan_ sjtu】
Driven by deep learning, big data, and large-scale computing power, intelligent speech technologies such as speech enhancement, recognition, and synthesis are now used in many applications. I've compiled some cutting-edge talks for you, all free to access.
No.1
New progress and development trends in intelligent speech technology
Speaker: Xie Lei
Professor at Northwestern Polytechnical University
Head of the university's Audio, Speech and Language Processing Laboratory
Abstract:
This talk combines recent research results from the Audio, Speech and Language Processing group at Northwestern Polytechnical University with the current state of intelligent speech technology, focusing on recent advances in speech enhancement, recognition, and synthesis. It also discusses the challenges that intelligent speech technology faces as scenarios and applications keep expanding, and the prospects for its future development.
No.2
Research progress in end-to-end sound source separation
Speaker: Luo Yi
PhD student at the Neural Acoustic Processing Lab (Naplab), Columbia University
Abstract:
Recent progress in deep learning methods for the task of source separation has significantly advanced the state of the art.
Among the recent proposals, end-to-end systems that take waveforms as input and directly generate waveforms have shown advantages in both system performance and flexibility. In this talk, I will briefly go through some of the recent advances in end-to-end neural source separation. I will start with the general problem definition of source separation, then introduce several single-channel and multi-channel approaches, and conclude with the challenges and future work in this area.
No.3
Deep-learning-based multi-speaker segmentation and clustering (speaker diarization)
Speaker: juck
Research Associate, University of Cambridge
Technical advisor, JD
Abstract:
This open class first introduces the University of Cambridge's traditional multi-speaker segmentation and clustering system, which won the speaker segmentation and clustering task of the ASRU 2015 MGB Challenge. It then presents the team's recent work on applying deep neural networks to different parts of the segmentation and clustering system. Finally, it discusses some open questions in multi-speaker segmentation and clustering research, including how to build a fully end-to-end (trainable) neural network system and how to integrate segmentation and clustering with speech separation and recognition.
No.4
Sound event detection with weak labels
Speaker: Wang Yun
Research scientist, Facebook applied AI research group
PhD, Language Technologies Institute (LTI), Carnegie Mellon University (CMU)
Abstract:
Sound event detection means detecting events such as gunshots and dog barks in audio and marking their start and end times. Because manually annotating start and end times for training data is laborious, real training data is often only weakly labeled: each recording is tagged with the event types it contains, but not with their start and end times. This lecture discusses how to train a sound event detection system on weakly labeled data using multiple instance learning. The key is choosing a pooling (aggregation) function that balances false alarms against missed detections. The lessons from this lecture also carry over to other multiple instance learning tasks.
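To make the role of the pooling function concrete, here is a minimal sketch in Python. It assumes a model has already produced a per-frame probability for one event class, and it aggregates those frames into a single clip-level probability that can be compared against the weak label. The three pooling functions are common choices in the multiple instance learning literature and are not necessarily the ones covered in the talk; the function names and the sample probabilities are hypothetical.

```python
from typing import Callable, Dict, List


def max_pool(frame_probs: List[float]) -> float:
    """Clip-level probability is that of the single most confident frame."""
    return max(frame_probs)


def average_pool(frame_probs: List[float]) -> float:
    """Clip-level probability is the unweighted mean over all frames."""
    return sum(frame_probs) / len(frame_probs)


def linear_softmax_pool(frame_probs: List[float]) -> float:
    """Weight each frame by its own probability: sum(p^2) / sum(p)."""
    total = sum(frame_probs)
    if total == 0.0:
        return 0.0  # no frame shows any evidence of the event
    return sum(p * p for p in frame_probs) / total


POOLERS: Dict[str, Callable[[List[float]], float]] = {
    "max": max_pool,
    "average": average_pool,
    "linear_softmax": linear_softmax_pool,
}

# Hypothetical frame-level probabilities for one event class in one clip:
# the event appears to be active only in the third and fourth frames.
frame_probs = [0.1, 0.1, 0.9, 0.8, 0.1]

for name, pool in POOLERS.items():
    print(f"{name}: {pool(frame_probs):.3f}")
```

Roughly speaking, max pooling lets a single frame decide the clip-level prediction, while average pooling gives every frame an equal vote, so a short event in a long clip is diluted. Which behavior is preferable depends on how the trade-off between false alarms and missed detections should be struck.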
No.5
The state of intelligent speech technology and an introduction to datasets
Speaker: Chen Guoguo
Co-founder, SEASALT.AI
PhD, Johns Hopkins University
Abstract:
This talk shares and discusses current problems in the speech field, for example: what factors need to be considered when deploying intelligent speech on embedded devices, compared with the server side. Drawing on personal research and entrepreneurial experience, the speaker also offers practical advice to fellow researchers and students, to help them avoid detours.
No.6
Research progress in accent and dialect speech recognition
Speaker: Tang Zhiyuan
Joint PhD, Chinese Academy of Sciences and Tsinghua University
Postdoctoral researcher, Tsinghua University
Abstract:
Speech recognition is now widely used in daily life, but its performance and user experience on accented or dialectal speech are still unsatisfactory. This talk gives a quick review of recent research progress in accent and dialect speech recognition, then introduces the related datasets, benchmarks, and competitions, along with some feasible research directions.