Voiceprint Technology (V): voiceprint segmentation and clustering technology
2022-06-25 09:05:00 【u013250861】
5.1 Segmentation and clustering: better understanding conversational speech
5.1.1 Name and history
Voiceprint segmentation and clustering (speaker diarization) ranks second only to voiceprint recognition in importance within the voiceprint field, and it is a much harder problem. The question that voiceprint recognition answers can be summarized as "who said this". That question carries an assumption: the speech to be recognized contains the voice of only one speaker. Voiceprint segmentation and clustering drops this assumption, so a single recording may contain several speakers talking in turn. The question it answers can therefore be summarized as "who spoke when".
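To make the difference in output concrete, here is a minimal sketch of what a "who spoke when" result could look like. The `Segment` type, the `who_spoke_at` helper, and the sample timestamps are all hypothetical and chosen only for illustration; they are not taken from the original text or from any particular toolkit.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    """One stretch of speech attributed to a single speaker (hypothetical type)."""
    start: float   # start time in seconds
    end: float     # end time in seconds (exclusive)
    speaker: str   # anonymous speaker label such as "A" or "B"


def who_spoke_at(segments: List[Segment], t: float) -> Optional[str]:
    """Return the label of the speaker active at time t, or None if nobody is."""
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg.speaker
    return None


# Voiceprint recognition assumes one speaker per utterance: one label for the whole audio.
recognition_result = "A"

# Diarization returns a timeline instead: several speakers taking turns.
# The times below are made-up sample values.
diarization_result = [
    Segment(0.0, 3.5, "A"),
    Segment(3.5, 7.2, "B"),
    Segment(7.2, 9.0, "A"),
]

print(who_spoke_at(diarization_result, 5.0))
```

Note that the labels are anonymous: the timeline only says that the same voice appears in the first and third segments, not who that person is.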
The English word diarization comes from diary. From the noun diary we get the verb diarize, and from that the noun diarization. Taken literally, it can be understood as "to turn something into a diary", or "logging". A diary usually records, in chronological order through the day, who did what at what time. Extended to speaker diarization, it can naturally be understood as "who spoke at what time".
The earliest origin of the name speaker diarization is difficult to trace. Some early literature referred to the problem directly as speaker segmentation and clustering [114,115], which is why many Chinese documents translate it as "voiceprint segmentation and clustering" [116]. As the field has developed, however, and especially with the recent appearance of supervised methods (see Section 5.5) and even end-to-end models (see Section 5.5.6), the name "segmentation and clustering" is no longer apt: both the segmentation step and the clustering step can be replaced by other methods. Another Chinese translation, and the one I prefer, is "voiceprint time-sharing archive".