Jinglianwen Technology provides voice data collection and annotation services

2022-06-13 06:40:00 Jinglianwen Technology

What is voice annotation?

Voice annotation is a common annotation type in the data annotation industry. Annotators continuously label and transcribe voice data so that an AI system can learn from it. The annotated data is mainly used for machine learning: it effectively gives the computer system "ears" and the ability to "hear", so that it can recognize speech accurately.

Voice annotation mainly includes eight common annotation types: ASR voice transcription, speech segmentation, voice cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading.
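As a rough illustration of how several of these annotation types can sit together on one recording, here is a minimal Python sketch. The schema is hypothetical: the class names, field names, and label values are assumptions made for this example and do not describe the Jinglianwen platform's actual data format. It also assumes that segments within one clip do not overlap, which may not hold for multi-speaker recordings.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One labeled span of audio (hypothetical schema, for illustration only)."""
    start_s: float                      # segment start, in seconds
    end_s: float                        # segment end, in seconds
    transcript: str = ""                # ASR transcription text
    speaker_id: Optional[str] = None    # voiceprint / speaker label
    emotion: Optional[str] = None       # emotion judgment label
    is_valid: bool = True               # voice cleaning: False marks unusable audio

    def duration(self) -> float:
        return self.end_s - self.start_s


@dataclass
class AnnotatedClip:
    """All segments annotated on one audio file."""
    audio_path: str
    segments: List[Segment] = field(default_factory=list)

    def add_segment(self, seg: Segment) -> None:
        # Speech segmentation: reject malformed or overlapping segments
        # (non-overlap is an assumption of this sketch).
        if seg.end_s <= seg.start_s:
            raise ValueError("segment end must be after its start")
        for other in self.segments:
            if seg.start_s < other.end_s and other.start_s < seg.end_s:
                raise ValueError("segment overlaps an existing segment")
        self.segments.append(seg)
        self.segments.sort(key=lambda s: s.start_s)

    def valid_speech_seconds(self) -> float:
        # Total duration of segments that passed voice cleaning.
        return sum(s.duration() for s in self.segments if s.is_valid)


clip = AnnotatedClip("example.wav")  # hypothetical file name
clip.add_segment(Segment(0.0, 2.5, transcript="hello",
                         speaker_id="spk1", emotion="neutral"))
clip.add_segment(Segment(2.5, 3.0, is_valid=False))  # noise-only span
print(clip.valid_speech_seconds())
```

Phoneme, prosody, and pronunciation labels are left out to keep the sketch short; they could be added as further optional fields on `Segment` in the same way.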

Voice annotation is closely tied to artificial intelligence. Speech recognition technology is now widespread in daily life, for example in voice assistants, smart speakers, and intelligent customer service. As artificial intelligence develops, human-computer voice interaction will extend to more scenarios, placing higher demands on speech recognition technology in areas such as recognition accuracy and scene optimization.

The importance of AI data

In recent years, artificial intelligence has continued to develop, but the toolchain that enables AI is still imperfect. Data is one of the core elements of iterative AI innovation, and optimizing the training data is an important way to further improve the accuracy of AI models. To help AI applications reach high-quality deployment, AI basic data service providers need finer control over data collection, cleaning, information extraction, annotation, quality inspection, management, and other stages in order to provide higher-quality data.

Jinglianwen Technology provides data support for voice annotation

Jinglianwen Technology is one of the largest AI basic data service providers in the Yangtze River Delta. Its existing database holds 100 TB of voice datasets, including tens of thousands of hours of read speech and natural conversational speech, so it can quickly supply datasets that meet customer requirements. Examples include the "50,800-item in-depot recording collection dataset", the "60,000-segment Chinese voice dataset", the "100-ID, 12,000-item dataset of Chinese speakers reading English wake-up words", the "21,000-segment ASR voice transcription audio training set", and the "13,000-segment speech segmentation audio training set". These and other datasets can be used to research speech recognition algorithms and can effectively improve testing efficiency.

Jinglianwen Technology has built a data collection resource network spanning 27 provinces and municipalities in China and 52 countries worldwide. It has rich collection channels for dialects and minority languages, the ability to set up recording scenes, and the ability to collect data in special scenarios. It supports speech recognition (ASR) collection, speech synthesis (TTS) collection, wake-up word collection, multi-speaker conversation collection, in-vehicle voice collection, Mandarin collection, dialect collection, English collection, minority-language collection, near-field and far-field collection, voice activity detection (VAD) collection, and more. Following an agreed collection plan, it can gather data specific to a target region or scene.

Jinglianwen Technology has successively established its data headquarters in Hangzhou and data processing divisions in cities in other provinces, such as Wuhan, Jinhua, and Hengyang. It uses an Amoeba-style internal competition management model and has built a full-time team of 930 people. It has also developed its own Jinglianwen Technology data annotation platform, which supports ASR voice transcription, speech segmentation, voice cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading, meeting the diverse and varied data annotation needs of artificial intelligence.

Copyright notice
This article was created by [Jinglianwen Technology]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/164/202206130619525880.html