Jinglianwen Technology provides voice data collection and annotation services
2022-06-13 06:40:00 【Jinglianwen Technology】
What is voice annotation?
Voice annotation is a common annotation type in the data annotation industry. Annotators label and transcribe speech recordings so that AI systems can learn from them, and the labeled data is used mainly for machine learning. In effect, it gives a computer system "ears": the ability to hear, and from there the ability to recognize speech accurately.
Voice annotation mainly covers eight common annotation types: ASR voice transcription, speech cutting, voice cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading.
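As a rough illustration of what the output of speech cutting and ASR transcription might look like, here is a minimal sketch of one annotation record plus a basic consistency check. The `Segment` class, its field names, and `check_segments` are all hypothetical and chosen only for this example; they do not describe any particular annotation platform's format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """One annotated stretch of audio (all field names are hypothetical)."""
    start: float                    # segment start, in seconds
    end: float                      # segment end, in seconds
    transcript: str                 # transcription text; empty for unusable audio
    speaker: Optional[str] = None   # speaker label, if one was annotated
    is_valid: bool = True           # voice cleaning flag: False marks noise


def check_segments(segments: List[Segment], audio_duration: float) -> List[str]:
    """Return a list of problems; an empty list means the annotation passes."""
    problems = []
    ordered = sorted(segments, key=lambda s: s.start)
    for i, seg in enumerate(ordered):
        if seg.start < 0 or seg.end > audio_duration:
            problems.append(f"segment {i} lies outside the audio")
        if seg.end <= seg.start:
            problems.append(f"segment {i} has non-positive length")
        if seg.is_valid and not seg.transcript.strip():
            problems.append(f"segment {i} is marked valid but has no transcript")
        if i > 0 and seg.start < ordered[i - 1].end:
            problems.append(f"segment {i} overlaps segment {i - 1}")
    return problems
```

A check of this kind could run before annotated files go to manual quality inspection, so that reviewers spend their time on transcription accuracy rather than on structural mistakes.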
Voice annotation is closely tied to artificial intelligence. Speech recognition is already common in daily life, in products such as voice assistants, smart speakers, and intelligent customer service. As AI develops, human-computer voice interaction will extend to more scenarios, which places higher demands on speech recognition technology in areas such as recognition accuracy and scenario optimization.

The importance of AI data
In recent years AI has continued to develop, but the tool chain that enables it is still immature. Data is one of the core elements of AI iteration and innovation, and optimizing training data is an important way to further improve model accuracy. To help AI applications reach production with high quality, AI basic data service providers need finer control over every stage, including data collection, cleaning, information extraction, annotation, quality inspection, and management, so that they can deliver higher-quality data.
Jinglianwen Technology provides data support for voice annotation
Jinglianwen Technology is one of the largest AI basic data service providers in the Yangtze River Delta. Its existing database holds 100 TB of voice datasets, including tens of thousands of hours of read speech and natural conversational speech, so it can quickly supply datasets that meet a customer's requirements. Examples include the "50800 In-Depot Recording Collection Dataset", the "60000-Segment Chinese Voice Dataset", the "100-ID, 12000-Item Dataset of English Wake-Up Words Read by Chinese Speakers", the "21000-Segment ASR Voice Transcription Audio Training Set", and the "13000-Segment Speech Cutting Audio Training Set". These datasets can be used to research speech recognition algorithms and can effectively improve testing efficiency.
Jinglianwen Technology has built a data collection resource network that covers 27 provinces, cities, and municipalities in China and 52 countries worldwide. It has extensive collection channels for dialects and minority languages, along with the ability to build scenes and to collect data in special scenarios. It supports ASR speech recognition collection, TTS speech synthesis collection, wake-up word collection, multi-speaker conversation collection, in-vehicle voice collection, Mandarin collection, dialect collection, English collection, minority-language collection, near-field and far-field collection, and VAD voice collection. Collection can be tailored to a project plan, targeting specific regions and scenarios.
Jinglianwen Technology has established a data headquarters in Hangzhou and data processing divisions in Wuhan, Jinhua, Hengyang, and other cities across several provinces. It uses an amoeba-style internal competition management model and has built a full-time team of 930 people. Its self-developed Jinglianwen Technology data annotation platform supports ASR voice transcription, speech cutting, voice cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading, meeting the diverse data annotation needs of AI.
