Jinglianwen Technology provides voice data collection and annotation services
2022-06-13 06:40:00 【Jinglianwen Technology】
What is voice annotation?
Voice annotation is a common annotation type in the data annotation industry. Annotators label and transcribe speech recordings, and the labeled data is used mainly to train machine learning models for artificial intelligence. In effect, it gives the computer system "ears": the ability to hear, and from there the ability to recognize speech accurately.
Voice annotation mainly covers eight common task types: ASR speech transcription, speech segmentation, speech cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading.
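As a rough illustration of what segmentation and transcription produce, the sketch below stores one annotated recording as a list of time-stamped segments and runs a basic consistency check. The schema (field names such as `speaker` and `is_valid`) and the helper `check_segments` are hypothetical and are not taken from the Jinglianwen platform; the check also assumes a single-channel recording whose segments should not overlap.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class Segment:
    """One annotated span of audio (hypothetical schema, for illustration only)."""
    start: float    # segment start, in seconds
    end: float      # segment end, in seconds
    speaker: str    # speaker label assigned by the annotator
    text: str       # transcript; empty for non-speech segments
    is_valid: bool  # False marks noise or silence to be cleaned out


def check_segments(segments):
    """Return a list of problems: non-positive durations or overlapping segments."""
    problems = []
    ordered = sorted(segments, key=lambda s: s.start)
    for i, seg in enumerate(ordered):
        if seg.end <= seg.start:
            problems.append(f"segment {i}: end must be after start")
        if i > 0 and seg.start < ordered[i - 1].end:
            problems.append(f"segment {i}: overlaps previous segment")
    return problems


# Made-up example data: two spoken segments separated by a stretch of noise.
segments = [
    Segment(0.00, 2.40, "spk1", "turn on the living room light", True),
    Segment(2.40, 3.10, "none", "", False),
    Segment(3.10, 5.05, "spk2", "okay, the light is on", True),
]

problems = check_segments(segments)
record = json.dumps([asdict(s) for s in segments], ensure_ascii=False)
```

A real annotation project would define its own fields per task type (for example phoneme boundaries or emotion labels) and would usually allow overlapping segments in multi-speaker recordings.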
Voice annotation is closely tied to artificial intelligence. Speech recognition is already part of many areas of daily life, such as voice assistants, smart speakers, and intelligent customer service. As artificial intelligence develops, human-computer voice interaction will extend to more scenarios, which places higher demands on speech recognition technology in areas such as recognition accuracy and scenario optimization.
The importance of AI data
In recent years artificial intelligence has continued to develop, but the toolchain that enables AI is not yet complete. Data is one of the core elements of iterative innovation in AI, and optimizing training data is an important way to further improve the accuracy of AI models. To bring AI applications into production at high quality, basic data service providers need finer control over every stage of the process, including data collection, cleaning, information extraction, annotation, quality inspection, and management, in order to provide higher-quality data.
Jinglianwen Technology provides data support for voice annotation
Jinglianwen Technology is one of the largest AI basic data service providers in the Yangtze River Delta. Its existing database holds 100 TB of voice data, including tens of thousands of hours of read speech and natural conversation recordings, so it can quickly supply datasets that meet customer requirements. Examples include "50,800 in-depot recording and collection dataset", "60,000-segment Chinese voice dataset", "Dataset of English wake-up words read by Chinese speakers (100 IDs, 12,000 entries)", "21,000-segment ASR speech transcription audio training set", and "13,000-segment speech segmentation audio training set". These datasets can be used to study speech recognition algorithms and can effectively improve testing efficiency.
Jinglianwen Technology has built a data collection resource network covering 27 provinces and municipalities in China and 52 countries worldwide. It has collection channels for a wide range of dialects and less widely spoken languages, the ability to set up recording scenes, and the ability to collect data in special scenarios. It supports ASR (speech recognition) collection, TTS (speech synthesis) collection, wake-up word collection, multi-speaker conversation collection, in-vehicle voice collection, Mandarin collection, dialect collection, English collection, collection of less widely spoken languages, near-field and far-field collection, VAD voice collection, and more. Data can be collected for specific target regions and scenarios according to the project design.
Jinglianwen Technology has established a data headquarters in Hangzhou and data processing divisions in cities in other provinces, including Wuhan, Jinhua, and Hengyang. It uses an Amoeba-style management model based on internal competition and has built a full-time team of 930 people. It has also developed its own Jinglianwen Technology data annotation platform, which supports ASR speech transcription, speech segmentation, speech cleaning, emotion judgment, voiceprint recognition, phoneme labeling, prosody labeling, and pronunciation proofreading, meeting the diverse data annotation needs of artificial intelligence.