Text data augmentation methods
2022-07-06 10:26:00 【How about a song without trace】
Key topic: text data augmentation
Random insertion
Random deletion
Random swap
Reference paper: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
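The three operations listed above can be sketched in a few lines of plain Python. This is a minimal illustration and not the reference implementation from the EDA paper: the function names, the default probabilities, and the `synonyms` dictionary (a stand-in for a real synonym source such as WordNet) are all assumptions made for the example.

```python
import random

def random_deletion(words, p=0.1, rng=random):
    """Drop each word with probability p, always keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

def random_swap(words, n=1, rng=random):
    """Swap the words at two randomly chosen positions, n times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words

def random_insertion(words, synonyms, n=1, rng=random):
    """Insert a synonym of a randomly chosen word at a random position, n times.

    `synonyms` is a hypothetical dict mapping a word to a list of synonyms.
    """
    words = list(words)
    candidates = [w for w in words if synonyms.get(w)]
    if not candidates:
        return words
    for _ in range(n):
        source = rng.choice(candidates)
        new_word = rng.choice(synonyms[source])
        words.insert(rng.randint(0, len(words)), new_word)
    return words

# Toy usage: the synonym table below is made up for illustration only.
tokens = "stay hungry stay foolish".split()
toy_synonyms = {"hungry": ["starving"], "foolish": ["silly"]}
print(random_insertion(tokens, toy_synonyms))
print(random_deletion(tokens, p=0.2))
print(random_swap(tokens))
```

Each function returns a new list and leaves its input untouched, so several augmented variants can be generated from the same tokenized sentence.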
Back Translation
Example: English --> Chinese --> English
# Requires: pip install google_trans_new
from google_trans_new import google_translator
translator = google_translator()
# Pass a plain string rather than a one-element list
sentence = 'stay hungry, stay foolish. -- Steve Jobs'
# English --> Chinese
translation_cn = translator.translate(sentence, lang_tgt='zh-cn')
translation_cn
# Chinese --> English
translation_en = translator.translate(translation_cn, lang_tgt='en')
translation_en
Back translation through a randomly chosen language
import random
import google_trans_new
languages = list(google_trans_new.LANGUAGES.keys())
len(languages) # 108 languages are available for translation
object_lang = random.choice(languages)
object_lang
# Forward translation
translations = translator.translate(sentence, lang_tgt=object_lang)
translations
# Back translation into English
back_trans = translator.translate(translations, lang_tgt='en')
back_trans
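The forward and back translation steps above can be wrapped in a small helper so that several pivot languages are tried per sentence and only variants that actually differ are kept. The sketch below is an illustration under stated assumptions: `back_translate`, `augment`, and `toy_translate` are hypothetical names, and the translation service is passed in as a callable so the logic can be run without network access.

```python
def back_translate(text, translate, pivot, source="en"):
    """Translate text into the pivot language, then back into the source language."""
    pivot_text = translate(text, pivot)
    return translate(pivot_text, source)

def augment(texts, translate, pivots, source="en"):
    """Return the original texts plus back-translated variants that differ from them."""
    augmented = []
    for text in texts:
        augmented.append(text)
        for pivot in pivots:
            candidate = back_translate(text, translate, pivot, source)
            # Keep only non-empty variants that are new
            if candidate and candidate != text and candidate not in augmented:
                augmented.append(candidate)
    return augmented

def toy_translate(text, lang):
    """Stand-in for a real translation service, backed by a tiny lookup table."""
    table = {
        ("stay hungry", "zh-cn"): "保持饥饿",
        ("保持饥饿", "en"): "keep hungry",
    }
    # Unknown inputs are returned unchanged
    return table.get((text, lang), text)

print(augment(["stay hungry"], toy_translate, ["zh-cn", "fr"]))
```

With the translator object used earlier, the callable could be `lambda text, lang: translator.translate(text, lang_tgt=lang)`. Filtering out unchanged results matters because back translation often returns the original sentence verbatim, which adds no new training data.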