Text data augmentation methods
2022-07-06 10:26:00 【How about a song without trace】
Key topic: text data augmentation
Random insertion: insert a word at a random position
Random deletion: delete words at random
Random swap: swap the positions of two randomly chosen words
Reference paper: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
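The three operations above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation from the EDA paper: the function names, the `rng` parameter, and the tiny `SYNONYMS` table are all assumptions made for the example. In the paper, random insertion inserts a synonym of a randomly chosen word, so a real setup would look synonyms up in a thesaurus such as WordNet.

```python
import random

# Hypothetical toy synonym table, for illustration only.
SYNONYMS = {"quick": ["fast", "speedy"], "happy": ["glad", "cheerful"]}


def random_insertion(words, n=1, rng=random):
    """Insert n synonyms of randomly chosen words at random positions."""
    words = list(words)  # work on a copy
    candidates = [w for w in words if w in SYNONYMS]
    for _ in range(n):
        if not candidates:
            break
        synonym = rng.choice(SYNONYMS[rng.choice(candidates)])
        words.insert(rng.randint(0, len(words)), synonym)
    return words


def random_deletion(words, p=0.1, rng=random):
    """Drop each word independently with probability p, keeping at least one."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def random_swap(words, n=1, rng=random):
    """Swap two randomly chosen positions, n times."""
    words = list(words)  # work on a copy
    if len(words) < 2:
        return words
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words
```

Each function takes a list of tokens (for example `sentence.split()`) and returns a new list, so the original sentence stays available for further augmentation. Passing `rng=random.Random(0)` makes the output reproducible.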
Back translation
Example: English --> Chinese --> English
# Requires: pip install google_trans_new
from google_trans_new import google_translator
translator = google_translator()
sentence = ['stay hungry, stay foolish. -- spoken / said by Steve Jobs']
# English --> Chinese
translation_cn = translator.translate(sentence, lang_tgt='zh-cn')
translation_cn
# Chinese --> English
translation_en = translator.translate(translation_cn, lang_tgt='en')
translation_en
Back translation through a randomly chosen language
import random
import google_trans_new
languages = list(google_trans_new.LANGUAGES.keys())
len(languages)  # 108 translatable languages
object_lang = random.choice(languages)
object_lang
# Forward translation
translations = translator.translate(sentence, lang_tgt=object_lang)
translations
# Back translation into English
back_trans = translator.translate(translations, lang_tgt='en')
back_trans
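The round trip above can be wrapped in a small helper that produces several variants of one sentence. The sketch below is an assumption-laden illustration: `back_translate` and `augment_by_back_translation` are hypothetical names, and the translator is passed in as a plain callable `translate(text, target_language)` so the helper does not depend on any particular translation service.

```python
import random


def back_translate(text, translate, pivot, source="en"):
    """Translate text into the pivot language, then back into the source language."""
    forward = translate(text, pivot)
    return translate(forward, source)


def augment_by_back_translation(text, translate, pivots, n=3, seed=None):
    """Return up to n distinct back-translated variants of text.

    `translate` is any callable taking (text, target_language) and returning
    a string; `pivots` is the pool of intermediate language codes.
    """
    rng = random.Random(seed)
    pivots = list(pivots)
    chosen = rng.sample(pivots, k=min(n, len(pivots)))
    variants = []
    for pivot in chosen:
        candidate = back_translate(text, translate, pivot)
        # Keep only variants that differ from the input and from each other.
        if candidate != text and candidate not in variants:
            variants.append(candidate)
    return variants
```

With the `translator` object from the code above, the callable could be something like `lambda text, lang: translator.translate(text, lang_tgt=lang)`, with `languages` as the pivot pool. Identical round trips are filtered out because they add no new training examples.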