Text data augmentation methods
2022-07-06 09:11:00 【一曲无痕奈何】
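Key topic: text data augmentation
random insertion: insert a synonym of a randomly chosen word at a random position
random deletion: remove each word with some probability p
random swap: exchange the positions of two randomly chosen words
Reference paper: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks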
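The three operations listed above can be sketched in a few lines of plain Python. This is a minimal illustration, not the reference implementation from the EDA paper: the function names, the `rng` parameter, and the tiny `SYNONYMS` table are all assumptions made for this example (the paper draws synonyms from WordNet).

```python
import random

# Hypothetical toy synonym table, for illustration only.
# The EDA paper looks synonyms up in WordNet instead.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}


def random_deletion(words, p=0.1, rng=random):
    """Drop each word independently with probability p; always keep at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def random_swap(words, n=1, rng=random):
    """Swap the words at two randomly chosen positions, repeated n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_insertion(words, n=1, synonyms=SYNONYMS, rng=random):
    """Insert a synonym of a randomly chosen word at a random position, n times."""
    out = list(words)
    candidates = [w for w in out if w in synonyms]
    if not candidates:
        return out  # nothing in the sentence has a known synonym
    for _ in range(n):
        source = rng.choice(candidates)
        out.insert(rng.randint(0, len(out)), rng.choice(synonyms[source]))
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    words = "the quick brown fox is happy".split()
    print(random_deletion(words, p=0.2, rng=rng))
    print(random_swap(words, n=1, rng=rng))
    print(random_insertion(words, n=1, rng=rng))
```

Each function takes a list of tokens and returns a new list, so the original sentence is never modified and the operations can be chained.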
Back translation
Example: English --> Chinese --> English
# Requires: pip install google_trans_new
from google_trans_new import google_translator
translator = google_translator()
# Pass a plain string rather than a one-element list
sentence = 'stay hungry, stay foolish. -- spoken / said by Steve Jobs'
# English --> Chinese
translation_cn = translator.translate(sentence, lang_tgt='zh-cn')
print(translation_cn)
# Chinese --> English
translation_en = translator.translate(translation_cn, lang_tgt='en')
print(translation_en)
Translating through a randomly chosen language
import random
import google_trans_new
languages = list(google_trans_new.LANGUAGES.keys())
print(len(languages))  # number of supported languages: 108
object_lang = random.choice(languages)
print(object_lang)
# Forward translation into the randomly chosen language
translations = translator.translate(sentence, lang_tgt=object_lang)
print(translations)
# Back translation into English
back_trans = translator.translate(translations, lang_tgt='en')
print(back_trans)
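The forward and backward steps above can be wrapped into a small reusable helper. The sketch below is one possible way to do it, not part of google_trans_new: `back_translate`, `augment`, and the `translate` callable are hypothetical names introduced here. The translator is passed in as a function so the helper can be tried offline with a stub.

```python
import random


def back_translate(text, translate, pivot, source="en"):
    """Translate text into the pivot language, then back into the source language.

    `translate` is any callable taking (text, target_language) and returning a string.
    """
    forward = translate(text, pivot)
    return translate(forward, source)


def augment(text, translate, pivots, k=2, source="en", rng=random):
    """Back-translate through up to k randomly chosen pivot languages.

    Returns only paraphrases that differ from the input, without duplicates.
    """
    pivots = list(pivots)
    results = []
    for pivot in rng.sample(pivots, min(k, len(pivots))):
        candidate = back_translate(text, translate, pivot, source)
        if candidate != text and candidate not in results:
            results.append(candidate)
    return results


if __name__ == "__main__":
    # Stub translator for an offline demo; it only tags the text, it does not translate.
    def stub_translate(text, lang):
        return f"[{lang}] {text}"

    print(augment("stay hungry, stay foolish", stub_translate, ["zh-cn", "fr", "de"], k=2))
```

With the translator object from the code above, the callable could be `lambda text, lang: translator.translate(text, lang_tgt=lang)`. Filtering out unchanged results matters in practice, because short sentences often come back identical after a round trip.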