Text data augmentation methods
2022-07-06 09:11:00 【一曲无痕奈何】
Key topic: text data augmentation
random insertion
random deletion
random swap
Reference paper: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks
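The three operations listed above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's reference implementation. The function names, the `rng` parameter, and the toy `SYNONYMS` table are assumptions made for this example; the EDA paper draws synonyms from WordNet.

```python
import random

# Hypothetical toy synonym table, used here only to keep the sketch self-contained.
SYNONYMS = {
    "quick": ["fast", "speedy"],
    "happy": ["glad", "cheerful"],
}

def random_deletion(words, p, rng):
    """Drop each word independently with probability p, keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]

def random_swap(words, n, rng):
    """Swap the words at two randomly chosen positions, n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out

def random_insertion(words, n, rng, synonyms=SYNONYMS):
    """Insert a synonym of a randomly chosen word at a random position, n times."""
    out = list(words)
    for _ in range(n):
        candidates = [w for w in out if w in synonyms]
        if not candidates:
            break  # no word in the sentence has a known synonym
        synonym = rng.choice(synonyms[rng.choice(candidates)])
        out.insert(rng.randint(0, len(out)), synonym)
    return out

rng = random.Random(0)  # fixed seed so repeated runs give the same output
words = "the quick dog is happy".split()
print(random_deletion(words, 0.2, rng))
print(random_swap(words, 1, rng))
print(random_insertion(words, 2, rng))
```

Each function returns a new list and leaves the input untouched, so several augmented variants can be generated from the same tokenized sentence.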
Back Translation
Example: English --> Chinese --> English
# Requires: pip install google_trans_new
from google_trans_new import google_translator
translator = google_translator()
# Input sentence as a plain string
sentence = 'stay hungry, stay foolish. -- spoken / said by Steve Jobs'
# English --> Chinese
translation_cn = translator.translate(sentence, lang_tgt='zh-cn')
translation_cn
# Chinese --> English
translation_en = translator.translate(translation_cn, lang_tgt='en')
translation_en
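The round trip above can be wrapped in a small helper. The sketch below is one possible shape, not part of `google_trans_new`: `back_translate` and `fake_translate` are hypothetical names, and the translator is passed in as a callable so the example runs without network access. The Chinese and English strings in the lookup table are made-up stand-ins, not real service output.

```python
def back_translate(text, translate, pivot="zh-cn", source="en"):
    """Translate text into the pivot language and back into the source language.

    `translate` is any callable of the form translate(text, lang_tgt) -> str.
    """
    pivot_text = translate(text, pivot)
    return translate(pivot_text, source)

# Hypothetical offline stand-in for a real translation service.
def fake_translate(text, lang_tgt):
    table = {
        ("stay hungry, stay foolish", "zh-cn"): "求知若饥,虚心若愚",
        ("求知若饥,虚心若愚", "en"): "stay hungry for knowledge, stay humble",
    }
    return table[(text, lang_tgt)]

print(back_translate("stay hungry, stay foolish", fake_translate))
```

With the translator object created earlier, the callable could be `lambda t, lang: translator.translate(t, lang_tgt=lang)`.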
Translating through a randomly chosen language
import random
import google_trans_new
languages = list(google_trans_new.LANGUAGES.keys())
len(languages)  # number of supported languages: 108
object_lang = random.choice(languages)
object_lang
# Forward translation
translations = translator.translate(sentence, lang_tgt=object_lang)
translations
# Back translation
back_trans = translator.translate(translations, lang_tgt='en')
back_trans
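Repeating the random-pivot round trip several times yields multiple paraphrases of one sentence. The following sketch shows one way to do that, under stated assumptions: `augment_with_random_pivots` and `fake_translate` are hypothetical names, and the fake translator only tags the text so the flow is visible offline. With the real library, `languages` would be the list built from `google_trans_new.LANGUAGES` above.

```python
import random

def augment_with_random_pivots(text, translate, languages, n, rng, source="en"):
    """Return up to n distinct paraphrases, each via a different random pivot language."""
    # The source language itself is not a useful pivot, so leave it out.
    candidates = [lang for lang in languages if lang != source]
    pivots = rng.sample(candidates, min(n, len(candidates)))
    results = []
    for pivot in pivots:
        paraphrase = translate(translate(text, pivot), source)
        # Skip round trips that return the input unchanged or repeat an earlier result.
        if paraphrase != text and paraphrase not in results:
            results.append(paraphrase)
    return results

# Hypothetical offline translator: marks which pivot language the text passed through.
def fake_translate(text, lang_tgt):
    if lang_tgt == "en":
        pivot, original = text.split("|", 1)
        return f"{original} (via {pivot})"
    return f"{lang_tgt}|{text}"

rng = random.Random(0)
print(augment_with_random_pivots("hello", fake_translate, ["en", "fr", "de", "ja"], 2, rng))
```

Filtering out unchanged and repeated results matters in practice, because a round trip through a closely related language often returns the original sentence verbatim.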