Text data augmentation methods
2022-07-06 09:11:00 【一曲无痕奈何】
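Key points: text data augmentation
random insertion: insert a synonym of a randomly chosen word at a random position
random deletion: remove each word with some probability
random swap: exchange the positions of two randomly chosen words
Reference paper: EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks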
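The three operations listed above can be sketched in plain Python. This is a minimal illustration, not the paper's reference implementation: the function names, the `toy_synonyms` dictionary, and the parameters `p` and `n` are assumptions made for this example. The EDA paper draws synonyms from WordNet, which is replaced here by a small hand-written dictionary so the sketch runs without extra dependencies.

```python
import random


def random_deletion(words, p, rng):
    """Drop each word with probability p, keeping at least one word."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() > p]
    return kept if kept else [rng.choice(words)]


def random_swap(words, n, rng):
    """Swap two randomly chosen positions, repeated n times."""
    out = list(words)
    if len(out) < 2:
        return out
    for _ in range(n):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_insertion(words, n, synonyms, rng):
    """Insert a synonym of a random word at a random position, n times."""
    out = list(words)
    # Only words that have at least one synonym can act as a source.
    candidates = [w for w in words if synonyms.get(w)]
    if not candidates:
        return out
    for _ in range(n):
        source = rng.choice(candidates)
        position = rng.randint(0, len(out))  # inclusive upper bound: may append
        out.insert(position, rng.choice(synonyms[source]))
    return out


# Hypothetical toy synonym table, standing in for a WordNet lookup.
toy_synonyms = {"hungry": ["starving"], "foolish": ["silly"]}
words = "stay hungry stay foolish".split()
rng = random.Random(0)  # seeded so repeated runs give the same output

print(random_deletion(words, 0.2, rng))
print(random_swap(words, 1, rng))
print(random_insertion(words, 1, toy_synonyms, rng))
```

Passing a seeded `random.Random` instance instead of using the module-level functions keeps the augmentation reproducible, which helps when comparing training runs.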
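Back Translation
Translate the text into another language and then back into the original language; the round trip usually yields a paraphrase of the input.
Example: English --> Chinese --> English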
# Requires: pip install google_trans_new
from google_trans_new import google_translator
translator = google_translator()
# translate() takes a text string, so the sentence is a plain str rather than a one-element list
sentence = 'stay hungry, stay foolish. -- spoken / said by Steve Jobs'
# English --> Chinese
translation_cn = translator.translate(sentence, lang_tgt='zh-cn')
translation_cn
# Chinese --> English
translation_en = translator.translate(translation_cn, lang_tgt='en')
translation_en
Translating through a randomly chosen language
import random
import google_trans_new
languages = list(google_trans_new.LANGUAGES.keys())
len(languages)  # number of supported languages: 108
object_lang = random.choice(languages)
object_lang
# Forward translation
translations = translator.translate(sentence, lang_tgt=object_lang)
translations
# Back translation
back_trans = translator.translate(translations, lang_tgt='en')
back_trans
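The round trip shown above can be wrapped in a small helper that tries several pivot languages and keeps only the results that differ from the input. This is a sketch under assumptions: `back_translate`, `fake_translate`, and the lookup table are hypothetical names introduced for this example, and the stub translator exists only so the code runs offline. With google_trans_new, the `translate` argument could instead be something like `lambda s, lang: translator.translate(s, lang_tgt=lang)`.

```python
def back_translate(text, translate, pivots, source_lang="en"):
    """Round-trip text through each pivot language.

    translate is any callable (text, target_lang) -> translated text.
    Returns the paraphrases that differ from the input, without duplicates.
    """
    seen = {text}
    paraphrases = []
    for pivot in pivots:
        forward = translate(text, pivot)
        restored = translate(forward, source_lang).strip()
        if restored and restored not in seen:
            seen.add(restored)
            paraphrases.append(restored)
    return paraphrases


def fake_translate(text, lang):
    """Offline stand-in for a real translation service (hypothetical data)."""
    table = {
        ("stay hungry", "zh-cn"): "保持饥饿",
        ("保持饥饿", "en"): "keep hungry",
        ("stay hungry", "fr"): "reste affamé",
        ("reste affamé", "en"): "stay hungry",
    }
    return table[(text, lang)]


# The French round trip reproduces the input, so only one paraphrase is kept.
print(back_translate("stay hungry", fake_translate, ["zh-cn", "fr"]))
```

Filtering out round trips that return the original sentence avoids adding exact duplicates to the training set.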