Cheat sheet of commonly used NLP backbone models (1)
2022-08-03 01:50:00 【Andy Dennis】
Foreword
Since the Transformer appeared in 2017, it has been used across all major NLP tasks. Stanford has also recently opened a course dedicated to Transformers, CS25: [Stanford] CS25 Transformers United | Fall 2021
If you are new to NLP, you can read an article I wrote earlier: Research 0_NLPer set off
For the models covered here, see transformers/models (GitHub) in Hugging Face's transformers library, where you can read the source code of each implementation.
Today the mainstream approach is dynamic (contextual) word-vector encoding, where a word's vector depends on its surrounding context. Static word-vector lookups such as word2vec and GloVe are rarely used.
A video on Bilibili: Blow up! Doctor of Computer [NLP Natural Language Processing] is worthy of being a professor of Tsinghua University! 5 hours got me done with NLP Natural Language Processing! (The title is a bit, emm... but judging from the table of contents, it seems decent.)
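To make the static versus contextual distinction concrete, here is a minimal toy sketch. The 2-d vectors are made up for illustration (they are not real word2vec or GloVe values), and contextual_vector is a hypothetical stand-in that simply mixes a token with the sentence mean; real contextual encoders such as BERT use self-attention instead.

```python
# Toy static embedding table (hypothetical values, for illustration only).
STATIC = {
    "river": [1.0, 0.0],
    "bank": [0.5, 0.5],
    "money": [0.0, 1.0],
}

def static_vector(tokens, i):
    """Static lookup: the vector depends only on the token itself."""
    return STATIC[tokens[i]]

def contextual_vector(tokens, i):
    """Stand-in for a contextual encoder: blend the token's own vector
    with the mean vector of its sentence (not how real models do it)."""
    own = STATIC[tokens[i]]
    mean = [sum(STATIC[t][d] for t in tokens) / len(tokens) for d in range(len(own))]
    return [0.5 * o + 0.5 * m for o, m in zip(own, mean)]

s1 = ["river", "bank"]
s2 = ["money", "bank"]

# "bank" gets the same static vector in both sentences ...
print(static_vector(s1, 1) == static_vector(s2, 1))
# ... but a different vector once context is taken into account.
print(contextual_vector(s1, 1) == contextual_vector(s2, 1))
```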
Papers
MASS
BART
T5
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
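Encoder-only structure. The BERT family has many members, such as the distilled version DistilBERT and the variant RoBERTa.
Input embedding composition: each input vector is the sum of a token embedding, a segment embedding, and a position embedding.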
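As a rough sketch of how BERT composes its input vectors, the snippet below sums a token embedding, a segment embedding, and a position embedding for each input position. The tiny vocabulary, the hidden size of 4, the maximum length of 8, and the random tables are all assumptions for illustration; in the real model these tables are learned, and the sum is followed by layer normalization and dropout, which are omitted here.

```python
import random

random.seed(0)
HIDDEN = 4  # toy hidden size (assumption); BERT-base uses 768

def make_table(rows, dim=HIDDEN):
    """Random lookup table standing in for a learned embedding matrix."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]

# Hypothetical toy vocabulary; real BERT uses a WordPiece vocabulary.
vocab = {"[CLS]": 0, "[SEP]": 1, "hello": 2, "world": 3}
token_emb = make_table(len(vocab))
segment_emb = make_table(2)   # sentence A = 0, sentence B = 1
position_emb = make_table(8)  # toy maximum sequence length of 8

def bert_input_embeddings(tokens, segment_ids):
    """Element-wise sum of token, segment, and position embeddings."""
    out = []
    for pos, (tok, seg) in enumerate(zip(tokens, segment_ids)):
        t, s, p = token_emb[vocab[tok]], segment_emb[seg], position_emb[pos]
        out.append([a + b + c for a, b, c in zip(t, s, p)])
    return out

tokens = ["[CLS]", "hello", "[SEP]", "world", "[SEP]"]
segment_ids = [0, 0, 0, 1, 1]
x = bert_input_embeddings(tokens, segment_ids)
print(len(x), len(x[0]))  # sequence length and hidden size
```

Note that the two [SEP] tokens share a token embedding but end up with different input vectors, because their segment and position embeddings differ.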

Transformer
The well-known self-attention mechanism comes from this paper.
Attention Is All You Need
I have reproduced this model before: Transformer structure reproduction__attention is all you need (pytorch)
Encoder-decoder structure:
Attention module:
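The core of the attention module in "Attention Is All You Need" is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The following is a minimal pure-Python sketch of that formula on nested lists. It is only an illustration: masking, multiple heads, and the learned Q/K/V projections are left out, and the function names are my own rather than from any library.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q: [n_q][d_k], K: [n_k][d_k], V: [n_k][d_v], all nested lists.
    Returns (output [n_q][d_v], attention weights [n_q][n_k])."""
    d_k = len(K[0])
    d_v = len(V[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        output.append([sum(wj * v[c] for wj, v in zip(w, V)) for c in range(d_v)])
    return output, weights

# Self-attention on a toy sequence: Q, K, and V are all the same matrix.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = scaled_dot_product_attention(X, X, X)
for row in attn:
    print([round(a, 3) for a in row])
```

Each row of attn sums to 1, so every output vector is a weighted average of the value vectors.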