NLP commonly used Backbone model cheat sheet (1)
2022-08-03 01:50:00 【Andy Dennis】
Foreword
Since the Transformer appeared in 2017, it has been adopted across all major NLP tasks. Stanford has also opened a course dedicated to Transformers, CS25: [Stanford] CS25 Transformers United | Fall 2021
Readers who are new to NLP can start with an article I wrote earlier: Research 0_NLPer set off
For each model below, the Hugging Face transformers library has a matching directory under transformers/models (GitHub), where you can read the source code of its implementation.
Today the mainstream approach is contextual (dynamic) word embeddings, where a word's vector depends on the sentence around it. Static word-vector lookups such as word2vec and GloVe are rarely used now.
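To make the static versus contextual distinction concrete, here is a toy sketch. The vectors and the averaging rule are invented for illustration and only stand in for a real contextual encoder such as BERT. They are not taken from word2vec, GloVe, or any trained model.

```python
# Toy illustration only: the vectors and the mixing rule are made up.
STATIC = {
    "river": [1.0, 0.0],
    "bank": [0.5, 0.5],
    "money": [0.0, 1.0],
}

def static_embed(tokens):
    """Static lookup: each word always maps to the same fixed vector."""
    return [STATIC[t] for t in tokens]

def contextual_embed(tokens):
    """Toy contextual encoder: mix each word vector with the sentence mean."""
    vecs = static_embed(tokens)
    dim = len(vecs[0])
    mean = [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    return [[(v[i] + mean[i]) / 2 for i in range(dim)] for v in vecs]

a = ["river", "bank"]
b = ["money", "bank"]

# "bank" gets the same static vector in both sentences ...
same_static = static_embed(a)[1] == static_embed(b)[1]
# ... but a different vector once the surrounding words are mixed in.
same_contextual = contextual_embed(a)[1] == contextual_embed(b)[1]
print(same_static, same_contextual)
```

A real contextual model replaces the averaging step with stacked self-attention layers, but the effect is the same in kind: the vector for "bank" changes with its neighbors.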
A video on Bilibili: Blow up! Doctor of Computer [NLP Natural Language Processing] is worthy of being a professor of Tsinghua University! 5 hours got me done with NLP Natural Language Processing! (The title is a bit, emm… but from a look at the table of contents it seems decent…)
Papers
MASS
BART
T5
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Encoder-only structure. The BERT family has many members, such as the distilled version DistilBERT and the variant RoBERTa.
Composition of the input embeddings:
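In the BERT paper, each input vector is the element-wise sum of a token embedding, a segment embedding, and a position embedding. The sketch below shows only that sum. The tiny vocabulary, the hidden size of 4, the maximum length of 16, and the random tables are assumptions for illustration, and the layer normalization and dropout that real implementations apply afterwards are left out.

```python
import random

random.seed(0)
HIDDEN = 4  # tiny hidden size for illustration; BERT-base uses 768

def make_table(rows, dim=HIDDEN):
    """Random lookup table standing in for a learned embedding matrix."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]

# Hypothetical toy vocabulary, not BERT's real WordPiece vocabulary.
VOCAB = {"[CLS]": 0, "[SEP]": 1, "my": 2, "dog": 3, "is": 4, "cute": 5}
token_table = make_table(len(VOCAB))
segment_table = make_table(2)    # sentence A / sentence B
position_table = make_table(16)  # assumed maximum sequence length of 16

def bert_input_embeddings(tokens, segment_ids):
    """Sum token, segment, and position embeddings element-wise."""
    out = []
    for pos, (tok, seg) in enumerate(zip(tokens, segment_ids)):
        t = token_table[VOCAB[tok]]
        s = segment_table[seg]
        p = position_table[pos]
        out.append([t[i] + s[i] + p[i] for i in range(HIDDEN)])
    return out

tokens = ["[CLS]", "my", "dog", "[SEP]", "is", "cute", "[SEP]"]
segment_ids = [0, 0, 0, 0, 1, 1, 1]
embeddings = bert_input_embeddings(tokens, segment_ids)
print(len(embeddings), len(embeddings[0]))
```

Note that the two `[SEP]` tokens share a token embedding but end up with different input vectors, because their segment and position embeddings differ.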

Transformer
The famous self-attention mechanism comes from this paper.
Attention Is All You Need
I have reproduced this model before: Transformer structure reproduction__attention is all you need (pytorch)
Encoder-decoder structure:
Attention module:
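The core of the attention module in "Attention Is All You Need" is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. The following is a minimal single-head sketch in plain Python, without masking, multi-head projections, or batching. The function names and the small example matrices are illustrative assumptions, not code from the paper or from the author's reproduction.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for one head, no masking."""
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted average of the value vectors.
        output.append([sum(w[j] * V[j][c] for j in range(len(V)))
                       for c in range(len(V[0]))])
    return output, weights

Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, attn = scaled_dot_product_attention(Q, K, V)
print(attn)
print(out)
```

Each row of `attn` sums to 1, and each query puts the most weight on the key it is most aligned with. In self-attention, Q, K, and V are all linear projections of the same input sequence.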