Cheat sheet of commonly used NLP backbone models (1)
2022-08-03 01:50:00 【Andy Dennis】
Foreword
Since the Transformer appeared in 2017, it has shown up in every major NLP task. Stanford recently opened a course dedicated to Transformers, CS25: [Stanford] CS25 Transformers United | Fall 2021
If you are new to NLP, you can read an article I wrote earlier: Research 0_NLPer set off
For a given model, you can browse transformers/models (GitHub) in Hugging Face's transformers library to find it and read its source code implementation.
Today the field mainly uses dynamic, context-dependent word vector encodings; static word vector lookups such as word2vec and GloVe are rarely used.
A video on Bilibili: Blow up! Doctor of Computer [NLP Natural Language Processing] is worthy of being a professor of Tsinghua University! 5 hours got me done with NLP Natural Language Processing! (The title is a bit, emm... but judging from the table of contents it seems decent.)
Papers
MASS
BART
T5
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Encoder-only structure. The BERT family has many members, such as the distilled version DistilBERT and the variant RoBERTa.
Word vector input composition:
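In the BERT paper, each input vector is the element-wise sum of three embeddings: a token (WordPiece) embedding, a segment embedding, and a learned position embedding. Below is a minimal NumPy sketch of that sum. The table sizes, the random initialization, and the function name `bert_input_embeddings` are illustrative assumptions, and the LayerNorm and dropout that the real model applies after the sum are left out.

```python
import numpy as np


def bert_input_embeddings(token_ids, segment_ids,
                          token_table, segment_table, position_table):
    """Sum token, segment, and position embeddings for one sequence.

    Hypothetical helper for illustration; not the transformers library API.
    """
    token_ids = np.asarray(token_ids)
    segment_ids = np.asarray(segment_ids)
    positions = np.arange(len(token_ids))  # 0, 1, 2, ... for each token
    return (token_table[token_ids]
            + segment_table[segment_ids]
            + position_table[positions])


rng = np.random.default_rng(0)
vocab_size, max_len, hidden = 10, 8, 4  # toy sizes, far smaller than BERT's

# Randomly initialized lookup tables stand in for learned parameters.
token_table = rng.normal(size=(vocab_size, hidden))
segment_table = rng.normal(size=(2, hidden))      # sentence A / sentence B
position_table = rng.normal(size=(max_len, hidden))

# Toy ids standing in for "[CLS] a b [SEP] c [SEP]".
token_ids = [1, 5, 6, 2, 7, 2]
segment_ids = [0, 0, 0, 0, 1, 1]

embeddings = bert_input_embeddings(
    token_ids, segment_ids, token_table, segment_table, position_table)
print(embeddings.shape)
```

Because the position embedding differs at each index, the same token id (here `2`, used twice) ends up with a different input vector at each position.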
Transformer
The famous self-attention mechanism comes from this paper.
Attention Is All You Need
I have reproduced this model before: Transformer structure reproduction__attention is all you need (pytorch)
encoder-decoder structure:
Attention module:
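The attention module in "Attention Is All You Need" is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A rough single-head sketch in NumPy is given below. The function name, the toy shapes, and the `-1e9` masking constant are assumptions for illustration, and the multi-head projections are omitted.

```python
import numpy as np


def scaled_dot_product_attention(q, k, v, mask=None):
    """Single-head scaled dot-product attention.

    q: (n_queries, d_k), k: (n_keys, d_k), v: (n_keys, d_v).
    mask: optional boolean array (n_queries, n_keys); False hides a key.
    """
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
    if mask is not None:
        # Large negative score so masked keys get ~0 weight after softmax.
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4))  # 3 query positions, d_k = 4
k = rng.normal(size=(5, 4))  # 5 key positions
v = rng.normal(size=(5, 6))  # d_v = 6

output, weights = scaled_dot_product_attention(q, k, v)
print(output.shape, weights.shape)
```

Each row of `weights` sums to 1, so every output row is a weighted average of the rows of `v`. In self-attention, `q`, `k`, and `v` are all projections of the same input sequence.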