NLP commonly used Backbone model cheat sheet (1)
2022-08-03 01:50:00 【Andy Dennis】
Foreword
Since the Transformer appeared in 2017, it has been used in nearly every major NLP task. Stanford recently opened a course, CS25, dedicated to Transformers: [Stanford] CS25 Transformers United | Fall 2021
Newcomers to NLP can read an article I wrote earlier: Research 0_NLPer set off
For a given model, look in the transformers/models directory (GitHub) of Hugging Face's transformers library to find its source code implementation.
The mainstream approach today is dynamic word vector encoding, in which a word's vector depends on its context. Static word vector lookups such as word2vec and GloVe are rarely used now.
A Bilibili video: Blow up! Doctor of Computer [NLP Natural Language Processing] is worthy of being a professor of Tsinghua University! 5 hours got me done with NLP Natural Language Processing! (The title is a bit, emm… but judging from the table of contents it seems decent.)
Papers
MASS
BART
T5
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
BERT
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Encoder-only structure. The BERT family has many members, such as the distilled version DistilBERT and the variant RoBERTa.
Input embedding composition: each input vector is the sum of a token embedding, a segment embedding, and a position embedding.
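As a rough illustration of the input composition, here is a minimal sketch in plain Python. It assumes toy table sizes and random values; the names make_table and bert_input_embeddings, the token ids, and the dimension are all made up for this example and are not taken from any library. The real model also applies layer normalization and dropout after the sum, which this sketch leaves out.

```python
import random


def make_table(rows, dim, rng):
    """Build a toy embedding table with small random values (hypothetical helper)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]


def bert_input_embeddings(token_ids, segment_ids,
                          token_table, segment_table, position_table):
    """Sum token, segment, and position embeddings for each input position."""
    if len(token_ids) != len(segment_ids):
        raise ValueError("token_ids and segment_ids must have the same length")
    embeddings = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vec = [t + s + p for t, s, p in zip(token_table[tok],
                                            segment_table[seg],
                                            position_table[pos])]
        embeddings.append(vec)
    return embeddings


rng = random.Random(0)
DIM = 4                                   # toy hidden size (assumption)
token_table = make_table(10, DIM, rng)    # toy vocabulary of 10 ids
segment_table = make_table(2, DIM, rng)   # sentence A = 0, sentence B = 1
position_table = make_table(8, DIM, rng)  # toy maximum length of 8

# Made-up ids standing in for: [CLS] w1 w2 [SEP] w3 [SEP]
token_ids = [1, 5, 6, 2, 7, 2]
segment_ids = [0, 0, 0, 0, 1, 1]

embeddings = bert_input_embeddings(token_ids, segment_ids,
                                   token_table, segment_table, position_table)
print(len(embeddings), len(embeddings[0]))
```

The same token id (2, standing in for [SEP]) appears twice but gets a different input vector each time, because its segment and position embeddings differ.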

Transformer
The famous self-attention mechanism comes from this paper.
Attention Is All You Need
I have reproduced this model before: Transformer structure reproduction__attention is all you need (pytorch)
Encoder-decoder structure:
Attention module:
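The core of the attention module is scaled dot-product attention, Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V. The following is a minimal single-head sketch in plain Python, without masking, multiple heads, or learned projections; the function names and the toy Q, K, V values are assumptions made for illustration, not code from the paper or from the reproduction mentioned above.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Return (output, weights) for single-head attention.

    Q: list of query vectors, K: list of key vectors, V: list of value vectors.
    K and V must have the same number of rows.
    """
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors
        output.append([sum(wj * v[c] for wj, v in zip(w, V))
                       for c in range(len(V[0]))])
    return output, weights


# Toy inputs (assumed values): two positions, dimension 2
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

output, weights = scaled_dot_product_attention(Q, K, V)
for row in weights:
    print([round(x, 3) for x in row])
```

In self-attention, Q, K, and V are all derived from the same sequence, so each position's output is a mixture of every position's value vector, weighted by how well its query matches their keys.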