
Cheat Sheet of Common NLP Backbone Models (1)

2022-08-02 22:43:00 [Andy Dennis]

Since the Transformer appeared in 2017, it has shown up in nearly every major line of NLP work. Stanford has even launched a course dedicated to it, CS25: [Stanford] CS25 Transformers United | Fall 2021

If you are just getting started with NLP, you may want to read an earlier article of mine: 研0_NLPer启程

For each model covered here, you can browse transformers/models (github) in Hugging Face's transformers library, find the matching model, and read its source implementation.

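Today the mainstream approach is dynamic, context-aware word embeddings; static embedding lookups from word2vec or GloVe vocabularies are rarely used anymore.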
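To make the static vs. dynamic distinction concrete, here is a minimal toy sketch. The 2-d vectors are made up, and `contextual_embed` is a hypothetical stand-in that simply mixes each token with the sentence mean; it is not how BERT or any real encoder computes representations. It only shows that a static lookup gives "bank" the same vector everywhere, while a context-aware encoding does not.

```python
# Toy static embedding table; the 2-d vectors are invented for illustration.
STATIC = {
    "river": [1.0, 0.0],
    "bank": [0.5, 0.5],
    "money": [0.0, 1.0],
}


def static_embed(tokens):
    """Static lookup: each token maps to one fixed vector, whatever its context."""
    return [STATIC[t] for t in tokens]


def contextual_embed(tokens):
    """Toy stand-in for a contextual encoder (assumption, not a real model):
    blend each token vector with the mean vector of its sentence."""
    vecs = static_embed(tokens)
    dim = len(vecs[0])
    mean = [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]
    return [[0.5 * v[d] + 0.5 * mean[d] for d in range(dim)] for v in vecs]


s1 = ["river", "bank"]
s2 = ["money", "bank"]

# "bank" is at index 1 in both sentences.
static_same = static_embed(s1)[1] == static_embed(s2)[1]
contextual_same = contextual_embed(s1)[1] == contextual_embed(s2)[1]
print("static vectors equal:", static_same)
print("contextual vectors equal:", contextual_same)
```

In a real setting the contextual vectors would come from a pretrained encoder rather than from this averaging trick.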

BERT uses an encoder-only architecture. The BERT family is large, including the distilled version DistilBERT and variants such as RoBERTa.

Composition of the input embeddings:
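Assuming this refers to BERT's input representation, each input vector is the element-wise sum of a token embedding, a segment embedding, and a position embedding. The sketch below illustrates that sum with tiny random tables; the vocabulary, the sizes, and the function name `bert_input_embeddings` are hypothetical, and the LayerNorm and dropout that BERT applies after the sum are left out.

```python
import random

random.seed(0)
HIDDEN = 4  # toy hidden size; BERT-base uses 768


def make_table(rows, dim=HIDDEN):
    """Build a random embedding table with `rows` vectors of length `dim`."""
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(rows)]


# Hypothetical tiny vocabulary, for illustration only.
VOCAB = {"[CLS]": 0, "[SEP]": 1, "hello": 2, "world": 3}
token_table = make_table(len(VOCAB))
segment_table = make_table(2)   # sentence A (0) / sentence B (1)
position_table = make_table(8)  # toy maximum sequence length


def bert_input_embeddings(tokens, segment_ids):
    """Sum token, segment, and position embeddings for every position."""
    out = []
    for pos, (tok, seg) in enumerate(zip(tokens, segment_ids)):
        t = token_table[VOCAB[tok]]
        s = segment_table[seg]
        p = position_table[pos]
        out.append([a + b + c for a, b, c in zip(t, s, p)])
    return out


tokens = ["[CLS]", "hello", "[SEP]", "world", "[SEP]"]
segment_ids = [0, 0, 0, 1, 1]
embeddings = bert_input_embeddings(tokens, segment_ids)
print(len(embeddings), len(embeddings[0]))
```

Note that the two `[SEP]` tokens share a token embedding but end up with different input vectors, because their segment and position embeddings differ.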

The Transformer, built around the now-famous self-attention mechanism, was introduced in this paper:
Attention Is All You Need

Encoder-decoder architecture:

Attention module:
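As a rough illustration of the attention module, the paper's scaled dot-product attention computes softmax(QK^T / sqrt(d_k)) V. Below is a minimal single-head sketch in plain Python, without masking, multi-head projections, or batching; the function names and the toy Q, K, V values are assumptions chosen for readability. Real implementations use a tensor library.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V are lists of row vectors; K and V must have the same number of rows.
    Returns (output rows, attention weight rows).
    """
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        output.append([sum(wj * v[d] for wj, v in zip(w, V)) for d in range(len(V[0]))])
    return output, weights


# Toy inputs: two queries, two keys, two values.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, attn = scaled_dot_product_attention(Q, K, V)
print(attn)
print(out)
```

In self-attention, Q, K, and V are all linear projections of the same input sequence, so every token attends to every other token in that sequence.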

Copyright notice
This article was written by [Andy Dennis]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/weixin_43850253/article/details/126070768