Deep learning vocabulary representation
2022-06-30 07:45:00 【Hair will grow again without it】
Word embeddings are a way of representing language that lets an algorithm automatically recognize that certain words are similar: man is to woman as king is to queen, and there are many other such examples. Using word embeddings, you can build NLP applications even when your model's labeled training set is relatively small.
Word representation
So far we have always represented words using a vocabulary. The vocabulary mentioned last week might contain 10,000 words, and we have used a one-hot vector to represent each word. For example, if man is the 5391st word in the dictionary, it is represented by a vector whose 5391st entry is 1 and all other entries are 0. We denote this vector O_5391, where O stands for one-hot. A big drawback of this representation is that it treats every word in isolation, so the algorithm does not generalize well across related words.
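As a minimal sketch of this representation (the vocabulary size 10,000 and the index 5391 come from the text; the small helper below is hypothetical, and uses 0-based indexing):

```python
import numpy as np

def one_hot(index, vocab_size=10000):
    """Return a one-hot vector: 1 at the given index, 0 everywhere else."""
    v = np.zeros(vocab_size)
    v[index] = 1.0
    return v

# O_5391 from the text: the vector for the word "man".
o_man = one_hot(5391)
print(o_man.sum())  # 1.0: exactly one nonzero entry
```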
For instance, suppose you have trained a language model. When it sees "I want a glass of orange ___", what will the next word be? Probably juice. But even if the learning algorithm has seen the sentence "I want a glass of orange juice", when it then sees "I want a glass of apple ___", it cannot exploit the fact that apple and orange are closely related, the way man and woman or king and queen are. So it is hard for the algorithm to use what it already knows, namely that orange juice is a common phrase, to infer that apple juice is also a common phrase. The reason is that the inner product of any two different one-hot vectors is 0. If you take the vectors for king and queen and compute their inner product, the result is 0; if you take apple and orange, the result is also 0. Since all these inner products are the same, there is no way to tell that apple and orange are more similar to each other than king and orange, or queen and orange.
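The zero-inner-product problem is easy to verify directly; a small sketch (the dictionary positions below are arbitrary choices for illustration, not real indices):

```python
import numpy as np

def one_hot(index, vocab_size=10000):
    v = np.zeros(vocab_size)
    v[index] = 1.0
    return v

# Hypothetical dictionary positions, chosen only for illustration.
king, queen = one_hot(4914), one_hot(7157)
apple, orange = one_hot(456), one_hot(6257)

# Every pair of distinct one-hot vectors has inner product 0,
# so the representation carries no notion of similarity.
print(np.dot(king, queen))    # 0.0
print(np.dot(apple, orange))  # 0.0
print(np.dot(king, orange))   # 0.0
```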
Instead of a one-hot representation, we can use a featurized representation for each word: for man, woman, king, queen, apple, orange, or any other word in the dictionary, we learn a set of features and their values.
For instance, we might want to know how each of these words relates to Gender. Suppose maleness is -1 and femaleness is +1; then the gender value of man might be -1 and that of woman +1. Empirically, king might end up at -0.95 and queen at +0.97, while apple and orange have no gender. Another feature could be how Royal these words are: man and woman have nothing to do with royalty, so their values are close to 0, whereas king and queen are highly royal, and apple and orange are not royal at all.
You can think of many such features. For illustration, suppose there are 300 different features. Then each word gets a list of numbers (I only wrote 4 here, but there are really 300), which together form a 300-dimensional vector representing the word man. I will denote it e_5391. Similarly, e_9853 denotes the 300-dimensional vector representing woman, and likewise for the other examples. Now, if we use this representation for apple and orange, their vectors are bound to be very similar. Some features will differ, because an orange's color and taste differ from an apple's, and some other features will differ too, but on the whole most of the features of apple and orange take similar values. As a result, an algorithm that already knows orange juice is likely to understand apple juice as well, so the algorithm generalizes better across different words.
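A toy version of this featurized representation (the feature values below echo the Gender and Royal examples above, plus two hypothetical extra features; they are illustrative numbers, not learned embeddings):

```python
import numpy as np

# 4 of the hypothetical 300 features: Gender, Royal, Age, Food.
features = {
    "man":    np.array([-1.00, 0.01, 0.03, 0.09]),
    "woman":  np.array([ 1.00, 0.02, 0.02, 0.01]),
    "king":   np.array([-0.95, 0.93, 0.70, 0.02]),
    "queen":  np.array([ 0.97, 0.95, 0.69, 0.01]),
    "apple":  np.array([ 0.00, 0.01, 0.03, 0.95]),
    "orange": np.array([ 0.01, 0.00, 0.02, 0.97]),
}

def cosine(a, b):
    """Cosine similarity: 1 means identical direction, 0 means unrelated."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Unlike one-hot vectors, apple and orange now look very similar.
print(cosine(features["apple"], features["orange"]))  # close to 1
print(cosine(features["apple"], features["king"]))    # much smaller
```

With this representation, the inner products are no longer all identical, which is exactly what lets the algorithm carry knowledge about orange juice over to apple juice.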
If we can learn 300-dimensional feature vectors, that is, 300-dimensional word embeddings, a common thing to do is to map this 300-dimensional data into a two-dimensional space so it can be visualized. A commonly used visualization algorithm is t-SNE, from a paper by Laurens van der Maaten and Geoffrey Hinton. If you look at such an embedded representation of words, you will find that man and woman end up close together, king and queen cluster together, and words for people cluster together as well. All the animals cluster together, the fruits cluster together, and numbers like 1, 2, 3, 4 also group together. And if you take the living things as a whole, they form a cluster too.
With this kind of word embedding algorithm, similar concepts learn similar features, so when the concepts are visualized, similar concepts end up mapped to similar feature vectors. This kind of representation, a representation in a 300-dimensional feature space, is called an embedding.
The reason it is called an embedding is that you can imagine a 300-dimensional space (which I cannot draw, so picture a 3-dimensional one as a substitute). Each word, for example orange, corresponds to a 300-dimensional feature vector, so the word is embedded at a point in this 300-dimensional space, and apple is embedded at another point in the same space. For visualization, the t-SNE algorithm maps this space to a lower-dimensional one so that you can draw a 2-dimensional picture and look at it. That is where the term embedding comes from.
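A minimal sketch of this visualization step using scikit-learn's TSNE (the random 300-dimensional matrix below is a stand-in for learned embeddings, so the resulting 2-D coordinates are not meaningful; only the shape of the pipeline is):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
words = ["man", "woman", "king", "queen", "apple", "orange"]
# Stand-in for learned 300-dimensional word embeddings.
embeddings = rng.normal(size=(len(words), 300))

# Map 300 dimensions down to 2 for plotting.
# perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=3.0, init="random", random_state=0)
points = tsne.fit_transform(embeddings)

for word, (x, y) in zip(words, points):
    print(f"{word}: ({x:.2f}, {y:.2f})")
```

With real learned embeddings, nearby points in the 2-D plot would correspond to the clusters described above (people together, fruits together, numbers together).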