
YYDS Dry Goods Inventory: Embedding Matrix

2022-07-05 03:04:00 LolitaAnn

This article is my shared notes; the content mainly comes from Andrew Ng's deep learning course. [1]


Embedding matrix

We talked about word embeddings earlier. In this section, let's talk about the embedding matrix.

During learning, what we actually do is turn that pile of one-hot vectors into an embedding matrix.

That is to say, what you ultimately learn is an embedding matrix, not a separate vector for each word.

Let's continue our earlier example. Suppose your vocabulary has 10,000 words, and you choose 300 features for your word embedding.

Then what you finally get is a 300×10000 matrix.

$$
\begin{array}{c|cc}
 & \begin{array}{c}\text{Man}\\(5391)\end{array} & \begin{array}{c}\text{Woman}\\(9853)\end{array} \\
\hline
\text{Gender} & -1 & 1 \\
\text{Royal} & 0.01 & 0.02 \\
\text{Age} & 0.03 & 0.02 \\
\text{Food} & 0.04 & 0.01 \\
\ldots & \ldots & \ldots
\end{array}
$$


We call this big word-embedding matrix $E$.
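To make the shapes concrete, here is a minimal NumPy sketch of $E$ (not from the course; `vocab_size`, `embed_dim`, and the random values are placeholders, since in a real model $E$ is learned during training):

```python
import numpy as np

# Hypothetical sizes taken from the running example in this article.
vocab_size = 10_000   # number of words in the vocabulary
embed_dim = 300       # number of learned features per word

# In a real model the entries of E are learned during training;
# random values are used here only to make the shapes concrete.
rng = np.random.default_rng(0)
E = rng.normal(size=(embed_dim, vocab_size))   # shape (300, 10000)

# Column j of E is the e vector (the embedding) of word j.
print(E.shape)   # (300, 10000)
```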

Getting a word's $e$ vector from the embedding matrix

Method 1

Recall our one-hot vector: its length equals the size of the vocabulary, the entry at the word's position in the vocabulary is 1, and every other entry is 0.

So to get the word embedding of a particular word, we just multiply the embedding matrix by that word's one-hot vector.

$$
[E]_{300\times10000} \times [o_{5391}]_{10000 \times 1} = [e_{5391}]_{300 \times 1}
$$

This gives us the $e$ vector of the word at the corresponding position.

Anyone who has studied linear algebra should find the formula above easy to follow.

Let's take a simple example .

$$
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \times \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
$$
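Here is a small NumPy sketch of Method 1, reusing the matrix `E` from the snippet above; the index 5391 and the name `o_5391` follow the article's example:

```python
# Method 1: recover e_5391 by multiplying E with a one-hot vector.
# (Continues the sketch above; 5391 is the "Man" index from the example.)
o_5391 = np.zeros(vocab_size)
o_5391[5391] = 1.0

e_5391 = E @ o_5391          # (300, 10000) @ (10000,) -> (300,)

# The product is exactly column 5391 of E.
assert np.allclose(e_5391, E[:, 5391])
```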

Method 2

In theory the method above works, but in practice it is not used this way. In practice, we usually use a lookup function that goes straight to the embedding matrix and fetches the vector at the corresponding position.

In practice, you will almost never work with a vocabulary of only 10,000 words, because that is actually quite small. So you will be dealing with a very large matrix, and multiplying a large matrix by a very long one-hot vector carries a huge computational overhead.
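For contrast, Method 2 amounts to a direct column lookup, as in this minimal sketch (again reusing `E` from above; the framework functions named in the comment are common examples, not something the article specifies):

```python
# Method 2: index the column directly instead of multiplying.
e_5391 = E[:, 5391]   # copies 300 numbers; no 10,000-term dot products

# Frameworks expose this lookup as a built-in, e.g. torch.nn.Embedding
# in PyTorch or tf.nn.embedding_lookup in TensorFlow.
```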


1. AI China official website - the world's leading online AI education and practice platform (deeplearningai.net)

Original site

Copyright notice: This article was created by [LolitaAnn]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/02/202202140823500386.html