Some superficial understanding of word2vec
2022-07-07 10:29:00 【strawberry47】
Recently, a friend asked me what word2vec is all about, so I reviewed the relevant material and am writing down some of my thoughts so that I don't forget them.
word2vec is a method for obtaining word vectors. It is an improvement on NNLM (the neural network language model).
The training model is essentially a neural network with only one hidden layer.
It comes in two forms: ① skip-gram, which predicts the surrounding words from the middle word; ② CBOW, which predicts the middle word from the surrounding words.
Note that these two forms are just two different training methods. In both cases, the weights from the input layer to the hidden layer are what is finally taken as the word vectors.
During training, taking CBOW as an example, suppose the corpus is "今天的天气真好" ("The weather is really nice today"). The input to the model is the one-hot vectors of the six context characters "今 天 的 天 真 好", and the output is a probability for every entry in the vocabulary. We want the probability of the missing character "气" to be the largest.
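The CBOW forward pass described above can be sketched in plain Python. This is only an illustration under assumptions: the corpus is taken to be the seven-character sentence 今天的天气真好 with 气 as the center character, and the hidden size, the random initialization, and the names `W_in`, `W_out`, and `cbow_forward` are all made up for this example. The weights are untrained, so the probabilities are close to uniform; training would adjust them until 气 gets the highest probability.

```python
import math
import random

# Toy corpus, split into single characters (assumed example sentence).
corpus = list("今天的天气真好")
vocab = sorted(set(corpus))                  # 6 distinct characters
index = {ch: i for i, ch in enumerate(vocab)}
V, N = len(vocab), 4                         # vocabulary size, hidden size (arbitrary)

rng = random.Random(0)
# W_in: input -> hidden weights (V x N). Its rows are the word vectors.
W_in = [[rng.uniform(-0.5, 0.5) for _ in range(N)] for _ in range(V)]
# W_out: hidden -> output weights (N x V).
W_out = [[rng.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(N)]


def cbow_forward(context):
    """Return P(center | context) for every vocabulary entry."""
    # Multiplying a one-hot vector by W_in simply selects one row,
    # so the hidden layer is the average of the context rows.
    rows = [W_in[index[ch]] for ch in context]
    hidden = [sum(col) / len(rows) for col in zip(*rows)]
    # One score per vocabulary entry, then softmax to get probabilities.
    scores = [sum(hidden[k] * W_out[k][j] for k in range(N)) for j in range(V)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Context = every character except the center character "气" (position 4).
context = [ch for i, ch in enumerate(corpus) if i != 4]
probs = cbow_forward(context)
for ch, p in zip(vocab, probs):
    print(ch, round(p, 3))
```

After training, the rows of `W_in` are what would be kept as the word vectors.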
When writing code, one usually calls the gensim library, which trains the word vectors directly from an input corpus.
Some training tricks: negative sampling and hierarchical softmax (which is built on a Huffman tree).
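To give a rough idea of negative sampling: instead of a softmax over the whole vocabulary, the model scores the true word against a handful of randomly drawn "negative" words. The sketch below computes that loss for one example. The helper names, the vector size, and the toy counts are assumptions for illustration; the 0.75 exponent on the word counts follows the original word2vec formulation.

```python
import math
import random
from collections import Counter


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sample_negatives(counts, target, k, rng):
    """Draw k words other than target, with probability proportional to count ** 0.75."""
    words = [w for w in counts if w != target]
    weights = [counts[w] ** 0.75 for w in words]
    return rng.choices(words, weights=weights, k=k)


def negative_sampling_loss(hidden, out_vecs, target, negatives):
    """-log sigmoid(h . v_target) - sum over negatives of log sigmoid(-h . v_neg)."""
    loss = -math.log(sigmoid(dot(hidden, out_vecs[target])))
    for w in negatives:
        loss -= math.log(sigmoid(-dot(hidden, out_vecs[w])))
    return loss


rng = random.Random(0)
counts = Counter("今天的天气真好")    # toy character counts
# Randomly initialized output vectors and hidden state (illustrative only).
out_vecs = {w: [rng.uniform(-0.5, 0.5) for _ in range(4)] for w in counts}
hidden = [rng.uniform(-0.5, 0.5) for _ in range(4)]

negatives = sample_negatives(counts, "气", k=3, rng=rng)
loss = negative_sampling_loss(hidden, out_vecs, "气", negatives)
print(negatives, loss)
```

Each update then touches only the target and the `k` sampled words, rather than every output vector in the vocabulary.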
References: "[NLP] Understanding the essence of Word2vec word vectors"; "A summary of word2vec" (a blog post written by a senior labmate)