NCE Explained: An Approximation to softmax
2022-07-03 03:13:00 【ChaoFeiLi】
I came across NCE loss while studying contrastive learning. I was worried the original blog post might disappear, so I am recording it here.
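Classification problems in deep learning almost always involve computing a softmax. When the number of target classes is small, the standard softmax formula can be evaluated directly. When the number of target classes is very large, the normalization term in softmax becomes expensive, and an approximation method is needed to simplify its computation.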
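Using the language model from natural language processing as the running example, this post explains NCE, a sampling-based approximation to softmax, from theory to practice.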
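To make the cost of the normalization term concrete, here is a minimal sketch in plain Python. The class counts (10 and 50,000) and the random logits are illustrative assumptions, not values from the original post. The point is that the denominator sums over every class, so the work for one prediction grows linearly with the number of classes.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)                            # subtract the max to avoid overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)                          # normalization: one term per class
    return [e / total for e in exps]


random.seed(0)

# Hypothetical logits: a small label set and a vocabulary-sized label set.
small_logits = [random.gauss(0.0, 1.0) for _ in range(10)]
large_logits = [random.gauss(0.0, 1.0) for _ in range(50_000)]

p_small = softmax(small_logits)   # sums 10 exponentials
p_large = softmax(large_logits)   # sums 50,000 exponentials for a single prediction
```

In a language model the class set is the vocabulary, so this sum is repeated for every predicted word, which is the cost that sampling-based approximations try to avoid.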
Theory Review
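Logistic regression and softmax regression are two basic classification models, and both are linear models. The former mainly handles binary classification, the latter multi-class classification. In fact, softmax regression is the generalization of logistic regression to more than two classes.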
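The relationship between the two models can be checked numerically. The sketch below assumes two hypothetical class scores z0 and z1 and shows that a two-class softmax gives the same probability as a sigmoid applied to the score difference.

```python
import math


def sigmoid(z):
    """Logistic function used by logistic regression."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for class 0 and class 1.
z0, z1 = 0.3, 1.7

p_softmax = softmax([z0, z1])[1]   # probability of class 1 under softmax
p_sigmoid = sigmoid(z1 - z0)       # same probability via the logistic function
```

Dividing the numerator and denominator of the two-class softmax by exp(z1) leaves 1 / (1 + exp(-(z1 - z0))), which is why the two values agree.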
Logistic Regression
The logistic regression model (function/hypothesis) is:
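For reference, the standard textbook form of the logistic regression hypothesis is written out below. The symbols (parameter vector \theta, input x, label y, hypothesis h_\theta) are assumed notation and may differ from the notation of the original post.

```latex
h_\theta(x) = \sigma(\theta^\top x) = \frac{1}{1 + e^{-\theta^\top x}},
\qquad
P(y = 1 \mid x; \theta) = h_\theta(x),
\qquad
P(y = 0 \mid x; \theta) = 1 - h_\theta(x)
```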
Softmax Regression
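For reference, the standard textbook form of the softmax regression model for K classes is written out below. The symbols (one parameter vector \theta_k per class, input x, label y) are assumed notation and may differ from the notation of the original post.

```latex
P(y = k \mid x; \theta) = \frac{\exp(\theta_k^\top x)}{\sum_{j=1}^{K} \exp(\theta_j^\top x)},
\qquad k = 1, \dots, K
```

The sum in the denominator runs over all K classes, and it is this normalization term that becomes expensive when K is very large.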