Learning notes [Gumbel softmax]
2022-07-01 19:24:00 【hei_ hei_ hei_】
gumbel softmax
Used to handle the case where argmax is not differentiable.
Solution: introduce the Gumbel distribution. The forward pass uses argmax, while the backward pass computes the gradient through gumbel_softmax (this is the hard=True, straight-through case).
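The idea behind adding Gumbel noise can be checked numerically. Below is a minimal sketch of the Gumbel-max trick in plain Python: taking argmax over logits plus Gumbel(0, 1) noise should draw indices with the same probabilities as softmax(logits). The helper names (sample_gumbel, gumbel_max_sample, softmax) and the example logits are made up for illustration and are not from the original post.

```python
import math
import random
from collections import Counter


def sample_gumbel(rng: random.Random) -> float:
    """Draw one Gumbel(0, 1) sample via inverse transform: -log(-log(U))."""
    u = rng.random()
    # Clamp away from 0 and 1 so that both logarithms stay finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def gumbel_max_sample(logits, rng):
    """Return argmax_i (logits[i] + g_i), with g_i ~ Gumbel(0, 1)."""
    noisy = [l + sample_gumbel(rng) for l in logits]
    return max(range(len(logits)), key=noisy.__getitem__)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
logits = [1.0, 2.0, 0.5]  # illustrative values
n = 50_000

counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n))
empirical = [counts[i] / n for i in range(len(logits))]
expected = softmax(logits)

print("empirical:", [round(p, 3) for p in empirical])
print("softmax  :", [round(p, 3) for p in expected])
```

The two printed lists should agree closely. The argmax itself is still not differentiable, which is why the softmax relaxation is used for the gradient.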
Code
import torch
from torch import Tensor

def gumbel_softmax(logits: Tensor, tau: float = 1, hard: bool = False, eps: float = 1e-10, dim: int = -1) -> Tensor:
    ...
    # -log(E) with E ~ Exponential(1) is a Gumbel(0, 1) sample.
    gumbels = (
        -torch.empty_like(logits, memory_format=torch.legacy_contiguous_format).exponential_().log()
    )  # ~Gumbel(0,1)
    gumbels = (logits + gumbels) / tau  # ~Gumbel(logits,tau)
    y_soft = gumbels.softmax(dim)

    if hard:
        # Straight through.
        index = y_soft.max(dim, keepdim=True)[1]
        y_hard = torch.zeros_like(logits, memory_format=torch.legacy_contiguous_format).scatter_(dim, index, 1.0)
        # Forward value equals y_hard; the gradient flows only through y_soft.
        ret = y_hard - y_soft.detach() + y_soft
    else:
        # Reparametrization trick.
        ret = y_soft
    return ret
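As a usage sketch, the snippet below calls torch.nn.functional.gumbel_softmax with hard=True and checks the two properties described above: the forward output is one-hot, and gradients still reach the logits. The tensor shapes, the weights vector, and the toy loss are assumptions chosen only for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical batch of 4 samples with 5 categories each.
logits = torch.randn(4, 5, requires_grad=True)

# hard=True: one-hot in the forward pass, soft gradients in the backward pass.
y = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)

# Toy downstream loss (an assumption): a weighted sum of the selected categories.
weights = torch.arange(5, dtype=torch.float32)
loss = (y * weights).sum()
loss.backward()

# Each row should contain a single entry (numerically) equal to 1.
rows_sum_to_one = torch.allclose(y.sum(dim=-1), torch.ones(4))
max_is_one = torch.allclose(y.max(dim=-1).values, torch.ones(4))

# The gradient exists and is nonzero even though the forward pass used argmax.
has_grad = logits.grad is not None and bool(logits.grad.abs().sum() > 0)

print(y)
print(logits.grad)
```

With hard=False the same call returns the soft sample y_soft directly, so the output rows are probability vectors rather than one-hot vectors.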
gumbel_softmax introduces a temperature t (the tau argument in the code). The smaller t is, the closer the softmax output gets to one-hot. For training stability, t is usually set to a larger value at first and then gradually decreased.
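The effect of the temperature can be seen with a small plain-Python sketch. The exponential schedule anneal_tau, its default values (tau_start, tau_min, decay), and the example logits are hypothetical; the original post only says that the temperature starts large and shrinks gradually, without giving a schedule.

```python
import math


def softmax_with_temperature(logits, tau):
    """Softmax of logits / tau, computed in a numerically stable way."""
    scaled = [l / tau for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def anneal_tau(step, tau_start=5.0, tau_min=0.5, decay=1e-3):
    """Hypothetical schedule: exponential decay from tau_start, floored at tau_min."""
    return max(tau_min, tau_start * math.exp(-decay * step))


logits = [1.0, 2.0, 0.5]  # illustrative values

for step in (0, 1000, 5000):
    tau = anneal_tau(step)
    probs = softmax_with_temperature(logits, tau)
    print(f"step={step:5d} tau={tau:.3f} probs={[round(p, 3) for p in probs]}")
```

As tau decreases, the largest probability moves toward 1 and the output approaches a one-hot vector. With a large tau the output is close to uniform, which gives smoother gradients early in training.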
The content is reproduced from gumbel softmax