Four kinds of hooks in deep learning
2022-06-13 08:51:00 【Human high quality Algorithm Engineer】
To save GPU memory, PyTorch does not keep intermediate variables during computation, including intermediate-layer feature maps and the gradients of non-leaf tensors. When analyzing a network, you sometimes need to inspect or modify these intermediate variables, which requires registering a hook to export them. There are many introductions to this online, but many are inaccurate or hard to follow, so this post summarizes the actual usage and caveats.
There are four kinds of hooks:
torch.Tensor.register_hook()
torch.nn.Module.register_forward_hook()
torch.nn.Module.register_backward_hook()
torch.nn.Module.register_forward_pre_hook()
1. torch.Tensor.register_hook()
import torch

def grad_hook(grad):
    # The hook receives the gradient of x and scales it in place
    grad *= 1.6

x = torch.tensor([1., 1., 1., 1.], requires_grad=True)
y = torch.pow(x, 2)
z = torch.sum(y)
h = x.register_hook(grad_hook)
z.backward()
print(x.grad)
h.remove()  # removes the hook
The result is:
tensor([3.2000, 3.2000, 3.2000, 3.2000])
import torch

def grad_hook(grad):
    # Scale the incoming gradient by 50 in place
    grad *= 50

x = torch.tensor([2., 2., 2., 2.], requires_grad=True)
y = torch.pow(x, 2)
z = torch.mean(y)
h = x.register_hook(grad_hook)
z.backward()
print(x.grad)
h.remove()  # removes the hook
The result is:
tensor([50., 50., 50., 50.])
How are these values computed? During backpropagation, the hook receives the gradient of x and can change its value. In the first example, z = sum(x_i^2), so dz/dx_i = 2 * x_i = 2; the hook multiplies this by 1.6, giving 3.2. In the second example, z = mean(x_i^2) over 4 elements, so dz/dx_i = 2 * x_i / 4 = 1; the hook multiplies it by 50, giving 50.
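Instead of modifying grad in place, a tensor hook can also return a new tensor, which PyTorch then uses as the gradient going forward; this is the pattern the official documentation recommends. A minimal sketch of the first example in that style:

import torch

def grad_hook(grad):
    # Return a new tensor; PyTorch replaces the gradient with it
    return grad * 1.6

x = torch.tensor([1., 1., 1., 1.], requires_grad=True)
z = torch.sum(x ** 2)
h = x.register_hook(grad_hook)
z.backward()
print(x.grad)  # tensor([3.2000, 3.2000, 3.2000, 3.2000])
h.remove()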
Note:
The hook can be cancelled with remove(). Be careful that remove() must come after backward(): gradients are only computed when the backward() statement runs, and x.register_hook(grad_hook) merely registers a hook on the gradient without computing anything. If you call remove() first and then backward(), the hook has no effect.
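A quick sketch of this ordering issue, reusing the grad_hook from the first example:

import torch

def grad_hook(grad):
    grad *= 1.6

x = torch.tensor([1., 1., 1., 1.], requires_grad=True)
z = torch.sum(x ** 2)
h = x.register_hook(grad_hook)
h.remove()     # removed before backward(), so the hook never fires
z.backward()
print(x.grad)  # tensor([2., 2., 2., 2.]) -- the unscaled gradient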
2. torch.nn.Module.register_forward_hook()
The registered hook has the signature hook(module, input, output). It is used to export the input and output tensors of a specified submodule (a layer, a block, or any nn.Module), but only the output can be modified. It is often used to export or modify convolutional feature maps.
# `net` and `input` are assumed to be an existing model and its input batch
inps, outs = [], []

def layer_hook(module, inp, out):
    inps.append(inp[0].data.cpu().numpy())
    outs.append(out.data.cpu().numpy())

hook = net.layer1.register_forward_hook(layer_hook)
output = net(input)
hook.remove()
Note:
(1) Because a module can take multiple inputs, the input is a tuple; you need to extract the tensor from it before operating on it. The output is a plain Tensor and can be used directly.
(2) After exporting, do not keep the tensors in GPU memory (move them to the CPU, as above), unless you have an A100.
(3) Only the output out can be modified; the input inp cannot (returning it has no effect, and modifying it locally is also invalid). To modify the output, return it from the hook, e.g.:
def layer_hook(self, module, inp, out):
    # Mix the feature map with a shuffled copy of itself
    out = self.lam * out + (1 - self.lam) * out[self.indices]
    return out
This code comes from manifold mixup, where the features of an intermediate layer are mixed for data augmentation; self.lam is a probability value in [0, 1], and self.indices is the shuffled index order.
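A minimal self-contained sketch of this pattern, assuming a hypothetical toy model with a layer1 submodule; lam and indices are the names carried over from the snippet above:

import torch
import torch.nn as nn

class MixupNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(8, 8)
        self.layer2 = nn.Linear(8, 2)
        self.lam = 0.7                    # mixing coefficient in [0, 1]
        self.indices = torch.randperm(4)  # shuffled batch indices

    def layer_hook(self, module, inp, out):
        # Mix each sample's features with those of a shuffled sample
        return self.lam * out + (1 - self.lam) * out[self.indices]

    def forward(self, x):
        return self.layer2(self.layer1(x))

net = MixupNet()
hook = net.layer1.register_forward_hook(net.layer_hook)
output = net(torch.randn(4, 8))  # batch of 4 samples
hook.remove()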
3. torch.nn.Module.register_forward_pre_hook()
The registered hook has the signature hook(module, input). It is used to export or modify the input tensor of a specified submodule.
def pre_hook(module, inp):
    inp0 = inp[0]
    inp0 = inp0 * 2
    inp = tuple([inp0])
    return inp

# `net` and `input` are assumed to be an existing model and its input batch
hook = net.layer1.register_forward_pre_hook(pre_hook)
output = net(input)
hook.remove()
Note:
(1) inp is a tuple, so you must first extract the tensor, operate on it, and then convert it back to a tuple before returning it.
(2) The hook is only called when output = net(input) executes, so remove() should be placed after that call to cancel the hook.
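A minimal runnable sketch, using a hypothetical one-layer model to show that the pre-hook really doubles the input before the layer sees it:

import torch
import torch.nn as nn

def pre_hook(module, inp):
    # inp is a tuple; unpack, modify, and repack it
    return (inp[0] * 2,)

net = nn.Sequential(nn.Linear(4, 4))
x = torch.ones(1, 4)

plain = net(x)                                    # without the hook
hook = net[0].register_forward_pre_hook(pre_hook)
doubled = net(x)                                  # layer sees x * 2
hook.remove()

# The difference equals applying the weight to the extra x
print(torch.allclose(doubled - plain, x @ net[0].weight.t()))  # True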
4. torch.nn.Module.register_backward_hook()
The registered hook has the signature hook(module, grad_input, grad_output). It is used to export the gradients of the input and output tensors of a specified submodule, but only the gradient of the input can be modified (i.e., only gin can be returned); the gradient of the output cannot be modified.
gouts = []

def backward_hook(module, gin, gout):
    print(len(gin), len(gout))
    gouts.append(gout[0].data.cpu().numpy())
    # Here gin happens to contain three gradients;
    # its exact contents depend on the module type
    gin0, gin1, gin2 = gin
    gin1 = gin1 * 2
    gin2 = gin2 * 3
    gin = tuple([gin0, gin1, gin2])
    return gin

hook = net.layer1.register_backward_hook(backward_hook)
loss.backward()
hook.remove()
Note:
(1) Both grad_in and grad_out are tuples and must be unpacked first; after modifying them, pack them back into a tuple and return it.
(2) This hook is called when the backward() statement executes, so remove() should be placed after backward() to cancel the hook.
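Note that in recent PyTorch versions register_backward_hook is deprecated because it can report incorrect gradients on modules composed of multiple operations; register_full_backward_hook is the recommended replacement, and its hook has the same (module, grad_input, grad_output) signature. A minimal sketch, assuming a toy linear layer:

import torch
import torch.nn as nn

gouts = []

def backward_hook(module, gin, gout):
    gouts.append(gout[0].detach().cpu().numpy())
    # With full backward hooks, grad_input contains gradients
    # w.r.t. the module's inputs only (not weights or bias)
    return tuple(g * 2 if g is not None else None for g in gin)

layer = nn.Linear(4, 4)
hook = layer.register_full_backward_hook(backward_hook)
x = torch.randn(2, 4, requires_grad=True)
loss = layer(x).sum()
loss.backward()
hook.remove()
print(x.grad.shape)  # gradients w.r.t. x, scaled by 2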