Pytorch Summary - Automatic gradient
2022-06-29 09:21:00 【TJMtaotao】
PyTorch provides the autograd package, which automatically builds a computation graph from the forward pass on the input and then performs backpropagation to compute gradients.
Tensor is the core class of this package. If its attribute .requires_grad is set to True, it starts to track all operations performed on it (so that gradients can later be propagated back via the chain rule). Once the computation is finished, you can call .backward() to compute all gradients automatically; the gradient of this Tensor is accumulated into its .grad attribute.
Note that when calling y.backward(), if y is a scalar you do not need to pass any argument to backward(); otherwise you must pass a Tensor with the same shape as y.
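A minimal sketch of the two cases (my own example, not from the original post; the tensor shapes are arbitrary):

import torch

# A small sketch (assumed example): backward() on a scalar vs. a non-scalar output.
x = torch.ones(2, requires_grad=True)

s = (x * 2).sum()                # scalar output
s.backward()                     # no argument needed, same as s.backward(torch.tensor(1.))

y = x * 2                        # non-scalar output
y.backward(torch.ones_like(y))   # must pass a tensor with the same shape as y
print(x.grad)                    # tensor([4., 4.]) -- gradients of both calls accumulate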
If you do not want a tensor to be tracked, you can call .detach() to detach it from the tracking record, so that future computations on it are no longer tracked and gradients can no longer flow back through it. Alternatively, you can wrap the operations you do not want tracked in a with torch.no_grad() block. This is commonly used when evaluating a model, because during evaluation we do not need to compute gradients of the trainable parameters (those with requires_grad=True).
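As a small illustration (my own sketch, not from the original post), both mechanisms can be seen on a fresh tensor:

import torch

x = torch.ones(2, 2, requires_grad=True)

# .detach() returns a copy that is cut off from the computation graph
y = (x * 2).detach()
print(y.requires_grad, y.grad_fn)     # False None

# torch.no_grad() disables tracking for everything inside the block,
# the usual pattern when evaluating a model
with torch.no_grad():
    z = x * 2
print(z.requires_grad)                # False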
Function is another very important class. Tensor and Function together build a directed acyclic graph (DAG) that records the entire computation history. Each Tensor has a .grad_fn attribute that refers to the Function which created it: if the Tensor was produced by some operation, grad_fn is an object associated with that operation; if it was created directly by the user, grad_fn is None.
Tensor
Create a Tensor and set requires_grad=True:
import torch

x = torch.ones(2, 2, requires_grad=True)
print(x)
print(x.grad_fn)
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
None
Do another operation:
y = x + 2
print(y)
print(y.grad_fn)
tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward>)
<AddBackward object at 0x1100477b8>
Note that x was created directly, so it has no grad_fn, while y was created through an addition operation, so it has a grad_fn named <AddBackward>. A tensor created directly like x is called a leaf node; the grad_fn of a leaf node is None.
print(x.is_leaf, y.is_leaf) # True False
Slightly more complex operations:
z = y * y * 3
out = z.mean()
print(z, out)
tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward>) tensor(27., grad_fn=<MeanBackward1>)
Use .requires_grad_() to change the requires_grad attribute in place:
a = torch.randn(2, 2)  # requires_grad defaults to False
a = ((a * 3) / (a - 1))
print(a.requires_grad) # False
a.requires_grad_(True)
print(a.requires_grad) # True
b = (a * a).sum()
print(b.grad_fn)
False
True
<SumBackward0 object at 0x118f50cc0>
Gradient
Because out is a scalar, there is no need to specify a gradient argument when calling backward():
out.backward() # Equivalent to out.backward(torch.tensor(1.))
Let's look at the gradient of out with respect to x:
print(x.grad)
tensor([[4.5000, 4.5000],
        [4.5000, 4.5000]])
Let us denote out by $o$. Since

$$o = \frac{1}{4}\sum_{i=1}^{4} z_i, \qquad z_i = 3(x_i+2)^2, \qquad z_i\big|_{x_i=1} = 27,$$

we have

$$\frac{\partial o}{\partial x_i}\bigg|_{x_i=1} = \frac{3}{2}(x_i+2)\bigg|_{x_i=1} = \frac{9}{2} = 4.5,$$

so the output above is correct.

Mathematically, if $\vec{y} = f(\vec{x})$ is a function whose value and argument are both vectors, then the gradient of $\vec{y}$ with respect to $\vec{x}$ is a Jacobian matrix:

$$J = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{pmatrix}.$$

The torch.autograd package is used to compute products of such Jacobian matrices. For example, if $v$ is the gradient of a scalar function $l = g(\vec{y})$,

$$v = \begin{pmatrix} \frac{\partial l}{\partial y_1} & \cdots & \frac{\partial l}{\partial y_m} \end{pmatrix},$$

then according to the chain rule the Jacobian of $l$ with respect to $\vec{x}$ is:

$$vJ = \begin{pmatrix} \frac{\partial l}{\partial y_1} & \cdots & \frac{\partial l}{\partial y_m} \end{pmatrix} \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \cdots & \frac{\partial y_m}{\partial x_n} \end{pmatrix} = \begin{pmatrix} \frac{\partial l}{\partial x_1} & \cdots & \frac{\partial l}{\partial x_n} \end{pmatrix}.$$
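To make this concrete, here is a small sketch of my own (not from the original post): for f(x) = 2x the Jacobian is 2I, the tensor passed to backward() plays the role of v, and x.grad ends up holding vJ. The call to torch.autograd.functional.jacobian assumes PyTorch 1.5 or later.

import torch

# A small sketch (assumed example): backward(v) computes the vector-Jacobian
# product v @ J rather than the full Jacobian J itself.
def f(x):
    return 2 * x                      # elementwise, so J = 2 * I

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = f(x)

v = torch.tensor([1.0, 0.1, 0.01])    # plays the role of dl/dy
y.backward(v)
print(x.grad)                         # tensor([2.0000, 0.2000, 0.0200]) = v @ J

# For comparison, the full Jacobian (assuming torch.autograd.functional.jacobian
# is available):
J = torch.autograd.functional.jacobian(f, torch.tensor([1.0, 2.0, 3.0]))
print(v @ J)                          # same values as x.grad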

Note: grad is accumulated during backpropagation. This means that every time backpropagation is run, the new gradient is added to the previously stored gradient, so you generally need to zero the gradient before running backpropagation again.
Run backpropagation again and note that grad accumulates:
out2 = x.sum()
out2.backward()
print(x.grad)  # accumulates on top of the 4.5 from the previous backward pass

out3 = x.sum()
x.grad.data.zero_()  # clear the accumulated gradient first
out3.backward()
print(x.grad)  # now only the gradient of out3
tensor([[5.5000, 5.5000],
        [5.5000, 5.5000]])
tensor([[1., 1.],
        [1., 1.]])
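In actual training code the zeroing is usually done through the optimizer rather than on the tensor itself. A minimal sketch with a placeholder model, loss, and data (my own example, not from the original post):

import torch

# A minimal sketch (placeholder model/data): zero gradients before each
# backward pass so they do not accumulate across iterations.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

for _ in range(3):
    optimizer.zero_grad()                        # clear gradients from the previous iteration
    loss = loss_fn(model(inputs), targets)
    loss.backward()                              # accumulate fresh gradients into .grad
    optimizer.step()                             # update parameters using those gradients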