当前位置:网站首页>[pytorch] pytorch automatic derivation, Tensor and Autograd
[pytorch] pytorch automatic derivation, Tensor and Autograd
2022-07-31 16:25:00 【Enzo wants to smash the computer】
在神经网络中,一个重要内容就是进行参数学习,而参数学习离不开求导.
现在大部分深度学习架构都有自动求导的功能,torch.autograd包 就是用来自动求导的.
torch.autograd Packages are on tensors 所有的操作 提供了自动求导功能
Learn and record this one 自动求导 的要点.
一、计算图
throughout the forward calculation process,PyTorch采用计算图的形式进行组织,The calculation diagram is动态图,and every time 前向传播时,将重新构建.其他深度学习架构,如TensorFlow、Keras一般为静态图.

- 计算图是一种有向无环图像,Graphically represent the relationship between operators and variables,直观高效.
- 图中 圆形表示变量,矩阵表示算子
- 表达式:z=wx+b,可写成两个表示式: y=wx,z=y+b,
- 其中x、w、b为变量,是用户创建的变量,不依赖于其他变量,故又称 为叶子节点.为计算各叶子节点的梯度,需要把对应的张量参数requires_grad属性设置为 True,这样就可自动跟踪其历史记录.(后面会细说)
- y、z 是计算得到的变量,非叶子节点,z为根节点
- mul和add是算子(或操作或函数)
These variables and operators constitute a complete calculation process (或前向传播过程)
二、自动求导要点
为实现对Tensor自动求导,需考虑如下事项:
1)创建叶子节点(Leaf Node)的Tensor,使用requires_gradThe parameter specifies whether to log it 的操作,以便之后利用backward()方法进行梯度求解.requires_grad参数的缺省值为 False,如果要对其求导需设置为True,Then the node that has a dependency on it automatically becomesTrue.
2)可利用requires_grad_()方法修改Tensor的requires_grad属性(For example, at the beginning of the training phase,requires_grad 值设置为了True,Modified in the testing phase to False).可以调用.detach()或 with torch.no_grad():,将不再计算张量的梯度,跟踪张量的历史记录.这点在Evaluation mode 型、Test model stage中常常用到.
3)通过created by the operationTensor(即非叶子节点),会automatically assignedgrad_fn属性.the property sheet shows the gradient function.叶子节点的grad_fn为None.
4)最后得到的Tensor(根节点)执行backward()函数,此时自动计算各变量的梯度.
- Each backpropagation ends,The gradient of leaf nodes will be cleared.If multiple backpropagation gradient accumulations are required,需要指定backward 中的参数retain_graph=True,In this way, the gradients of the child nodes are cumulative.
- 非叶子节点的梯度backward调用后即被清空
5)backward()函数接收参数,该参数应和调用backward()函数的Tensor的维度相同, 或者是可broadcast的维度.如果求导的Tensor为标量(即一个数字),则backward中的参数可省略.
三、Calculation of scalar backpropagation

- 假设x、w、b都是标量,则计算结果 z 也是标量 (z=wx+b)
- 对根节点z调用backward()方法,我们无须对 backward()传入参数
* 这里先提一嘴,It will be mentioned later: 如果目标张量对一个非标量调用backward(),则需要传入一个 gradient参数,该参数也是张量,而且需要与调用backward()的张量形状相同.
以下是实现自动求导的主要步骤:
import torch
# 输入张量 x
x = torch.Tensor([2])
# 初始化 权重参数w, 偏移量b,并设置 require_grad 属性为 True, 为自动求导
w = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)
# Implement forward propagation
y = torch.mul(w, x)
z = torch.add(y, b)
# View leaf nodes separately x, w, b 和 非叶子节点 y、z 的require_grad属性
print(x.requires_grad, w.requires_grad, b.requires_grad) # False True True
print(y.requires_grad, z.requires_grad ) # True True
# 查看各节点是否为叶子节点
print(x.is_leaf, w.is_leaf, b.is_leaf, y.is_leaf, z.is_leaf) # True True True False False
# 分别查看 叶子节点 和 非叶子节点 的 grad_fn 属性
print(x.grad_fn, w.grad_fn, b.grad_fn) # None None None
print(y.grad_fn, z.grad_fn) # <MulBackward0 object at 0x7f8ac1303910> <AddBackward0 object at 0x7f8ac1303070>
z.backward() # Gradients do not accumulate
# z.backward(retain_graph=True) # 如果多次使用backward,Gradient accumulation is required,则需要修改参数retain_graph为True
# 查看叶子节点的梯度,x是叶子节点但它无须求导,故其梯度为None
print(w.grad,b.grad,x.grad) # tensor([2.]) tensor([1.]) None
#非叶子节点的梯度,执行backward之后,会自动清空
print(y.grad,z.grad) # None None
四、Computation of nonscalar backpropagation
边栏推荐
- SHELL内外置命令
- The 2nd China PWA Developer Day
- After the form is submitted, the page does not jump [easy to understand]
- 上传图片-微信小程序(那些年的坑记录2022.4)
- 基于ABP实现DDD
- 【7.28】代码源 - 【Fence Painting】【合适数对(数据加强版)】
- Small program: Matlab solves differential equations "recommended collection"
- 牛客网刷题(二)
- Single-cell sequencing workflow (single-cell RNA sequencing)
- 更新数据表update
猜你喜欢

研发过程中的文档管理与工具

基于C语言的编译器设计与实现

.NET 20th Anniversary Interview - Zhang Shanyou: How .NET technology empowers and changes the world

Foreign media right, apple on May be true in inventory

Unity 之 图集属性详解和代码示例 -- 拓展一键自动打包图集工具

gerrit中如何切换远程服务器

type of timer

【C语言】LeetCode27.移除元素

Implementing DDD based on ABP

Qt实战案例(54)——利用QPixmap设计图片透明度
随机推荐
.NET 20th Anniversary Interview - Zhang Shanyou: How .NET technology empowers and changes the world
Use of radiobutton
what exactly is json (c# json)
OPPO在FaaS领域的探索与思考
After Grafana is installed, the web opens and reports an error
Flutter 获取状态栏statusbar的高度
复制延迟案例(1)-最终一致性
What is the difference between BI software in the domestic market?
Summary of the implementation method of string inversion "recommended collection"
牛客 HJ18 识别有效的IP地址和掩码并进行分类统计
7. Summary of common interview questions
Visualize GraphQL schemas with GraphiQL
牛客网刷题(三)
[TypeScript] In-depth study of TypeScript type operations
Design and Implementation of Compiler Based on C Language
Snake Project (Simple)
ansible学习笔记02
ansible study notes 02
贪吃蛇项目(简单)
WPF project - basic usage of controls entry, you must know XAML