Automatic differentiation in PyTorch
2022-07-24 15:23:00 【Strong learning】
1. Basic concepts
1.1 requires_grad
To differentiate with respect to a tensor, set requires_grad=True when creating it, for example:
a = torch.tensor(2.3, requires_grad=True)
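As a minimal sketch of how the flag behaves (the names c and out are illustrative and not from the original), requires_grad can be inspected on any tensor, and it carries over to results computed from a tracked tensor:

```python
import torch

# Leaf tensor that autograd should track.
a = torch.tensor(2.3, requires_grad=True)
# Tensors default to requires_grad=False.
c = torch.tensor(1.0)

# The result depends on a, so it is tracked as well.
out = a * c

print(a.requires_grad)    # True
print(c.requires_grad)    # False
print(out.requires_grad)  # True
```

Only floating-point and complex tensors can require gradients, so the value here is written as 2.3 rather than as an integer.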
1.2 Computational graph
In a computational graph, data is drawn as ellipses and operations such as addition, subtraction, multiplication, and division are drawn as rectangles. The graph arranges the data and operations in a tree-like structure (a binary tree when every operation takes two inputs).
1.3 Leaf nodes
In a computational graph, tensors created directly by the user are called leaf nodes, such as w, x, and b in the example graph; equivalently, a leaf is a tensor that is not the result of any operation. Use a.is_leaf (an attribute, not a method) to check whether a tensor is a leaf node.
print(a.is_leaf)
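The distinction can be sketched with a small graph. The graph below, y = w * x followed by z = y + b, is an assumption reconstructed from the names used in this article, and the values are arbitrary:

```python
import torch

# Assumed example graph: y = w * x, z = y + b (values are arbitrary).
w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

y = w * x  # intermediate result produced by an operation
z = y + b  # final result produced by an operation

print(w.is_leaf, x.is_leaf, b.is_leaf)  # True True True
print(y.is_leaf, z.is_leaf)             # False False
```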
1.4 grad_fn
z.grad_fn prints something like <AddBackward0 at 0x8ea34d2cd342>. It identifies the operation that directly produced z.
When a tensor is produced by an operation, PyTorch has already defined the backward (derivative) function of that operation. For example, z is the result of an add, so z.grad_fn is the backward function of add. The trailing 0 in AddBackward0 is part of the class name and distinguishes variants of the operation's backward function; it is not a counter. A graph can contain many add operations, and each one gets its own AddBackward0 node.
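A short sketch of what grad_fn holds for results and for leaves, again assuming the graph y = w * x, z = y + b with arbitrary values:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
y = w * x
z = y + b

# Results of operations carry the backward node of that operation.
print(type(z.grad_fn).__name__)  # AddBackward0
print(type(y.grad_fn).__name__)  # MulBackward0

# Leaf tensors were not produced by an operation, so grad_fn is None.
print(w.grad_fn)                 # None
```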
1.5 next_functions
The output of z.grad_fn.next_functions is
((<MulBackward0 at 0x7fb73c7cd7d0>, 0L),
 (<AccumulateGrad at 0x7fb73c7cdad0>, 0L))
z is the result of an add whose inputs are y and b, so next_functions lists the backward nodes of those inputs: y.grad_fn for y and an AccumulateGrad node for b.
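The traversal can be reproduced with a few lines. This is a sketch under the assumption that the graph is y = w * x, z = y + b; the variable names is hypothetical:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)
y = w * x
z = y + b

# Each entry is a (node, index) pair, one per input of the add.
for node, index in z.grad_fn.next_functions:
    print(type(node).__name__, index)

# Collect just the node class names, in input order.
names = [type(node).__name__ for node, _ in z.grad_fn.next_functions]
print(names)  # ['MulBackward0', 'AccumulateGrad']
```

Following next_functions recursively from z.grad_fn walks the whole backward graph down to the AccumulateGrad nodes of the leaves.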
This raises three questions: what is AccumulateGrad, what is a.grad, and why must gradients be set to 0?
For y.grad_fn, MulBackward0 is the backward function of the multiplication.
b is a leaf tensor, so b.grad_fn itself is None; the AccumulateGrad node that stands in for it in the graph means that the derivative with respect to b is accumulated into b.grad. For a leaf tensor a, a.grad holds this accumulated derivative. For example, if the first backward pass gives a derivative of 3, then a.grad is 3; after a second backward pass, a.grad is 6. This is accumulation. That is why, in PyTorch, gradients are cleared before the backward pass of every batch, that is, every grad is set to 0.
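The accumulation from 3 to 6 can be reproduced with a minimal sketch. The function out = 3 * a is an assumption chosen so that the derivative is 3; the names first, second, and third are illustrative:

```python
import torch

a = torch.tensor(2.0, requires_grad=True)

out = 3 * a
out.backward()
first = a.grad.item()   # 3.0

out = 3 * a             # new forward pass builds a new graph
out.backward()
second = a.grad.item()  # 6.0: the new 3.0 is added to the old 3.0

a.grad.zero_()          # reset the gradient in place
out = 3 * a
out.backward()
third = a.grad.item()   # 3.0 again

print(first, second, third)  # 3.0 6.0 3.0
```

In a training loop the same reset is usually done for all parameters at once with optimizer.zero_grad().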
1.6 backward() and retain_graph=True
z.backward(retain_graph=True)
z.backward() starts from z and computes the derivative of z with respect to each leaf variable.
retain_graph=True keeps the intermediate values of the graph. Computing the derivative of z with respect to w needs values saved during the forward pass, including intermediate results such as y, which we did not create ourselves but which the computation produced. The first z.backward() reads these saved values. A second backward pass after that raises an error, because a backward pass automatically frees the saved intermediate values, so they no longer exist unless the forward pass is run again. Passing retain_graph=True to the first call keeps them available after the first pass.
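A sketch of this behavior, assuming z = w * x + b with arbitrary values (the names grad_after_two and failed are illustrative):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

z = w * x + b

z.backward(retain_graph=True)   # saved values are kept
z.backward()                    # works; saved values are freed afterwards
grad_after_two = w.grad.item()  # 6.0: dz/dw = x = 3.0, accumulated twice

try:
    z.backward()                # the graph has already been freed
    failed = False
except RuntimeError as err:
    failed = True
    print("third backward failed:", type(err).__name__)
```

The error comes from the multiplication, whose backward function needs the saved inputs; running the forward pass again builds a fresh graph and makes backward possible again.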
1.7 Hook functions
Gradients of non-leaf nodes are not kept after the backward pass. To inspect them, use autograd.grad or a hook function.
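Both options can be sketched on the assumed graph y = w * x, z = y + b. The dictionary captured and the name dz_dy are hypothetical helpers for this illustration:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
x = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

y = w * x  # non-leaf node
z = y + b

captured = {}
# The hook is called with dz/dy when that gradient is computed.
# It returns None, so the gradient itself is left unchanged.
y.register_hook(lambda grad: captured.update(y=grad.clone()))

# autograd.grad returns the gradient instead of storing it in .grad.
(dz_dy,) = torch.autograd.grad(z, y, retain_graph=True)

z.backward()

print(dz_dy.item())          # 1.0
print(captured["y"].item())  # 1.0
print(w.grad.item())         # 3.0
```

retain_graph=True is passed to autograd.grad only because z.backward() is called on the same graph afterwards.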
2. Summary
For autograd, the essentials are: pass requires_grad=True when creating a tensor to make it differentiable, call z.backward() on the final result, and read the derivative (gradient) from a.grad.