Hands-on Deep Learning, Chapter 3 (1): Linear Regression Implemented from Scratch - Learning Notes and Exercise Answers
2022-07-02 17:14:00 【coder_sure】
3.1 Linear regression
Author's GitHub link: github link
Learning notes
Exercises
1. If we initialize the weights to zero, what will happen? Does the algorithm still work?
2. Suppose you are Georg Simon Ohm, trying to build a model for the relationship between voltage and current. Can you use automatic differentiation to learn the parameters of your model?
3. Can you use Planck's law to determine the temperature of an object from its spectral energy density?
4. What problems might you run into if you want to compute the second derivative? How would you fix them?
5. Why does the squared_loss function need to use reshape?
6. Try different learning rates and observe how quickly the value of the loss function drops.
7. What happens to the behavior of the data_iter function if the number of examples is not divisible by the batch size?
Solutions
1. If we initialize the weights to zero, what will happen? Does the algorithm still work?
Experiments show that the algorithm is still effective, and the result even looks slightly better.
All-zero initialization is also a common choice: compared with normal-distribution initialization it starts gradient descent from a different point and may follow a different optimization path, but the algorithm still works.
# w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
w = torch.zeros((2, 1), requires_grad=True)
epoch 1, loss 0.036967
epoch 2, loss 0.000132
epoch 3, loss 0.000050
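For completeness, here is a self-contained sketch of the experiment (my own minimal reimplementation rather than the book's d2l code, assuming the chapter's synthetic data with true_w = [2, -3.4] and true_b = 4.2):
import torch

true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
features = torch.normal(0, 1, (1000, 2))
labels = (features @ true_w + true_b + torch.normal(0, 0.01, (1000,))).reshape(-1, 1)

# Zero initialization instead of w = torch.normal(0, 0.01, size=(2, 1), ...)
w = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)
lr, num_epochs, batch_size = 0.03, 3, 10

for epoch in range(num_epochs):
    perm = torch.randperm(len(features))
    for i in range(0, len(features), batch_size):
        idx = perm[i: i + batch_size]
        X, y = features[idx], labels[idx]
        loss = ((X @ w + b - y) ** 2 / 2).mean()  # mean() plays the role of dividing by batch_size
        loss.backward()
        with torch.no_grad():
            w -= lr * w.grad
            b -= lr * b.grad
            w.grad.zero_()
            b.grad.zero_()
    with torch.no_grad():
        epoch_loss = ((features @ w + b - labels) ** 2 / 2).mean()
    print(f'epoch {epoch + 1}, loss {float(epoch_loss):f}')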
4. What problems might you run into if you want to compute the second derivative? How would you fix them?
By default autograd only returns the numerical value of the first derivative and frees the computation graph afterwards, so the second derivative cannot be obtained directly. Solution: compute the first derivative with the computation graph retained (create_graph=True), then differentiate the result again.
Example: $y = x^3 + \cos x$; compute the first and second derivatives at $x = \frac{\pi}{2}$ and $x = \pi$.
Reference:
import torch
import math
import numpy as np
x = torch.tensor([math.pi / 2, math.pi], requires_grad=True)
y = x ** 3 + torch.cos(x)
true_dy = 3 * x ** 2 - torch.sin(x)
true_d2y = 6 * x - torch.cos(x)
# Compute the first derivative and keep the computation graph so that
# the second derivative can be computed afterwards.
dy = torch.autograd.grad(y, x,
                         grad_outputs=torch.ones(x.shape),
                         create_graph=True,
                         retain_graph=True)  # keep the graph for the second differentiation
# .detach().numpy() extracts the plain values for printing.
print("First derivative (true):     {} \nFirst derivative (autograd): {}".format(
    true_dy.detach().numpy(), dy[0].detach().numpy()))
# Compute the second derivative; dy[0] above is the first derivative.
d2y = torch.autograd.grad(dy[0], x,
                          grad_outputs=torch.ones(x.shape),
                          create_graph=False)  # no further differentiation needed, the graph can be freed
print("\nSecond derivative (true):     {} \nSecond derivative (autograd): {}".format(
    true_d2y.detach().numpy(), d2y[0].detach().numpy()))
5. Why does the squared_loss function need to use reshape?
$\hat{y}$ is a column vector of shape (batch_size, 1), while $y$ is a row vector of shape (batch_size,); subtracting them directly would broadcast into a (batch_size, batch_size) matrix instead of an elementwise difference, so reshape is used to make the two shapes match.
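A minimal sketch (toy shapes of my own choosing) illustrating the broadcasting pitfall:
import torch

y_hat = torch.ones((3, 1))           # predictions: shape (batch_size, 1)
y = torch.tensor([1.0, 2.0, 3.0])    # labels: shape (batch_size,)
print((y_hat - y).shape)                        # torch.Size([3, 3]) -- unintended broadcasting
print((y_hat - y.reshape(y_hat.shape)).shape)   # torch.Size([3, 1]) -- elementwise difference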
6. Try different learning rates and observe how quickly the value of the loss function drops.
Observations from trying it myself (a small script to reproduce this is sketched after the list):
- With a learning rate that is too small, the loss decreases slowly.
- With a learning rate that is too large, the loss fails to converge.
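A compact sketch for the sweep (my own code, assuming the same synthetic-data setup as in the chapter):
import torch

true_w, true_b = torch.tensor([2.0, -3.4]), 4.2
features = torch.normal(0, 1, (1000, 2))
labels = (features @ true_w + true_b + torch.normal(0, 0.01, (1000,))).reshape(-1, 1)

for lr in (0.001, 0.03, 1.0):
    w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    for epoch in range(3):
        for i in range(0, len(features), 10):            # mini-batches of size 10
            X, y = features[i:i + 10], labels[i:i + 10]
            loss = ((X @ w + b - y) ** 2 / 2).mean()
            loss.backward()
            with torch.no_grad():
                w -= lr * w.grad
                b -= lr * b.grad
                w.grad.zero_()
                b.grad.zero_()
    with torch.no_grad():
        final = ((features @ w + b - labels) ** 2 / 2).mean()
    print(f'lr={lr}: final loss {float(final):f}')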
7. What happens to the behavior of the data_iter function if the number of examples is not divisible by the batch size?
In the author's test, an error is reported once execution reaches the leftover samples that cannot fill a full batch.
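For reference, a sketch of a d2l-style data_iter (my own sketch, assuming the variant that clamps the index range with min()); with this clamp the final batch simply contains the leftover samples:
import random
import torch

def data_iter(batch_size, features, labels):
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)                              # read examples in random order
    for i in range(0, num_examples, batch_size):
        batch_indices = torch.tensor(indices[i: min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]

features, labels = torch.ones(7, 2), torch.ones(7, 1)
for X, y in data_iter(3, features, labels):
    print(X.shape)   # torch.Size([3, 2]), torch.Size([3, 2]), then torch.Size([1, 2])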