Hands-On Deep Learning, Chapter 3 (1): Linear Regression Implemented from Scratch, Study Notes and Exercise Answers
2022-07-02 17:14:00 【coder_ sure】
3.1 Linear regression
Author's GitHub link: github link
Study notes
Exercises
- What happens if we initialize the weights to zero? Does the algorithm still work?
- Suppose you are Georg Simon Ohm, trying to build a model of the relationship between voltage and current. Can you use automatic differentiation to learn the parameters of the model?
- Can you use Planck's law to determine the temperature of an object from its spectral energy density?
- What problems might you encounter if you want to compute second derivatives? How would you solve them?
- Why is the reshape function needed in the squared_loss function?
- Try different learning rates and observe how quickly the value of the loss function decreases.
- If the number of samples is not divisible by the batch size, what happens to the behavior of the data_iter function?
Solutions
1. What happens if we initialize the weights to zero? Does the algorithm still work?
Experiments show that the algorithm still works, and the loss even looks slightly better.
All-zero initialization is a common choice for this model. Linear regression with squared loss is convex, so zero initialization and normal initialization head toward the same minimum, only from different starting points. (Zero initialization becomes a problem in multi-layer networks, where it leaves the hidden units symmetric.)
# Original initialization from a normal distribution, commented out
# w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
w = torch.zeros((2, 1), requires_grad=True)
epoch 1, loss 0.036967
epoch 2, loss 0.000132
epoch 3, loss 0.000050
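For exercise 2 (Ohm's law), a minimal sketch of how automatic differentiation could fit the single parameter of the model V = I * R. The resistance `true_R`, the current range, the noise level, and the learning rate are all made-up values for illustration, and plain full-batch gradient descent stands in for the chapter's minibatch SGD.

```python
import torch

torch.manual_seed(0)

# Hypothetical measurements: currents in [0, 2) A through a 5-ohm resistor,
# with a little Gaussian noise on the measured voltage.
true_R = 5.0
current = torch.rand(100) * 2
voltage = true_R * current + 0.01 * torch.randn(100)

# Single learnable parameter, initialized to zero
R = torch.zeros(1, requires_grad=True)
lr = 0.1

for epoch in range(200):
    # Mean squared loss of the model V = I * R
    loss = ((R * current - voltage) ** 2 / 2).mean()
    loss.backward()
    with torch.no_grad():
        R -= lr * R.grad   # gradient descent step
        R.grad.zero_()     # clear the gradient for the next iteration

print(f"estimated R = {R.item():.3f}, true R = {true_R}")
```

Since the model is linear in R, this is just linear regression without a bias term, so the same training loop as in the chapter applies.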
4. What problems might you encounter if you want to compute second derivatives? How would you solve them?
By default, the gradient returned by autograd is a plain tensor with no computation graph attached, so the first derivative cannot be differentiated again directly. Solution: compute the first derivative with create_graph=True so that its computation graph is kept, then differentiate it a second time.
Example: for $y = x^3 + \cos x$ at $x = \frac{\pi}{2}, \pi$, find the first and second derivatives.
Reference code:
import math

import torch

# Evaluate y = x^3 + cos(x) at x = pi/2 and x = pi
x = torch.tensor([math.pi / 2, math.pi], requires_grad=True)
y = x ** 3 + torch.cos(x)

# Analytical derivatives, used to check the autograd results
true_dy = 3 * x ** 2 - torch.sin(x)
true_d2y = 6 * x - torch.cos(x)

# First derivative: create_graph=True builds a graph for the gradient itself,
# so that it can be differentiated again
dy = torch.autograd.grad(y, x,
                         grad_outputs=torch.ones_like(x),
                         create_graph=True,
                         retain_graph=True)  # keep the graph for the second derivative
# .detach().numpy() prints only the tensor values
print("First derivative, true value: {}\nFirst derivative, computed value: {}".format(
    true_dy.detach().numpy(), dy[0].detach().numpy()))

# Second derivative: dy is a tuple whose first element is the first derivative
d2y = torch.autograd.grad(dy[0], x,
                          grad_outputs=torch.ones_like(x),
                          create_graph=False)  # no further graph is needed
print("\nSecond derivative, true value: {}\nSecond derivative, computed value: {}".format(
    true_d2y.detach().numpy(), d2y[0].detach().numpy()))
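5. Why is the reshape function needed in the squared_loss function?
$\hat{y}$ is a column vector of shape (n, 1), while $y$ may be a row vector (or a 1-D tensor) of shape (n,). Without the reshape, subtracting them would broadcast to an (n, n) matrix instead of an element-wise difference, so $y$ is first reshaped to the shape of $\hat{y}$.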
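A small sketch of the shape problem, assuming a three-sample batch with made-up values. `squared_loss` follows the usual form from the chapter; `naive` is a hypothetical name for the same expression without the reshape.

```python
import torch

def squared_loss(y_hat, y):
    """Squared loss; y is reshaped to match y_hat before subtracting."""
    return (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

y_hat = torch.tensor([[1.0], [2.0], [3.0]])  # predictions, shape (3, 1)
y = torch.tensor([1.0, 2.0, 3.0])            # labels, shape (3,)

# Without reshape, broadcasting pairs every prediction with every label
naive = (y_hat - y) ** 2 / 2
print(naive.shape)                   # torch.Size([3, 3])

# With reshape, the loss is computed element by element
print(squared_loss(y_hat, y).shape)  # torch.Size([3, 1])
```

The broadcast version raises no error, which is what makes the bug easy to miss: the loss simply sums over n * n unrelated pairs.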
6. Try different learning rates and observe how quickly the value of the loss function decreases.
Try it yourself:
- With a learning rate that is too low, the loss decreases slowly.
- With a learning rate that is too high, the loss fails to converge.
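Both effects can be reproduced with a small sketch. It is not the chapter's minibatch SGD: it assumes full-batch gradient descent in pure Python on a tiny noise-free dataset y = 2x - 1, and the function name `train` and the three learning rates are chosen only for illustration.

```python
def train(lr, num_epochs=20):
    """Fit y = w * x + b by full-batch gradient descent; return the final mean loss."""
    xs = [i / 10 for i in range(-10, 11)]   # 21 inputs in [-1, 1]
    ys = [2.0 * x - 1.0 for x in xs]        # noise-free targets
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(num_epochs):
        errs = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errs, xs)) / n
        grad_b = sum(errs) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return sum((w * x + b - y) ** 2 / 2 for x, y in zip(xs, ys)) / n

for lr in (0.001, 0.5, 2.5):
    print(f"lr={lr}: loss after 20 epochs = {train(lr):.6g}")
```

On this dataset the smallest rate barely moves the loss, the middle one drives it close to zero, and the largest one makes the bias overshoot further on every step, so the loss grows instead of shrinking.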
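7. If the number of samples is not divisible by the batch size, what happens to the behavior of the data_iter function?
No error is raised. In the book's data_iter, each batch is sliced with indices[i: min(i + batch_size, num_examples)], so the final batch simply contains the leftover samples and is smaller than batch_size.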
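A minimal sketch of a data_iter-style generator, assuming the slice end is clamped with min(i + batch_size, num_examples) as in the book's implementation. Plain Python lists stand in for tensors here, and the ten-sample dataset is made up.

```python
import random

def data_iter(batch_size, features, labels):
    """Yield shuffled minibatches; the last one may be smaller than batch_size."""
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # read the samples in random order
    for i in range(0, num_examples, batch_size):
        # Clamp the end of the slice so it never runs past the data
        batch = indices[i:min(i + batch_size, num_examples)]
        yield [features[j] for j in batch], [labels[j] for j in batch]

features = list(range(10))            # 10 samples
labels = [2 * x for x in features]
batch_sizes = [len(X) for X, y in data_iter(3, features, labels)]
print(batch_sizes)  # [3, 3, 3, 1]
```

With 10 samples and a batch size of 3, the last batch holds the single leftover sample, and every sample is still visited exactly once per pass.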