Dive into Deep Learning, Chapter 3 (1): Linear Regression Implemented from Scratch, Study Notes and Exercise Answers
2022-07-02 17:14:00 【coder_ sure】
3.1 Linear regression
author github link : github link
Learning notes
Exercises
- If we initialize the weights to zero, what happens? Does the algorithm still work?
- Suppose you are Georg Simon Ohm, trying to build a model of the relationship between voltage and current. Can you use automatic differentiation to learn the parameters of the model?
- Can you use Planck's law to determine the temperature of an object from its spectral energy density?
- What problems might you encounter when computing second derivatives? How would you solve them?
- Why is the reshape function needed in the squared_loss function?
- Try different learning rates and observe how quickly the value of the loss function decreases.
- If the number of samples is not divisible by the batch size, what happens to the behavior of the data_iter function?
Solutions
1. If we initialize the weights to zero, what happens? Does the algorithm still work?
Experiments show that the algorithm still works, and the loss appears to fall at least as fast as with random initialization.
All-zero initialization is a valid choice here: the squared loss of a linear model is convex in w and b, so zero initialization and normal-distribution initialization converge to the same minimum. Zero initialization only becomes a problem in networks with hidden layers, where it keeps the hidden units symmetric.
# w = torch.normal(0, 0.01, size=(2, 1), requires_grad=True)
w = torch.zeros((2, 1), requires_grad=True)
epoch 1, loss 0.036967
epoch 2, loss 0.000132
epoch 3, loss 0.000050
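The experiment can be reproduced with a short self-contained script. This is a minimal sketch, assuming the synthetic setup from the chapter (true weights [2, -3.4], bias 4.2, 1000 samples, learning rate 0.03, batch size 10, 3 epochs); the fixed seed and the inlined training loop are illustrative additions, not the author's exact code, so the printed losses will differ slightly from the ones above.

```python
import torch

torch.manual_seed(0)  # illustrative: makes the run repeatable

# Synthetic data: y = Xw + b + noise (values assumed from the chapter)
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2
X = torch.normal(0, 1, (1000, 2))
y = (X @ true_w + true_b + torch.normal(0, 0.01, (1000,))).reshape(-1, 1)

# All-zero initialization instead of torch.normal(0, 0.01, ...)
w = torch.zeros((2, 1), requires_grad=True)
b = torch.zeros(1, requires_grad=True)

lr, num_epochs, batch_size = 0.03, 3, 10
for epoch in range(num_epochs):
    perm = torch.randperm(len(X))  # shuffle the sample order each epoch
    for i in range(0, len(X), batch_size):
        idx = perm[i:i + batch_size]
        loss = ((X[idx] @ w + b - y[idx]) ** 2 / 2).sum()
        loss.backward()
        with torch.no_grad():
            # Minibatch SGD step, averaged over the batch
            w -= lr * w.grad / len(idx)
            b -= lr * b.grad / len(idx)
            w.grad.zero_()
            b.grad.zero_()
    with torch.no_grad():
        final_loss = ((X @ w + b - y) ** 2 / 2).mean().item()
    print(f"epoch {epoch + 1}, loss {final_loss:f}")
```

Because the gradient with respect to w depends on the inputs X and not on w itself, each weight receives a different, non-zero update from the very first step, which is why starting from zero does not stall training.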
4. What problems might you encounter when computing second derivatives? How would you solve them?
By default, autograd returns the first derivative as a plain value and frees the computation graph, so the first derivative is not available as a function that can be differentiated again. Solution: compute the first derivative while keeping the computation graph (create_graph=True), then differentiate the result.
Example: for $y = x^3 + \cos x$ at $x = \frac{\pi}{2}, \pi$, compute the first and second derivatives.
Reference code:
import math
import torch

x = torch.tensor([math.pi / 2, math.pi], requires_grad=True)
y = x ** 3 + torch.cos(x)
# Analytic derivatives, used to check the autograd results
true_dy = 3 * x ** 2 - torch.sin(x)
true_d2y = 6 * x - torch.cos(x)

# First derivative. create_graph=True records the graph of the gradient
# itself, so that it can be differentiated a second time.
dy = torch.autograd.grad(y, x,
                         grad_outputs=torch.ones_like(x),
                         create_graph=True,
                         retain_graph=True)  # keep the graph for the second derivative
# .detach().numpy() converts a tensor to a NumPy array so only the values are printed
print("First derivative, true value: {}\nFirst derivative, computed value: {}".format(
    true_dy.detach().numpy(), dy[0].detach().numpy()))

# Second derivative. dy is a tuple whose first element is the first derivative.
d2y = torch.autograd.grad(dy[0], x,
                          grad_outputs=torch.ones_like(x),
                          create_graph=False)  # no further graph needed; the old one is freed
print("\nSecond derivative, true value: {}\nSecond derivative, computed value: {}".format(
    true_d2y.detach().numpy(), d2y[0].detach().numpy()))
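To see the problem itself rather than only the fix, the following sketch first computes the gradient without create_graph and then tries to differentiate it again. The variable names (dy_plain, second_call_failed) are hypothetical and chosen for illustration only.

```python
import math
import torch

x = torch.tensor([math.pi / 2, math.pi], requires_grad=True)

# Attempt 1: default settings. The returned gradient has no autograd history.
y = x ** 3 + torch.cos(x)
(dy_plain,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(x))
print("gradient tracks history:", dy_plain.requires_grad)

try:
    torch.autograd.grad(dy_plain, x, grad_outputs=torch.ones_like(x))
    second_call_failed = False
except RuntimeError as err:
    second_call_failed = True
    print("second derivative failed:", err)

# Attempt 2: rebuild y (its graph was freed above) and keep the gradient's graph.
y = x ** 3 + torch.cos(x)
(dy,) = torch.autograd.grad(y, x, grad_outputs=torch.ones_like(x),
                            create_graph=True)
(d2y,) = torch.autograd.grad(dy, x, grad_outputs=torch.ones_like(x))
print("second derivative:", d2y)
```

Passing a vector of ones as grad_outputs is enough here because y is computed element-wise, so each component of the gradient depends only on the matching component of x.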
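5. Why is the reshape function needed in the squared_loss function?
$\hat{y}$ is a column vector of shape (n, 1), whereas $y$ may be a row vector or a 1-D tensor of shape (n,). Without reshape, the subtraction would broadcast to an (n, n) matrix instead of computing the element-wise difference.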
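The effect of the shape mismatch can be checked on a tiny example. This is an illustrative sketch with made-up values; only the expression (y_hat - y.reshape(y_hat.shape)) ** 2 / 2 mirrors the squared_loss function from the chapter.

```python
import torch

y_hat = torch.tensor([[1.0], [2.0], [3.0]])  # predictions, shape (3, 1)
y = torch.tensor([1.5, 2.5, 3.5])            # labels as a 1-D tensor, shape (3,)

# Without reshape, broadcasting pairs every prediction with every label
wrong = (y_hat - y) ** 2 / 2
# With reshape, each prediction is compared with its own label only
right = (y_hat - y.reshape(y_hat.shape)) ** 2 / 2

print(wrong.shape, right.shape)
```

No error is raised in the first case, which is what makes the bug easy to miss: the loss is silently computed over all n × n pairs.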
6. Try different learning rates and observe how quickly the value of the loss function decreases.
Try it yourself:
- If the learning rate is too low, the loss decreases slowly.
- If the learning rate is too high, the loss fails to converge.
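Both behaviors show up in a small experiment. The sketch below is an assumption-laden simplification: it uses full-batch gradient descent on noise-free synthetic data (rather than the chapter's minibatch SGD) so that the result is deterministic, and the three learning rates and the train helper are illustrative choices.

```python
import torch

torch.manual_seed(0)  # illustrative: fixes the synthetic data
X = torch.normal(0, 1, (1000, 2))
y = (X @ torch.tensor([2.0, -3.4]) + 4.2).reshape(-1, 1)

def train(lr, num_steps=20):
    """Full-batch gradient descent from zero; returns the loss measured before each update."""
    w = torch.zeros((2, 1), requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    history = []
    for _ in range(num_steps):
        loss = ((X @ w + b - y) ** 2 / 2).mean()
        loss.backward()
        with torch.no_grad():
            w -= lr * w.grad
            b -= lr * b.grad
            w.grad.zero_()
            b.grad.zero_()
        history.append(loss.item())
    return history

results = {lr: train(lr) for lr in (0.001, 0.1, 3.0)}
for lr, history in results.items():
    print(f"lr={lr}: first loss {history[0]:.3f}, last loss {history[-1]:.3e}")
```

With the smallest rate the loss barely moves in 20 steps, the middle rate reduces it substantially, and the largest rate overshoots the minimum on every step so the loss grows instead of shrinking.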
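7. If the number of samples is not divisible by the batch size, what happens to the behavior of the data_iter function?
Assuming the data_iter from the chapter, which slices indices[i: min(i + batch_size, num_examples)], no error is raised: the last batch simply contains the remaining samples and is smaller than batch_size. Note that the chapter's sgd still divides the gradient by the nominal batch_size, so the update from this last batch is smaller than a true average would give.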
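This can be checked directly by counting batch sizes. The data_iter below is a sketch written to match the chapter's implementation as closely as I can recall it, so treat its exact form as an assumption; the 10-sample dataset and batch size of 3 are made up for the test.

```python
import random
import torch

def data_iter(batch_size, features, labels):
    """Yield shuffled minibatches; the last one may be smaller than batch_size."""
    num_examples = len(features)
    indices = list(range(num_examples))
    random.shuffle(indices)  # read the samples in random order
    for i in range(0, num_examples, batch_size):
        # min(...) keeps the slice inside the dataset for the final batch
        batch_indices = torch.tensor(indices[i:min(i + batch_size, num_examples)])
        yield features[batch_indices], labels[batch_indices]

features = torch.arange(20.0).reshape(10, 2)  # 10 samples, 2 features each
labels = torch.arange(10.0).reshape(-1, 1)

batch_sizes = [len(X_batch) for X_batch, y_batch in data_iter(3, features, labels)]
print(batch_sizes)
```

Ten samples with a batch size of 3 give three full batches and one final batch holding the single remaining sample, and every sample is still visited exactly once per pass.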