Transfer learning: freezing network layers
2022-08-01 11:05:00 【Wsyoneself】
Description: methods 1-3 are for PyTorch, method 4 is for TensorFlow.
Fine-tuning here means freezing the earlier layers of the network and training only the last layer(s).
- Method 1: pass all parameters to the optimizer, but set `requires_grad` of the parameters of the layers to be frozen to `False`:

```python
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters are passed in
for name, param in model.named_parameters():
    if name in layers_to_freeze:  # i.e. check whether name belongs to a layer to be frozen
        param.requires_grad = False
```

- Method 2: pass only the parameters of the unfrozen layers to the optimizer:

```python
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # the optimizer only receives the parameters of fc2
```

- Method 3 (best practice): combine 1 and 2, i.e. set `requires_grad = False` on the frozen parameters and pass only the parameters with `requires_grad=True` to the optimizer; this uses less memory and runs faster (see the sketch after this list).
  - Saves GPU memory: parameters that will never be updated are not handed to the optimizer.
  - Speeds up training: setting `requires_grad` of the non-updated parameters to `False` skips the gradient computation for them.
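Below is a minimal self-contained sketch of method 3. The model, the layer names `fc1`/`fc2`, and the training data are hypothetical assumptions, not from the original post; only `fc1` is frozen and only `fc2` receives updates.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Hypothetical two-layer model: fc1 will be frozen, fc2 will be trained.
model = nn.Sequential()
model.add_module("fc1", nn.Linear(10, 20))
model.add_module("fc2", nn.Linear(20, 2))

# Freeze fc1: no gradients are computed for its parameters.
for name, param in model.named_parameters():
    if name.startswith("fc1"):
        param.requires_grad = False

# Hand the optimizer only the parameters that still require gradients.
optimizer = optim.SGD([p for p in model.parameters() if p.requires_grad], lr=1e-2)

# One training step on random data: only fc2's weights change.
x, y = torch.randn(4, 10), torch.randint(0, 2, (4,))
optimizer.zero_grad()
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
optimizer.step()
```

Keeping the frozen parameters out of the optimizer also keeps them out of any optimizer state (momentum buffers, Adam moment estimates), which is where most of the memory saving comes from.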
- Method 4 (TensorFlow 1.x graph API): pass only the variables that should be trained to `optimizer.minimize` via `var_list`. The code is as follows:

```python
# Define the optimizer
optimizer = tf.train.AdamOptimizer(1e-3)
# Select the variables to be optimized: all trainable variables under the 'outpt' scope
output_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='outpt')
train_step = optimizer.minimize(loss_score, var_list=output_vars)
```

Only the layers whose gradients should be updated are retrieved with `tf.get_collection`; layers that should not be updated are simply left out of `var_list`.
- `tf.get_collection` retrieves variables from a collection: it returns a list of all elements stored under the given key, in the order in which they were added to the collection. `scope` is an optional argument naming a variable scope (name space); if it is given, only the variables in that scope that were added to the collection are returned (in the sample code, `scope='outpt'` returns only the parameters of the `outpt` layer); if it is omitted, all variables in the collection are returned.
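For context, here is a minimal end-to-end sketch of the same idea in TensorFlow 1.x graph mode. The network, the scope names `base` and `outpt`, and the loss are illustrative assumptions; only the variables created under the `outpt` scope are updated.

```python
import tensorflow as tf  # TensorFlow 1.x graph-mode API

x = tf.placeholder(tf.float32, [None, 10])
y = tf.placeholder(tf.int64, [None])

# Hypothetical two-layer network: 'base' is frozen, 'outpt' is trained.
with tf.variable_scope('base'):
    h = tf.layers.dense(x, 20, activation=tf.nn.relu)
with tf.variable_scope('outpt'):
    logits = tf.layers.dense(h, 2)

loss_score = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=logits))

# Only the trainable variables under the 'outpt' scope are given to minimize(),
# so the variables under 'base' receive no gradient updates.
optimizer = tf.train.AdamOptimizer(1e-3)
output_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='outpt')
train_step = optimizer.minimize(loss_score, var_list=output_vars)
```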