Transfer learning: freezing network layers
2022-08-01 11:05:00 【Wsyoneself】
Description: methods 1-3 apply to PyTorch, method 4 to TensorFlow.
Fine-tuning means freezing the earlier layers of a pretrained network and training only the final layer(s).
- Method 1: pass all parameters to the optimizer, but set `requires_grad` of the parameters of the layers to be frozen to `False`:

```python
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters are passed in
for name, param in model.named_parameters():
    if name in layers_to_freeze:  # `layers_to_freeze` is a placeholder for the names of the layers to freeze
        param.requires_grad = False
```

- Method 2: pass only the parameters of the unfrozen layers to the optimizer:

```python
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # the optimizer only receives fc2's parameters
```

- Method 3 (best practice): pass only the parameters with `requires_grad=True` to the optimizer; this uses less memory and runs more efficiently. The code combines methods 1 and 2 (see the sketch after the notes below).
This combination has two benefits:

- Saving GPU memory: parameters that will not be updated are not passed to the optimizer.
- Increasing speed: setting `requires_grad` of the parameters that are not updated to `False` saves the time spent computing gradients for them.
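A minimal sketch of method 3, combining the two steps above. The toy model and the choice of `fc2` as the unfrozen layer are assumptions for illustration, not taken from the original post:

```python
import torch.nn as nn
import torch.optim as optim

# toy two-layer model, used only for illustration
model = nn.Sequential()
model.add_module("fc1", nn.Linear(10, 10))
model.add_module("fc2", nn.Linear(10, 2))

# step 1 (from method 1): turn off gradient tracking for every layer except fc2
for name, param in model.named_parameters():
    if not name.startswith("fc2"):
        param.requires_grad = False

# step 2 (from method 2): hand the optimizer only the parameters that still require gradients
optimizer = optim.SGD(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-2)
```

Because the frozen parameters never reach the optimizer, no optimizer state is kept for them, and since their `requires_grad` is `False`, autograd skips their gradient computation entirely.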
- Method 4 (TensorFlow): the code is as follows:
```python
# define the optimizer
optimizer = tf.train.AdamOptimizer(1e-3)
# select the variables to be optimized (only those under the 'outpt' scope)
output_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='outpt')
train_step = optimizer.minimize(loss_score, var_list=output_vars)
```
- Only the layers whose gradients should be updated are retrieved via `tf.get_collection` and passed as `var_list`; layers that should not be updated are simply left out.
- The main purpose of `tf.get_collection` is to retrieve variables from a collection: it returns a list of all elements stored under the given key, in the order in which the variables were added to the collection. `scope` is an optional parameter specifying a name scope; if given, only the variables under that name scope that were added to the collection `key` are returned (in the sample code, `scope='outpt'` returns the parameters of the `outpt` layer); if omitted, all variables in the collection are returned.
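For context, here is a small sketch (TensorFlow 1.x style; the scope names and shapes are assumptions, not from the original post) showing how variables created under a variable scope are later retrieved with `tf.get_collection`:

```python
import tensorflow as tf  # TensorFlow 1.x API

# variables created inside a variable scope are registered in the
# TRAINABLE_VARIABLES collection under that scope's name prefix
with tf.variable_scope('backbone'):
    w1 = tf.get_variable('w1', shape=[10, 10])
with tf.variable_scope('outpt'):
    w2 = tf.get_variable('w2', shape=[10, 2])

# returns only the trainable variables whose names start with 'outpt'
output_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='outpt')
print([v.name for v in output_vars])  # e.g. ['outpt/w2:0']
```

Passing only these variables through `var_list` means the backbone weights receive no updates, which is the TensorFlow counterpart of freezing layers in PyTorch.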