Transfer learning: freezing network layers
2022-08-01 11:05:00 【Wsyoneself】
Description: methods 1-3 use PyTorch, method 4 uses TensorFlow.
Fine-tuning here means freezing the earlier layers of the network and training only the last layer(s).
- Pass all parameters to the optimizer, but set `requires_grad` of the parameters in the layers to be frozen to `False`:

```python
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters are passed in
for name, param in model.named_parameters():
    if "fc1" in name:  # "fc1" is a placeholder for the name of the layer to be frozen
        param.requires_grad = False
```
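A minimal runnable sketch of this first method. The two-layer model and the layer names `fc1`/`fc2` are illustrative assumptions, not from the original post: `fc1` is frozen by flag, yet *all* parameters are handed to the optimizer, which simply skips parameters whose gradient stays `None`.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Toy model; fc1/fc2 are assumed names for the frozen and trainable layers.
model = nn.Sequential()
model.add_module("fc1", nn.Linear(4, 3))
model.add_module("fc2", nn.Linear(3, 2))

# Freeze fc1 via the requires_grad flag.
for name, param in model.named_parameters():
    if name.startswith("fc1"):
        param.requires_grad = False

optimizer = optim.SGD(model.parameters(), lr=1e-2)  # all parameters passed in

# One training step: only fc2's weights should change.
fc1_before = model.fc1.weight.clone()
loss = model(torch.randn(8, 4)).sum()
optimizer.zero_grad()
loss.backward()   # no gradients are computed for fc1
optimizer.step()  # SGD skips parameters whose .grad is None
```

After the step, `model.fc1.weight` is unchanged, since autograd never produced a gradient for it.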
- Pass only the parameters of the unfrozen network layers to the optimizer:

```python
optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # the optimizer only receives fc2's parameters
```
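A runnable sketch of this second method, again assuming an illustrative two-layer model with names `fc1`/`fc2`. Note the drawback this variant has on its own: gradients are still computed for `fc1` during `backward()`, even though the optimizer never updates it.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Toy model; fc1/fc2 are assumed names.
model = nn.Sequential()
model.add_module("fc1", nn.Linear(4, 3))
model.add_module("fc2", nn.Linear(3, 2))

optimizer = optim.SGD(model.fc2.parameters(), lr=1e-2)  # only fc2's parameters

fc1_before = model.fc1.weight.clone()
loss = model(torch.randn(8, 4)).sum()
optimizer.zero_grad()
loss.backward()   # gradients are still computed for fc1 (wasted work) ...
optimizer.step()  # ... but the optimizer never touches fc1
```

The wasted gradient computation for `fc1` is exactly what the "best practice" below avoids by also flipping `requires_grad` off.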
- Best practice: combine 1 and 2 — pass only the parameters with `requires_grad=True` to the optimizer. This occupies less memory and runs more efficiently:
  - Saves GPU memory: parameters that will never be updated are not passed to the `optimizer`, so it keeps no state for them.
  - Increases speed: setting `requires_grad` of the non-updated parameters to `False` saves the time spent computing their gradients.
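The combined best practice can be sketched as follows (the model and the layer names `fc1`/`fc2` are illustrative assumptions): flip `requires_grad` off for the frozen layer *and* hand the optimizer only the trainable parameters.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Toy model; fc1/fc2 are assumed names.
model = nn.Sequential()
model.add_module("fc1", nn.Linear(4, 3))
model.add_module("fc2", nn.Linear(3, 2))

# Step 1: no gradients will be computed for fc1.
for name, param in model.named_parameters():
    if name.startswith("fc1"):
        param.requires_grad = False

# Step 2: the optimizer only sees (and keeps state for) trainable parameters.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=1e-2)

loss = model(torch.randn(8, 4)).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Equivalently, `filter(lambda p: p.requires_grad, model.parameters())` is a common idiom for building the trainable list.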
- In TensorFlow, the code is as follows:

```python
# define the optimizer
optimizer = tf.train.AdamOptimizer(1e-3)
# select the variables to be optimized
output_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='outpt')
train_step = optimizer.minimize(loss_score, var_list=output_vars)
```
Pass the layers whose gradients need updating through the `tf.get_collection` call, and leave out those that should not be updated. The main purpose of `tf.get_collection` is to retrieve variables from a collection:

- It returns a list of all elements stored under the given key, ordered as the variables were added to the collection. `scope` is an optional parameter specifying a name scope: if given, it returns only the variables under that key whose names fall within the scope (in the sample code, `scope='outpt'` returns the parameters of the `outpt` layer); if omitted, it returns all variables under the key.