Loading selected pretrained parameters into a custom (modified) PyTorch model and freezing them
2022-06-26 05:34:00 【Little beaver flower made by Rua】
Parts of this article draw on https://zhuanlan.zhihu.com/p/34147880
One. This method is the more general one: load the pretrained parameters according to the parameters of your own model, assigning by matching name. Layers you have added to the original model have no counterpart in the checkpoint, so they are simply not loaded.
import torch

dict_trained = torch.load(self.args.load_path, map_location=torch.device('cpu'))
dict_new = model.state_dict()
# 1. filter out unnecessary keys
dict_trained = {k: v for k, v in dict_trained.items() if k in dict_new}
# 2. overwrite entries in the existing state dict
dict_new.update(dict_trained)
# 3. load the merged state dict into the model
model.load_state_dict(dict_new)
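A minimal sketch of the name-based filtering above, using two hypothetical toy models (Pretrained and Extended) in place of a real checkpoint. The extra shape check is an addition of this sketch, not part of the original snippet; it guards against layers that keep their name but change size.

```python
import torch
import torch.nn as nn

class Pretrained(nn.Module):  # hypothetical stand-in for the pretrained network
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)
        self.head = nn.Linear(8, 2)

class Extended(nn.Module):  # hypothetical modified model
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(4, 8)  # same name and shape as in Pretrained
        self.extra = nn.Linear(8, 8)    # new layer, absent from the checkpoint
        self.head = nn.Linear(8, 3)     # same name, different output size

dict_trained = Pretrained().state_dict()  # stands in for torch.load(path)
model = Extended()
dict_new = model.state_dict()

# keep only entries whose name AND shape match the new model
matched = {k: v for k, v in dict_trained.items()
           if k in dict_new and v.shape == dict_new[k].shape}
dict_new.update(matched)
model.load_state_dict(dict_new)

loaded_keys = sorted(matched)
print(loaded_keys)
```

Only the encoder entries are copied here; the extra and head layers keep their fresh initialization.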
Two. This method is considerably more involved, because you make exactly the changes you want. In my case the model adds four layers, 'dense', 'unary_affine', 'binary_affine' and 'classifier'; j += 8 skips their weight and bias entries (4 layers x 2 tensors = 8 keys). The name check follows the pattern commonly used for weight-decay exclusion lists. In addition, the 'crf' part of the original model's parameters is not loaded.
dict_trained = torch.load(self.args.load_path, map_location=torch.device('cpu'))
dict_new = self.model.state_dict().copy()
trained_list = list(dict_trained.keys())
new_list = list(dict_new.keys())
j = 0  # index into new_list; advances separately from i
no_load = {'dense', 'unary_affine', 'binary_affine', 'classifier'}
for i in range(len(trained_list)):
    flag = False
    if 'crf' in trained_list[i]:
        continue  # the original model's 'crf' parameters are not loaded
    for nd in no_load:
        # keys containing 'bert' are never treated as new layers
        if nd in new_list[j] and 'bert' not in new_list[j]:
            flag = True
    if flag:
        j += 8  # skip the weight and bias of the four no_load layers
    else:
        dict_new[new_list[j]] = dict_trained[trained_list[i]]
        if new_list[j] != trained_list[i]:
            print("i:{}, new state_dict: {} trained state_dict: {} mismatch".format(i, new_list[j], trained_list[i]))
    j += 1  # the two key lists are not aligned, so j is advanced by hand
self.model.load_state_dict(dict_new)
Later I learned of a simpler method:
When building your own model, if you only want the pretrained parameters for the parts whose structure is unchanged, set the strict argument of load_state_dict to False when loading. The default is True, which requires the keys in the checkpoint to match the keys of your own network exactly; otherwise loading fails. Note that strict=False only relaxes the key check: a parameter with a matching name but a different shape still raises an error. The implementation is as follows:
model.load_state_dict(torch.load(self.args.load_path), strict=False)
PS: If you hit an error, print the keys of the modified model and the keys of the checkpoint being loaded, then compare them to find the mismatch.
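As a rough illustration of the key comparison suggested above, the sketch below diffs the two key sets by hand and then reads the same information from the return value of load_state_dict with strict=False. The two toy networks and their layer names are hypothetical stand-ins for a real checkpoint and model.

```python
import torch
import torch.nn as nn

# hypothetical toy networks standing in for the checkpoint and the new model
pretrained = nn.Sequential()
pretrained.add_module("encoder", nn.Linear(4, 8))
pretrained.add_module("crf", nn.Linear(8, 2))

model = nn.Sequential()
model.add_module("encoder", nn.Linear(4, 8))
model.add_module("classifier", nn.Linear(8, 3))

state = pretrained.state_dict()  # stands in for torch.load(path)

# compare the two key sets by hand
only_in_model = sorted(set(model.state_dict()) - set(state))
only_in_checkpoint = sorted(set(state) - set(model.state_dict()))
print("only in model:", only_in_model)
print("only in checkpoint:", only_in_checkpoint)

# load_state_dict reports the same information in its return value
result = model.load_state_dict(state, strict=False)
print("missing:", result.missing_keys)
print("unexpected:", result.unexpected_keys)
```

Keys listed as missing keep their initial values; keys listed as unexpected are ignored.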
Three. Freeze the parameters of these layers
In short:
for p in model.parameters():
    p.requires_grad = False
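The loop above freezes every parameter. To freeze only the layers that received pretrained weights, one option is to select parameters by name, as in this sketch. The layer names and the frozen_prefixes tuple are hypothetical choices for illustration.

```python
import torch.nn as nn

model = nn.Sequential()
model.add_module("encoder", nn.Linear(4, 8))     # assume loaded from a checkpoint
model.add_module("classifier", nn.Linear(8, 3))  # assume newly added

frozen_prefixes = ("encoder.",)  # hypothetical choice of layers to freeze

for name, param in model.named_parameters():
    if name.startswith(frozen_prefixes):
        param.requires_grad = False

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)
```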
There are many ways to do this; the one shown here corresponds to the loading method above.
I suggest you take a look at
https://discuss.pytorch.org/t/how-the-pytorch-freeze-network-in-some-layers-only-the-rest-of-the-training/7088
or
https://discuss.pytorch.org/t/correct-way-to-freeze-layers/26714
Correspondingly, during training the optimizer should be given only the parameters whose requires_grad is True, so:
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, net.parameters()), lr=lr)
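A small check of the optimizer line above, under the assumption of a toy three-module network and an arbitrary learning rate: after one training step the frozen layer should be unchanged while the trainable layer moves.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# freeze the first layer (assumed here to hold the pretrained weights)
for p in net[0].parameters():
    p.requires_grad = False

lr = 1e-2  # hypothetical learning rate
optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, net.parameters()), lr=lr)

frozen_before = net[0].weight.clone()
head_before = net[2].weight.clone()

# one training step on random data
x, y = torch.randn(16, 4), torch.randn(16, 2)
loss = nn.functional.mse_loss(net(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()

frozen_unchanged = torch.equal(net[0].weight, frozen_before)
head_changed = not torch.equal(net[2].weight, head_before)
print(frozen_unchanged, head_changed)
```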