Deep learning (self-supervised learning: SimCLR): A Simple Framework for Contrastive Learning of Visual Representations
2022-07-28 06:09:00 【Food to doubt life】
Preface
This is a self-supervised learning paper published by Hinton and Google at ICML 2020.
Code: https://github.com/google-research/simclr
You can tell as soon as you read the paper that it must be Google's work: the experimental data are very detailed, and it explores several properties of contrastive learning for us.
This post gives a brief introduction to SimCLR and records some of its more interesting experiments.
SimCLR overview

The figure above shows the SimCLR model structure. The procedure is:
- Apply two different data augmentations to an input image $x$ to obtain two images $\tilde{x}_i$ and $\tilde{x}_j$.
- Feed the two images into a CNN $f(\cdot)$ to extract features, obtaining two feature vectors $h_i$ and $h_j$.
- Pass the two feature vectors through an MLP $g(\cdot)$ to obtain $z_i$ and $z_j$.
Suppose the batch size is $N$. After data augmentation there are $2N$ images. Contrastive learning in SimCLR requires both positive and negative pairs.
Applying two different data augmentations to an image $x$ gives $\tilde{x}_i$ and $\tilde{x}_j$, which become $z_i$ and $z_j$ after the CNN and the MLP. $z_i$ and $z_j$ form a positive pair, while $z_i$ and the feature vector of every other image in the batch (including the augmented images) form a negative pair. Each image therefore has 1 positive pair and $2N-2$ negative pairs. The loss function for one image is

$$\ell_{i,j} = -\log \frac{\exp\left(\mathrm{sim}(z_i, z_j)/T\right)}{\sum_{k=1,\, k \neq i}^{2N} \exp\left(\mathrm{sim}(z_i, z_k)/T\right)}$$

Here $\mathrm{sim}(z_i, z_j)$ is the cosine similarity of the two vectors and $T$ is a hyperparameter (the temperature). The loss functions of the $2N$ images are averaged to obtain the final loss. In effect, this is a $(2N-1)$-way classification problem.
Algorithm pseudocode 
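As a rough illustration of the loss described above, here is a minimal pure-Python sketch. It is not the authors' implementation. The function names, the default temperature of 0.5, and the batch layout (`z[k]` and `z[k + N]` are the two views of image `k`) are assumptions made only for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(z, temperature=0.5):
    """Average contrastive loss over 2N projected vectors.

    Assumed layout (hypothetical, for this sketch only):
    z[k] and z[k + N] are the two augmented views of image k.
    """
    two_n = len(z)
    n = two_n // 2
    total = 0.0
    for i in range(two_n):
        j = (i + n) % two_n  # index of the positive partner of i
        # Similarities to every other view: 1 positive and 2N - 2 negatives.
        logits = [
            cosine_similarity(z[i], z[k]) / temperature
            for k in range(two_n)
            if k != i
        ]
        positive = cosine_similarity(z[i], z[j]) / temperature
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - positive
    return total / two_n


# Two images (N = 2): views 0 and 2 belong together, as do views 1 and 3.
aligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
swapped = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
print(nt_xent_loss(aligned), nt_xent_loss(swapped))
```

The loss is lower when the two views of the same image are close to each other, and it reduces to a softmax cross-entropy over the $2N-1$ other views in the batch.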
Experiments
The experimental section contains a lot of valuable material. The paper explores how several tricks affect SimCLR and draws some conclusions.
Unless otherwise noted, all results in this section are obtained by pre-training a ResNet-50 with SimCLR on ImageNet-1000, then freezing the feature extractor, training a linear classifier on top of it, and reporting the resulting model's accuracy on the ImageNet-1000 test set.
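The linear evaluation protocol described above can be sketched on a toy problem. This is only an illustration under stated assumptions: `frozen_encoder` is a hypothetical stand-in for the pre-trained ResNet-50, the data are made up, and the classifier is a binary logistic regression rather than a 1000-way one.

```python
import math


def frozen_encoder(x):
    """Hypothetical stand-in for the pre-trained feature extractor.

    It has no trainable parameters, which mirrors the frozen backbone.
    """
    return [x[0] + x[1], x[0] - x[1]]


def train_linear_classifier(features, labels, epochs=50, lr=0.5):
    """Fit logistic regression on frozen features; only w and b are updated."""
    w = [0.0] * len(features[0])
    b = 0.0
    for _ in range(epochs):
        for h, y in zip(features, labels):
            logit = sum(wi * hi for wi, hi in zip(w, h)) + b
            p = 1.0 / (1.0 + math.exp(-logit))
            grad = p - y  # gradient of the cross-entropy w.r.t. the logit
            w = [wi - lr * grad * hi for wi, hi in zip(w, h)]
            b -= lr * grad
    return w, b


def accuracy(w, b, features, labels):
    """Fraction of samples whose predicted label matches the true label."""
    correct = 0
    for h, y in zip(features, labels):
        logit = sum(wi * hi for wi, hi in zip(w, h)) + b
        correct += int((1 if logit > 0 else 0) == y)
    return correct / len(labels)


# Toy data (made up): label is 1 when x[0] + x[1] is positive.
inputs = [(-2.0, -1.0), (-1.0, -1.5), (-0.5, -2.0),
          (1.0, 2.0), (2.0, 0.5), (1.5, 1.0)]
labels = [0, 0, 0, 1, 1, 1]

features = [frozen_encoder(x) for x in inputs]  # computed once, never updated
w, b = train_linear_classifier(features, labels)
print(accuracy(w, b, features, labels))
```

The point of the protocol is that the encoder is never updated, so the accuracy reflects only how linearly separable the pre-trained features are.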
Effect of data augmentation on performance

See the English labels in the figure above for its meaning. Three conclusions can be drawn:
- Using a single data augmentation on its own makes contrastive learning perform very poorly.
- The combination of random cropping and random color distortion works best.
- Data augmentation has a very pronounced effect on contrastive learning. This is not a desirable property, because it often forces exhaustive trial and error.
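To make the crop plus color distortion combination concrete, here is a simplified pure-Python sketch. It is an assumption-laden toy: images are nested lists of `(r, g, b)` tuples in `[0, 1]`, the crop is not resized back to the original size, and the color distortion is reduced to random per-channel scaling. None of the function names come from the SimCLR code base.

```python
import random


def random_crop(image, crop_h, crop_w, rng):
    """Return a random crop_h x crop_w window of an image (rows of RGB tuples)."""
    h, w = len(image), len(image[0])
    top = rng.randint(0, h - crop_h)    # randint is inclusive on both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def color_distort(image, rng, strength=0.5):
    """Simplified color distortion: scale each channel by a random factor."""
    scales = [1.0 + rng.uniform(-strength, strength) for _ in range(3)]
    return [
        [tuple(min(1.0, max(0.0, c * s)) for c, s in zip(pixel, scales))
         for pixel in row]
        for row in image
    ]


def two_views(image, crop_h, crop_w, rng):
    """Apply crop then color distortion twice, independently, to one image."""
    return tuple(
        color_distort(random_crop(image, crop_h, crop_w, rng), rng)
        for _ in range(2)
    )


rng = random.Random(0)
image = [[(0.2, 0.5, 0.8)] * 8 for _ in range(8)]  # made-up 8x8 RGB image
view_i, view_j = two_views(image, 4, 4, rng)
print(len(view_i), len(view_i[0]))
```

The two returned views play the role of $\tilde{x}_i$ and $\tilde{x}_j$: they come from the same image but differ in both location and color statistics.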
Unsupervised contrastive learning benefits (more) from bigger models

The figure above shows how widening and deepening the network affect model performance. R18(2x) denotes ResNet-18 with doubled width, and the other labels follow the same pattern.
From the figure, I draw the following conclusions:
- When increasing model capacity, consider going deeper first. ResNet-152 performs considerably better than ResNet-18 while the parameter count does not grow by much, so deepening the network is the first choice in practice.
- Once the network is deep enough, consider widening it. At that point the parameter count grows sharply and training may become much slower, so widening the network is the second choice in practice.
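A back-of-the-envelope calculation shows why widening costs more parameters than deepening. The sketch below assumes a plain stack of 3x3 convolutions in which every layer has the same number of input and output channels. Real ResNets (bottleneck blocks, downsampling stages, a final fully connected layer) are more complicated, so the numbers are illustrative only.

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of one convolution layer, ignoring biases."""
    return kernel * kernel * c_in * c_out


def plain_stack_params(depth, width, kernel=3):
    """Weights in `depth` conv layers that all have `width` in/out channels."""
    return depth * conv_params(kernel, width, width)


base = plain_stack_params(depth=16, width=64)
deeper = plain_stack_params(depth=32, width=64)    # double the depth
wider = plain_stack_params(depth=16, width=128)    # double the width

print(deeper / base, wider / base)
```

In this simplified setting, doubling the depth doubles the weight count, while doubling the width quadruples it, because both the input and output channel counts of every layer grow.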
A nonlinear projection head improves the representation quality of the layer before it

The figure above explores how the dimension of $z$ affects the model's linear classification performance (see the SimCLR overview section above for the meaning of $z$). The dimension of $z$ has little effect on performance, and a nonlinear MLP performs better than a linear one, which has also been verified in MoCo v2.
SimCLR offers two features that can be used for linear classification: the feature extractor output $h$ and the MLP output $g(h)$ (see the SimCLR overview section above). For linear classification, $h$ works better than $g(h)$ (by more than 10%), probably because the MLP filters out some useful information.
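The difference between a linear and a nonlinear projection head can be shown with a tiny hand-written example. This is a sketch with made-up weights. The nonlinear head follows the one-hidden-layer form $g(h) = W_2\,\mathrm{ReLU}(W_1 h)$, and the helper names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in vector]


def linear_head(h, w1, w2):
    """Two stacked linear maps, equivalent to a single linear projection."""
    return matvec(w2, matvec(w1, h))


def nonlinear_head(h, w1, w2):
    """One-hidden-layer MLP: z = W2 * ReLU(W1 * h)."""
    return matvec(w2, relu(matvec(w1, h)))


# Made-up weights and feature vector, chosen so the two heads disagree.
W1 = [[1.0, -1.0], [-1.0, 1.0]]
W2 = [[1.0, 1.0]]
h = [2.0, 1.0]

z_linear = linear_head(h, W1, W2)
z_nonlinear = nonlinear_head(h, W1, W2)
print(z_linear, z_nonlinear)
```

During pre-training the contrastive loss is computed on `z`, while the linear evaluation uses `h`, the input to the head.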
Contrastive learning benefits (more) from larger batch sizes and longer training

Two conclusions can be drawn from the figure above for contrastive learning algorithms that use negative examples:
- The larger the batch size, the better the result, and the improvement is significant. For contrastive learning algorithms that use only positive examples (for example BYOL and SimSiam), batch size does not have such a significant effect on performance.
- The more epochs the model is trained for, the better the result. This also holds for contrastive learning algorithms that use only positive examples.
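One way to read the batch size result is through the number of negatives each view sees, which follows directly from the $2N-2$ count given earlier. The helper below is a hypothetical illustration, and the batch sizes in the loop are example values only.

```python
def pairs_per_anchor(batch_size):
    """Return (positives, negatives) for one augmented view in a batch."""
    total_views = 2 * batch_size
    positives = 1
    negatives = total_views - 2  # exclude the anchor itself and its positive
    return positives, negatives


for n in (256, 1024, 8192):
    print(n, pairs_per_anchor(n))
```

Every increase in batch size adds negatives to the denominator of each loss term, which is one plausible reason why methods that rely on negatives are sensitive to it.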