Deep Learning (Self-Supervised: CPC v2): Data-Efficient Image Recognition with Contrastive Predictive Coding
2022-07-28 06:09:00 【Food to doubt life】
Preface
This paper was published at ICML 2020.
It is an improvement over CPC v1, raising top-1 accuracy on ImageNet from 48.7% to 71.5%.
This post gives a brief introduction to CPC v2; the experiments are not summarized in full here.
Figure 1 of the paper presents an interesting result, shown below:
The blue line is a ResNet pretrained with CPC v1 and then fine-tuned on ImageNet; the red line is a ResNet trained on ImageNet from scratch. The horizontal axis is the amount of training data used for fine-tuning/training. As the amount of training data shrinks, the model trained from scratch degrades especially sharply, and even when all the data is used, the fine-tuned model still beats training from scratch. In other words, compared with a model trained from scratch, a self-supervised pretrained model needs far less training data to reach similar performance: when transferred to a downstream task, it may need only a small amount of data to perform well.
Introduction to CPC v1

The figure above shows the model structure of CPC v2; for ease of exposition, I discuss it in the CPC v1 section.
- The input image is divided into several overlapping patches, where $X_{i,j}$ denotes the patch in row $i$, column $j$.
- Every patch goes through a feature extractor (the blue model), producing a feature vector $Z_{i,j}$.
- The feature vector $Z_{i,j}$ at row $i$, column $j$ is concatenated with the feature vectors above it in the same column, $Z_{u,j}$ ($u < i$), and processed by a context network $g_{\phi}$ (the red model) to obtain a context vector $C_{i,j}$.
- A linear transformation with matrix $W_k$ is applied to $C_{i,j}$, i.e. $\hat{Z}_{i+k,j} = W_k C_{i,j}$, and contrastive learning is performed between $\hat{Z}_{i+k,j}$ and $Z_{i+k,j}$. This can be understood simply as using the features of the upper half of an image to predict the features of its lower half.
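The prediction step above can be sketched in a few lines of numpy. All sizes are toy values, and the mean-pooling "context network" below is an illustrative placeholder for $g_{\phi}$, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a 7x7 grid of patch features, each a d-dimensional z_{i,j}.
rows, cols, d = 7, 7, 16
Z = rng.standard_normal((rows, cols, d))

def context(Z, i, j):
    """Context C_{i,j}: here a simple mean over z_{u,j} for u <= i,
    standing in for the autoregressive context network g_phi."""
    return Z[: i + 1, j].mean(axis=0)

# One linear prediction matrix W_k per row offset k.
K = 2
W = rng.standard_normal((K, d, d))

# Predict the feature vector k rows below from the context above it:
# \hat{Z}_{i+k,j} = W_k C_{i,j}
i, j, k = 2, 3, 1
c_ij = context(Z, i, j)
z_hat = W[k - 1] @ c_ij   # predicted feature
z_true = Z[i + k, j]      # target feature, to be matched contrastively
print(z_hat.shape, z_true.shape)  # (16,) (16,)
```

The contrastive objective then pulls `z_hat` toward `z_true` and away from negatives, as described next.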
The contrastive loss is InfoNCE, defined as follows:
The negative examples $Z_l$ come from patches of other images in the batch, or from other patches of the same image.
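As a rough sketch, InfoNCE amounts to a cross-entropy over dot-product scores: classify the positive target among the negatives. This minimal numpy version omits the paper's exact batching and scoring details:

```python
import numpy as np

def info_nce(z_hat, z_pos, z_negs):
    """InfoNCE loss for one prediction: score the positive z_pos and the
    negatives z_negs by dot product with z_hat, then take the negative
    log-probability of picking the positive."""
    logits = np.concatenate(([z_hat @ z_pos], z_negs @ z_hat))
    logits -= logits.max()                        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]                          # positive sits at index 0

# The loss is small when z_hat matches the positive, large otherwise.
z_hat = np.array([1.0, 0.0])
negs = np.array([[0.0, 1.0], [-1.0, 0.0]])
good = info_nce(z_hat, np.array([1.0, 0.0]), negs)
bad = info_nce(z_hat, np.array([0.0, 1.0]), negs)
print(good < bad)  # True
```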
Personal view: the mechanism of CPC v1 is not hard to understand. Taking humans as an analogy: if we know what a dog looks like, then on seeing the top half of a dog in an image, we can naturally picture the shape of the dog in the bottom half. To drive the InfoNCE loss down, the model must build connections between the top and bottom halves of the dog in the image, and these connections may let the model understand what a dog looks like, i.e. what features dogs have.
Introduction to CPC v2
In self-supervised learning, tricks have a huge impact on performance; this is quite different from continual learning, which I studied before.
Compared with CPC v1, CPC v2 introduces more tricks. Specifically:
- A larger model. CPC v1 uses only the first three residual stacks of ResNet-101, while CPC v2 deepens the model to ResNet-161 (ImageNet top-1 accuracy improves by 5%) and raises the resolution of the input patches (from 60x60 to 80x80; ImageNet top-1 accuracy improves by 2%).
- Replacing batch normalization with layer normalization. CPC v1's prediction should depend only on a few patches, but BN leaks information from the other patches in the batch; much as in image generation, BN therefore hurts CPC v1's performance. The authors replace BN with layer normalization, improving ImageNet top-1 accuracy by 2%.
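To see why the swap helps, here is a minimal layer normalization: every patch is normalized with its own statistics, so nothing crosses patch boundaries, whereas batch norm would pool mean and variance over all patches in the batch:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize each row (one patch's features) with its own mean and
    variance; no statistics are shared across patches."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.default_rng(0).standard_normal((4, 8))  # 4 patches, 8 features
y = layer_norm(x)

# Perturbing one patch leaves the other patches' outputs untouched --
# exactly the cross-patch leakage that batch norm would introduce.
x2 = x.copy()
x2[0, 0] += 100.0
y2 = layer_norm(x2)
print(np.allclose(y[1:], y2[1:]))  # True
```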
- A harder pretext task. Because large models overfit more easily, the authors increase the difficulty of the self-supervised task: to predict a patch, CPC v2 uses the feature vectors above, below, to the left of, and to the right of it, whereas CPC v1 uses only those above. Since CPC v2 touches more semantic context, extracting the semantic information relevant to the target patch also becomes harder. ImageNet top-1 accuracy improves by 2.5%.
- Better data augmentation. First, two of the three RGB channels are randomly kept (ImageNet top-1 accuracy improves by 3%); then geometric, color, elastic-deformation and other augmentations are applied (ImageNet top-1 accuracy improves by a further 4.5%). Clearly, data augmentation matters a great deal for self-supervised learning.
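One plausible reading of the channel trick is sketched below: randomly keep two of the three RGB channels by zeroing the third. Zeroing (rather than removing) the channel is an assumption made here to keep the tensor shape fixed; the paper's exact implementation may differ:

```python
import numpy as np

def keep_two_channels(img, rng):
    """Randomly keep two of the three RGB channels, zeroing out the third.
    Assumes img has shape (H, W, 3)."""
    dropped = rng.integers(0, 3)   # index of the channel to zero out
    out = img.copy()
    out[..., dropped] = 0.0
    return out

rng = np.random.default_rng(0)
img = rng.random((32, 32, 3))
aug = keep_two_channels(img, rng)
# Count channels that are entirely zero after augmentation.
print(int((aug == 0).all(axis=(0, 1)).sum()))  # 1
```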
The impact of the above tricks on CPC v1 is shown in the figure below.
Experiments
The experiments are not summarized in depth here; only a few interesting parts follow.
ResNet-200 is pretrained with supervision and fine-tuned with a linear classifier on top; ResNet-33 is pretrained with CPC v2 and fine-tuned with a linear classifier on top (in this case the feature extractor is fine-tuned as well, rather than frozen).
As the table above shows, the ResNet-33 pretrained with CPC v2 outperforms ResNet-200 when the amount of data is small, and is still better even with all the training data, despite ResNet-33 having less model capacity than ResNet-200. Self-supervision clearly has great potential.