Paper recommendation: ReLICv2, can new self-supervised learning surpass supervised learning on ResNets?
2022-06-11 04:55:00 [Shenlan Shenyan AI]
Can self-supervised ResNets outperform supervised learning on ImageNet without labels?
This article introduces a recent paper that advances self-supervised learning. The paper was published by DeepMind and is nicknamed ReLICv2.
The paper by Tomasev et al., "Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?" [2], proposes technical improvements to ReLIC, which was introduced in the paper "Representation learning via invariant causal mechanisms" [1]. The core of their method is an added Kullback-Leibler divergence loss, computed over the probability formulation of the classical contrastive learning objective. In addition, a novel augmentation scheme is introduced, building on lessons from other related papers.
This article is kept as simple as possible, so that even readers without prior knowledge can follow along.

01 Self-supervised and unsupervised pre-training in computer vision
Before delving into the paper, it is worth quickly reviewing what self-supervised pre-training is all about. If you already know about self-supervised learning or are familiar with self-supervised pre-training, you can skip this part.
Traditionally, computer vision models have been trained with supervised learning. This means that humans look at images and create all kinds of labels for them, so that the model can learn the patterns of those labels. For example, human annotators assign class labels to images or draw bounding boxes around the objects in them. But anyone who has ever worked on a labeling task knows how much work it is to create a sufficiently large training dataset.
Self-supervised learning, by comparison, does not require any human-created labels; the model supervises its own learning. In computer vision, the most common way to set up this self-supervision is to crop an image in different ways or apply different augmentations to it, and pass the modified inputs to the model. Even though the augmented images look different, they still contain the same visual information, i.e., the same objects. This lets the model learn similar latent representations (output vectors) for the same objects.
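To make this idea concrete, here is a minimal sketch (my own illustration, not code from the paper) of the invariance principle: two randomly augmented views of the same image should map to similar representation vectors. The encoder and the augmentations are generic placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms as T
from torchvision.models import resnet50

encoder = resnet50()
encoder.fc = nn.Identity()  # expose the 2048-d pooled features as the representation

augment = T.Compose([
    T.RandomResizedCrop(224),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
])

image = torch.rand(3, 256, 256)      # stand-in for a real image tensor
view1 = augment(image).unsqueeze(0)  # two random views of the same image
view2 = augment(image).unsqueeze(0)

with torch.no_grad():
    z1, z2 = encoder(view1), encoder(view2)

# Self-supervised training pushes this similarity up for views of one image.
print(F.cosine_similarity(z1, z2))
```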
We can then use such a pre-trained model for transfer learning. These models are trained on, for example, 10% of the labeled data to perform downstream tasks such as object detection and semantic segmentation.
02 Contributions of the paper
As with many other self-supervised pre-training techniques, the first step of the ReLICv2 training process is data augmentation. In the paper, the authors first mention the use of previously successful augmentation schemes.
The first is the augmentation used in SwAV. In contrast to previous work, SwAV creates not only two different crops of the input image, but up to six additional ones. These can have different sizes, for example 224x224 and 96x96; the most successful configuration uses two large and six small crops. If you want to know more about the SwAV augmentation scheme, please read the original paper.
The second set of previously described augmentations comes from SimCLR. This scheme is now used by almost all papers in the field. Images are processed by applying random horizontal flips, color distortion, Gaussian blur, and solarization. If you want more information about SimCLR, please read the original paper.
However, ReLICv2 also contributes a novel augmentation technique: removing the background from the objects in an image. To achieve this, they train a background-removal model in an unsupervised way on some of the ImageNet data. The authors found this augmentation to be most effective when applied with a probability of 10%.
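To illustrate how these pieces could fit together, here is a hedged sketch of such a multi-crop pipeline in torchvision. The crop counts and sizes follow the SwAV-style scheme above, the 10% background-removal probability comes from the text, and remove_background is a hypothetical stand-in for the authors' unsupervised background-removal model.

```python
import random
import torch
from torchvision import transforms as T

def remove_background(img):
    # Hypothetical stand-in for the unsupervised background-removal model;
    # a real implementation would mask out non-object pixels here.
    return img

def augmentations(size):
    # SimCLR/BYOL-style view augmentations at a given crop size.
    return T.Compose([
        T.RandomResizedCrop(size),
        T.RandomHorizontalFlip(),
        T.RandomApply([T.ColorJitter(0.4, 0.4, 0.2, 0.1)], p=0.8),
        T.RandomGrayscale(p=0.2),
        T.GaussianBlur(kernel_size=23),
        T.RandomSolarize(threshold=0.5, p=0.2),
    ])

def multi_crop_views(image, p_bg=0.1):
    if random.random() < p_bg:          # background removal with 10% probability
        image = remove_background(image)
    large = [augmentations(224)(image) for _ in range(2)]  # two large crops
    small = [augmentations(96)(image) for _ in range(6)]   # six small crops
    return large + small

views = multi_crop_views(torch.rand(3, 256, 256))
```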

Once the image has been augmented and cropped multiple times, the outputs are passed through an encoder network and a target network. While the encoder network is updated using backpropagation, the target network receives its updates through a momentum calculation similar to the MoCo framework.
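The momentum update can be sketched in a few lines. This is the generic MoCo/BYOL-style exponential-moving-average rule rather than code from the paper, and the decay value of 0.99 is purely illustrative.

```python
import copy
import torch
import torch.nn as nn

encoder = nn.Linear(8, 4)        # stand-in for the real encoder network
target = copy.deepcopy(encoder)  # the target starts as a copy of the encoder
for p in target.parameters():
    p.requires_grad = False      # the target is never trained by backprop

def momentum_update(encoder, target, decay=0.99):
    """EMA update: target weights drift slowly toward the encoder weights."""
    with torch.no_grad():
        for p_enc, p_tgt in zip(encoder.parameters(), target.parameters()):
            p_tgt.mul_(decay).add_(p_enc, alpha=1.0 - decay)

momentum_update(encoder, target)  # called once after every optimizer step
```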
The overall goal of ReLICv2 is to learn an encoder that produces consistent output vectors for images of the same class. To this end, the authors formulate a novel loss function. They start from the standard contrastive negative log-likelihood, which at its core has a similarity function comparing the anchor image (essentially the input image) with positive examples (augmented versions of the image) and negative examples (other images in the same batch).

The ReLICv2 loss function consists of the negative log-likelihood plus the Kullback-Leibler divergence between the anchor view and the positive views.
This loss is extended by the probability formulation of the contrastive objective: the Kullback-Leibler divergence between the likelihood of the anchor image and the likelihood of the positive image. This forces the network to keep similar images from being pushed too close together while allowing dissimilar images to stay far apart, avoiding the extreme clustering that can lead to learning collapse and creating a more balanced distribution across clusters. The additional loss term can therefore be seen as a regularizer for the self-supervised model. The loss function includes two hyperparameters, alpha and beta, which weight the two loss terms separately.
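As a rough illustration of how these pieces combine, here is a minimal sketch of such a loss in PyTorch. The cosine-similarity and temperature formulation follows standard contrastive practice; the exact ReLICv2 loss differs in its details, so treat this purely as a conceptual sketch.

```python
import torch
import torch.nn.functional as F

def relic_style_loss(anchor, positive, negatives, alpha=1.0, beta=1.0, tau=0.1):
    """anchor, positive: (d,) two views of the same image.
    negatives: (n, d) representations of other images in the batch."""
    # Each view is scored against the other view plus the negatives.
    cand_a = torch.cat([positive.unsqueeze(0), negatives])  # targets for the anchor
    cand_p = torch.cat([anchor.unsqueeze(0), negatives])    # targets for the positive
    logits_a = F.cosine_similarity(anchor.unsqueeze(0), cand_a) / tau
    logits_p = F.cosine_similarity(positive.unsqueeze(0), cand_p) / tau

    # Contrastive negative log-likelihood: the matching view sits at index 0.
    nll = F.cross_entropy(logits_a.unsqueeze(0), torch.zeros(1, dtype=torch.long))

    # KL divergence between the two views' distributions over the candidates,
    # i.e. the invariance regularizer described above.
    kl = F.kl_div(F.log_softmax(logits_a, dim=0),
                  F.softmax(logits_p, dim=0), reduction="sum")

    return alpha * nll + beta * kl

loss = relic_style_loss(torch.randn(128), torch.randn(128), torch.randn(16, 128))
```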
Adding all of these methods together has proven successful, so let's take a closer look at the results presented in the paper.
03 Results
As the title of the paper states, the point ReLICv2 tries to prove is that self-supervised pre-training can outperform supervised learning when the compared encoder networks all use the same architecture. For their work, the authors chose the classic ResNet-50.

Results on ImageNet with differently pre-trained ResNet-50 models.
When using the same ResNet-50 and training only a linear layer on ImageNet-1K while freezing all other weights, ReLICv2 shows a considerable advantage over existing methods. Compared with the original ReLIC paper, the introduced improvements even bring a performance advantage.
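The linear-evaluation protocol used here (freeze every pre-trained weight, train only a new linear classifier) can be sketched as follows. The checkpoint path and the batch are hypothetical placeholders.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()
# backbone.load_state_dict(torch.load("relicv2_resnet50.pt"))  # hypothetical checkpoint

for p in backbone.parameters():
    p.requires_grad = False           # freeze all pre-trained weights
backbone.fc = nn.Linear(2048, 1000)   # fresh, trainable linear head for ImageNet-1K

optimizer = torch.optim.SGD(backbone.fc.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on a dummy batch.
images = torch.rand(8, 3, 224, 224)
labels = torch.randint(0, 1000, (8,))
loss = criterion(backbone(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```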

Accuracy improvements over supervised pre-trained models on different datasets.
When comparing transfer-learning performance on other datasets, ReLICv2 continues to show impressive performance compared with other methods such as NNCLR and BYOL. This further shows that ReLICv2 is a new, state-of-the-art self-supervised pre-training method. Evaluations on such additional datasets are not often reported in other papers.

Visualization of the clusters learned by ReLICv2 and BYOL. The bluer a point, the closer it lies to its corresponding class cluster.
This chart shows that ReLICv2 learns representations that lie closer to their class clusters than other frameworks such as BYOL. This again indicates that, compared with other methods, its techniques make it possible to create finer-grained clusters.
04 Summary
In this article I introduced ReLICv2, a new self-supervised pre-training method that shows very good experimental results.
By incorporating the probability formulation of the contrastive learning objective, and by adding a proven as well as a novel augmentation scheme, this technique pushes the field of visual self-supervised pre-training forward.
I hope this article gives you a good initial understanding of ReLICv2, but there is still a lot more to discover. I therefore recommend reading the original paper, even if you are new to the field. You have to start somewhere ;) I hope you enjoyed this paper explanation. If you have comments about the article or spot any mistakes, feel free to leave a comment.
References
[1] Mitrovic, Jovana, et al. “Representation learning via invariant causal mechanisms.” arXiv preprint arXiv:2010.07922 (2020).
[2] Tomasev, Nenad, et al. “Pushing the limits of self-supervised ResNets: Can we outperform supervised learning without labels on ImageNet?.” arXiv preprint arXiv:2201.05119 (2022).
Author of this article: Leon Sick