【Transformer】TransMix: Attend to Mix for Vision Transformers
2022-07-29 06:03:00 【Dull cat】

Code: https://github.com/Beckschen/TransMix
1. Background and motivation
Mixup-based data augmentation is very useful for ViT-style architectures, because these models overfit easily. However, previous mixup-based methods carry an implicit prior: the linear interpolation ratio of the targets is the same as the interpolation ratio of the input images. As a result, a mixed image may contain no valid object from one of its source images while the label of that image is still present in the target.
To address this, the authors propose TransMix, which mixes the labels based on the attention map.
2. Method
2.1 Mixup
Raw input:
Mixup takes a pair of images $x_A$ and $x_B$, together with their labels $y_A$ and $y_B$, as input.
Processing of inputs and ground truth:
The two images are combined into a virtual training sample $\lambda x_A + (1-\lambda)x_B$ with the target $\lambda y_A + (1-\lambda)y_B$, where $\lambda \in [0,1]$ is a random number drawn from a Beta distribution.
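The Mixup rule above can be sketched in a few lines of NumPy. This is only an illustration, not the authors' code: the function name `mixup`, the `alpha` parameter of the Beta distribution, and the use of one-hot label vectors are assumptions made for the example.

```python
import numpy as np

def mixup(x_a, y_a, x_b, y_b, alpha=1.0, rng=None):
    """Mix two images and their one-hot labels with one shared ratio lambda.

    alpha is a hypothetical hyperparameter: lambda ~ Beta(alpha, alpha).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)             # lambda in [0, 1]
    x_mix = lam * x_a + (1.0 - lam) * x_b    # same ratio for every pixel
    y_mix = lam * y_a + (1.0 - lam) * y_b    # and the same ratio for the label
    return x_mix, y_mix, lam
```

Note that the same `lam` is applied to the pixels and to the labels, which is exactly the prior that TransMix questions.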
As shown in Figure 1, background pixels do not contribute to the label in the same way as foreground pixels; that is, not all pixels contribute equally to the label.
This paper therefore focuses on how to use a learnable method to bridge the gap between the input space and the label space.
The authors find that the attention map produced by a vision transformer is well suited to this task.
As shown in Figure 1, the authors use the attention map to compute $\lambda$, so the label is re-weighted: each pixel carries a different weight, and the pixels of the image are no longer all mixed with the same value. Because only the attention map is used, the method can be applied to any ViT-based model and introduces no additional parameters.

2.2 TransMix
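CutMix data augmentation:
CutMix is a simple augmentation method that combines two input-label pairs $(x_A, y_A)$ and $(x_B, y_B)$ to create a new training sample $(\tilde{x}, \tilde{y})$:
$$\tilde{x} = M \odot x_A + (\mathbf{1} - M) \odot x_B, \qquad \tilde{y} = \lambda y_A + (1-\lambda) y_B$$
- $M\in\{0, 1\}^{HW}$ is a binary mask that decides which locations are dropped and which are kept; $\odot$ is element-wise multiplication, and in CutMix $\lambda$ is set by the area ratio of the mask.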
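To make the role of the binary mask concrete, here is a minimal NumPy sketch of CutMix. It assumes a single rectangular box, images of shape `(C, H, W)`, and the convention that `1` in the mask keeps a pixel from image A and `0` takes it from image B; the helper names `rand_box_mask` and `cutmix` are hypothetical, and the box-sampling details are one common choice rather than anything stated in this post.

```python
import numpy as np

def rand_box_mask(h, w, lam, rng):
    """Binary mask M of shape (h, w): 1 = keep x_A, 0 = paste from x_B."""
    cut = np.sqrt(1.0 - lam)                 # box side ratio for target area 1 - lam
    ch, cw = int(h * cut), int(w * cut)
    cy, cx = rng.integers(h), rng.integers(w)  # random box centre
    y0, y1 = np.clip(cy - ch // 2, 0, h), np.clip(cy + ch // 2, 0, h)
    x0, x1 = np.clip(cx - cw // 2, 0, w), np.clip(cx + cw // 2, 0, w)
    mask = np.ones((h, w), dtype=np.float32)
    mask[y0:y1, x0:x1] = 0.0
    return mask

def cutmix(x_a, y_a, x_b, y_b, alpha=1.0, rng=None):
    """Paste a box of x_B into x_A and mix the labels by area."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = x_a.shape[-2:]
    mask = rand_box_mask(h, w, rng.beta(alpha, alpha), rng)
    x_mix = mask * x_a + (1.0 - mask) * x_b
    lam = float(mask.mean())                 # area fraction actually kept from x_A
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix, mask, lam
```

The label weight here depends only on the box area, regardless of whether the box covers an object or plain background.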
TransMix:
$A$ is the attention map from the cls token to the image patch tokens; it represents the importance of each patch to the final classification result. For multi-head attention, the authors average over the heads.
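As a rough illustration of how such a map could be read out of a transformer block, the sketch below assumes an attention tensor of shape `(heads, 1 + p, 1 + p)` whose token 0 is the cls token. The function name `cls_attention` is hypothetical, and the final renormalization (so that the weights over the `p` patches sum to 1) is an assumption of this sketch, not something the post specifies.

```python
import numpy as np

def cls_attention(attn):
    """Extract the cls-to-patch attention map A of shape (p,).

    attn: softmax attention of shape (heads, 1 + p, 1 + p), token 0 = cls.
    """
    cls_to_patches = attn[:, 0, 1:]      # row of the cls query, patch keys only
    a = cls_to_patches.mean(axis=0)      # average over the heads
    return a / a.sum()                   # assumption: renormalize to sum to 1
```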
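The attention map is then used to re-weight the labels:
$$\lambda = A \cdot \downarrow(M)$$
The down arrow $\downarrow(\cdot)$ denotes nearest-neighbor interpolation, which downsamples $M$ from its original $HW$ pixels to $p$ pixels, matching the $p$ patch tokens covered by $A$.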
In this way, the label weight of each data point is assigned dynamically by the network, according to its response in the attention map.
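The label re-weighting can be sketched as follows. This is not the authors' implementation: the helper names are hypothetical, the mask convention (`1` = pixel kept from image A) and the attention map summing to 1 over the patches are assumptions, and the nearest-neighbor downsampling is written by hand with integer indexing to keep the example self-contained.

```python
import numpy as np

def downsample_nearest(mask, grid_h, grid_w):
    """Nearest-neighbor resize of an (H, W) mask to (grid_h, grid_w)."""
    h, w = mask.shape
    rows = (np.arange(grid_h) * h) // grid_h   # source row for each patch row
    cols = (np.arange(grid_w) * w) // grid_w   # source column for each patch column
    return mask[np.ix_(rows, cols)]

def transmix_lambda(attn_map, mask, grid_h, grid_w):
    """lambda = A . down(M): attention mass on the patches kept from x_A."""
    m = downsample_nearest(mask, grid_h, grid_w).reshape(-1)   # shape (p,)
    return float(np.dot(attn_map, m))

def transmix_label(y_a, y_b, lam):
    """Mix the labels with the attention-based lambda."""
    return lam * y_a + (1.0 - lam) * y_b
```

For example, with a 2x2 patch grid, a mask that keeps the left half of image A, and an attention map of `[0.7, 0.1, 0.1, 0.1]`, the area ratio is 0.5 but the attention-based weight of `y_A` is about 0.8, because the most attended patch comes from image A.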
Pseudocode:

3. Results



The TransMix visualization is as follows:
The first row shows area-based label assignment, where a patch of image A is pasted onto image B. TransMix uses the attention map to correct the labels, and it can raise the label weight of salient regions.

