Swin Transformer details-2
2022-07-03 10:00:00 【Star soul is not a dream】


1. cyclic shift + reverse cyclic shift

depths = [2, 2, 6, 2]  # number of MSA blocks in each stage
# within one stage, depth is the corresponding entry of depths
[SwinTransformerBlock(..., shift_size=0 if (i % 2 == 0) else window_size // 2, ...)
 for i in range(depth)]
Therefore, in the diagram above, W-MSA uses shift_size = 0, while SW-MSA uses mask=self.attn_mask and shift_size = window_size // 2 = 7 // 2 = 3.
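As a small illustration of the alternation described above, the following sketch lists the shift size each block would receive in every stage. The helper name `shift_sizes` is hypothetical and only mirrors the expression in the list comprehension; it is not part of the original code.

```python
window_size = 7
depths = [2, 2, 6, 2]  # blocks per stage, as in the snippet above


def shift_sizes(depth, window_size):
    """Hypothetical helper: shift size of each block in one stage."""
    # Even-indexed blocks use W-MSA (no shift); odd-indexed blocks use SW-MSA.
    return [0 if i % 2 == 0 else window_size // 2 for i in range(depth)]


schedule = [shift_sizes(d, window_size) for d in depths]
print(schedule)  # [[0, 3], [0, 3], [0, 3, 0, 3, 0, 3], [0, 3]]
```

Every stage therefore starts with a W-MSA block and alternates with SW-MSA blocks.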
import numpy as np
import torch

window_size = 7
shift_size = 7 // 2

# Construct a 2-D tensor (the commented line builds the 4-D version used in the model)
# x = np.arange(14*14*4*96).reshape(1, 14, 14, 96*4)
x = np.arange(14 * 14).reshape(14, 14)
x = torch.from_numpy(x)
print(x)

# cyclic shift
if shift_size > 0:
    shifted_x = torch.roll(x, shifts=(-shift_size, -shift_size), dims=(0, 1))
    # in the model: shifted_x = torch.roll(x, shifts=(-self.shift_size, -self.shift_size), dims=(1, 2))
    print("---------cyclic shift---------")
else:
    shifted_x = x
print(shifted_x)

# reverse cyclic shift
if shift_size > 0:
    x = torch.roll(shifted_x, shifts=(shift_size, shift_size), dims=(0, 1))
    print("---------reverse cyclic shift---------")
    print(x)
else:
    x = shifted_x
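To make the index arithmetic behind the roll explicit, here is a minimal pure-Python sketch of a 2-D cyclic shift on nested lists. The function `roll2d` is a hypothetical stand-in written for illustration, not the torch implementation; it is only meant to follow the same convention as `torch.roll`, where a negative shift moves elements toward index 0 and wraps around.

```python
def roll2d(grid, shift_h, shift_w):
    """Hypothetical helper: element (i, j) moves to ((i + shift_h) % H, (j + shift_w) % W)."""
    H, W = len(grid), len(grid[0])
    out = [[None] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            out[(i + shift_h) % H][(j + shift_w) % W] = grid[i][j]
    return out


H = W = 14
shift_size = 3
x = [[i * W + j for j in range(W)] for i in range(H)]

shifted = roll2d(x, -shift_size, -shift_size)        # cyclic shift
restored = roll2d(shifted, shift_size, shift_size)   # reverse cyclic shift

print(shifted[0][0])   # 45, i.e. the value that was at row 3, column 3
print(restored == x)   # True
```

The reverse shift undoes the forward shift exactly, which is why the block can shift, attend within windows, and then shift back without losing any position.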
2. SW-MSA
import torch

def window_partition(x, window_size):
    # as in the official Swin Transformer code:
    # (B, H, W, C) -> (num_windows*B, window_size, window_size, C)
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)

shift_size = 3
window_size = 7
if shift_size > 0:
    input_resolution = (14, 14)
    # calculate attention mask for SW-MSA
    H, W = input_resolution
    # img_mask = torch.zeros((H, W))  # H W
    img_mask = torch.zeros((1, H, W, 1))  # 1 H W 1
    h_slices = (slice(0, -window_size),
                slice(-window_size, -shift_size),
                slice(-shift_size, None))
    w_slices = (slice(0, -window_size),
                slice(-window_size, -shift_size),
                slice(-shift_size, None))
    cnt = 0
    for h in h_slices:
        for w in w_slices:
            # img_mask[h, w] = cnt
            img_mask[:, h, w, :] = cnt
            cnt += 1
    mask_windows = window_partition(img_mask, window_size)  # nW, window_size, window_size, 1; here nW = 4
    # reassemble the 4 windows into one H x W map, for display only
    outputs = mask_windows.view(-1, window_size, window_size)
    outputs_1 = torch.stack((outputs[0], outputs[1]), dim=1).view(-1, window_size, window_size * 2)
    outputs_2 = torch.stack((outputs[2], outputs[3]), dim=1).view(-1, window_size, window_size * 2)
    outputs = torch.stack((outputs_1, outputs_2), dim=1).view(-1, H, W)
    print(outputs)
    mask_windows = mask_windows.view(-1, window_size * window_size)  # nW, window_size * window_size
    print(mask_windows)
    # (nW, 1, window_size * window_size) - (nW, window_size * window_size, 1)
    # broadcasts to (nW, window_size * window_size, window_size * window_size)
    attn_mask = mask_windows.unsqueeze(1) - mask_windows.unsqueeze(2)
    print(attn_mask[1][0] == attn_mask[1][4])
    print(attn_mask[1][4].view(window_size, window_size))
    # replace non-zero entries with -100 and keep zero entries at 0
    attn_mask = attn_mask.masked_fill(attn_mask != 0, float(-100.0)).masked_fill(attn_mask == 0, float(0.0))
    print(attn_mask.shape)
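The following pure-Python sketch reproduces the region labelling above without torch and reports which region ids end up in each 7x7 window. The helper names `region_labels` and `regions_per_window` are hypothetical and exist only for this illustration; the slices and the counter follow the code above.

```python
def region_labels(H, W, window_size, shift_size):
    """Hypothetical helper: label each position of the shifted map with a region id (0..8)."""
    slices = (slice(0, -window_size),
              slice(-window_size, -shift_size),
              slice(-shift_size, None))
    labels = [[0] * W for _ in range(H)]
    cnt = 0
    for hs in slices:
        for ws in slices:
            # range(H)[hs] resolves the negative slice bounds to concrete row indices
            for i in range(H)[hs]:
                for j in range(W)[ws]:
                    labels[i][j] = cnt
            cnt += 1
    return labels


def regions_per_window(labels, window_size):
    """Hypothetical helper: sorted region ids in each window, in row-major window order."""
    H, W = len(labels), len(labels[0])
    result = []
    for top in range(0, H, window_size):
        for left in range(0, W, window_size):
            ids = {labels[i][j]
                   for i in range(top, top + window_size)
                   for j in range(left, left + window_size)}
            result.append(sorted(ids))
    return result


labels = region_labels(14, 14, window_size=7, shift_size=3)
windows = regions_per_window(labels, 7)
print(windows)  # [[0], [1, 2], [3, 6], [4, 5, 7, 8]]
```

Window 0 contains a single region, so its attention mask is all zeros; the other three windows mix two or four regions, and only pairs of positions with different region ids receive -100.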
# in SwinTransformerBlock.forward
attn_windows = self.attn(x_windows, mask=self.attn_mask)  # nW*B, window_size*window_size, C
# inside the window attention module (WindowAttention in the official code)
mask = attn_mask
if mask is not None:
    nW = mask.shape[0]  # number of windows one image is divided into
    # e.g. torch.Size([128, 4, 12, 49, 49]) + torch.Size([1, 4, 1, 49, 49]), combined by broadcasting
    attn = attn.view(B_ // nW, nW, self.num_heads, N, N) + mask.unsqueeze(1).unsqueeze(0)
    attn = attn.view(-1, self.num_heads, N, N)
    attn = self.softmax(attn)
Adding the mask to attn puts -100 at every position that, for the current query (such as [1][4] above), belongs to a region that was not adjacent in the original image. After the softmax those positions get a weight close to zero, which is equivalent to leaving them out of the attention computation.
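A minimal numeric sketch of why adding -100 is enough, using only the standard library. The logits and the mask row below are made-up values for a single query with four keys, chosen for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [1.0, 2.0, 0.5, 1.5]        # hypothetical attention logits for one query
mask = [0.0, 0.0, -100.0, -100.0]    # assume the last two keys lie in a different region

masked = [s + m for s, m in zip(scores, mask)]
weights = softmax(masked)
print(weights)
```

The two masked keys end up with weights many orders of magnitude below the unmasked ones, so the output is effectively a softmax over the first two keys only.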
3. Complexity comparison of W-MSA and MSA + 4. Overall flow chart
Please refer to: Detailed explanation of the paper: Swin Transformer - Zhihu
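For a quick sense of the difference, the sketch below evaluates the two complexity formulas given in the Swin Transformer paper, Ω(MSA) = 4hwC² + 2(hw)²C and Ω(W-MSA) = 4hwC² + 2M²hwC, where h x w is the feature-map size, C the channel count and M the window size. The concrete values (a 56x56 map with C = 96 and M = 7) are an assumed example, not taken from this post.

```python
def msa_flops(h, w, C):
    """Global multi-head self-attention: quadratic in the number of tokens h*w."""
    return 4 * h * w * C ** 2 + 2 * (h * w) ** 2 * C


def wmsa_flops(h, w, C, M):
    """Window-based self-attention: linear in h*w for a fixed window size M."""
    return 4 * h * w * C ** 2 + 2 * M ** 2 * h * w * C


h, w, C, M = 56, 56, 96, 7  # assumed example values
msa = msa_flops(h, w, C)
wmsa = wmsa_flops(h, w, C, M)
print(msa, wmsa, msa / wmsa)
```

Only the second term differs: the factor (hw)² for MSA becomes M²hw for W-MSA, so the saving grows with the resolution of the feature map.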