【ARXIV2204】Vision Transformers for Single Image Dehazing
2022-07-28 05:00:00 【AI frontier theory group @ouc】

Paper: https://arxiv.org/abs/2204.03883
Code: https://github.com/IDKiro/DehazeFormer
1. Research motivation
The authors propose DehazeFormer for single image dehazing, taking inspiration from Swin Transformer. The most interesting parts of the paper are reflection padding and the way attention is computed.
2. Main method
The overall framework (see the architecture figure in the paper) is a 5-stage U-Net in which the convolution blocks are replaced by DehazeFormer blocks.

Reflection padding
Swin Transformer uses shifted windows to exchange information between windows, but the authors argue that this operation handles image edge regions poorly. In classification tasks the target is usually near the center of the image, so shifted windows cause no problem. In image restoration, however, edge regions are just as important as the center, so the operation is unsuitable. The authors therefore propose a reflection padding operation, illustrated in the paper.

In the paper's example, the input is 8x8 and the window is 4x4. The edge regions are reflected by 2 patches on each side, so the image grows to 12x12 and can be split into 3x3 = 9 windows. Attention is computed locally within each of these 9 windows, and afterwards the central 8x8 region is cropped out.
The authors also point out that this operation adds computation and memory cost.
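The pad, partition, and crop steps of the 8x8 example can be sketched in NumPy. This is only an illustration of the shapes involved, not the authors' implementation: `window_partition` and `window_merge` are hypothetical helper names, the image is a single-channel array, and the attention step is replaced by an identity placeholder.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W) map into non-overlapping (ws, ws) windows."""
    h, w = x.shape
    x = x.reshape(h // ws, ws, w // ws, ws)
    return x.transpose(0, 2, 1, 3).reshape(-1, ws, ws)

def window_merge(windows, h, w):
    """Inverse of window_partition: reassemble windows into an (H, W) map."""
    ws = windows.shape[-1]
    x = windows.reshape(h // ws, w // ws, ws, ws)
    return x.transpose(0, 2, 1, 3).reshape(h, w)

image = np.arange(64, dtype=np.float32).reshape(8, 8)
ws = 4          # window size from the example
pad = ws // 2   # 2 patches reflected on every side

# 8x8 -> 12x12; border values are mirrored from the image interior.
padded = np.pad(image, pad, mode="reflect")

# 12x12 -> 3x3 = 9 windows of size 4x4.
windows = window_partition(padded, ws)

# Placeholder (assumption): real code would run window attention here.
processed = windows

# Merge the windows back and crop the central 8x8 region.
merged = window_merge(processed, *padded.shape)
cropped = merged[pad:-pad, pad:-pad]

print(padded.shape, windows.shape, cropped.shape)
```

The padded map has 144 positions instead of 64, which makes the extra computation and memory cost mentioned above concrete for this toy size.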
W-MHSA with parallel convolution
The authors note that the aggregation weights of MHSA are dynamic and normalized, and argue that static, learnable, and unconstrained aggregation weights can complement them. They therefore apply an additional convolution to V. In the paper's overall architecture diagram, V is followed by a convolution layer whose output is added to the attention result.
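A minimal NumPy sketch of this idea, under several simplifying assumptions that are not from the paper: a single head, a single 4x4 window treated as the whole feature map, no relative position bias, random projection matrices, and a depthwise 3x3 convolution with reflect padding as the parallel branch. The helper `depthwise_conv3x3` is a hypothetical name.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def depthwise_conv3x3(v_map, kernel):
    """v_map: (H, W, C); kernel: (3, 3, C). One 3x3 filter per channel."""
    h, w, _ = v_map.shape
    padded = np.pad(v_map, ((1, 1), (1, 1), (0, 0)), mode="reflect")
    out = np.zeros_like(v_map)
    for dy in range(3):
        for dx in range(3):
            out += padded[dy:dy + h, dx:dx + w, :] * kernel[dy, dx, :]
    return out

rng = np.random.default_rng(0)
H, W, C = 4, 4, 8
x = rng.standard_normal((H, W, C))

# Assumption: random matrices stand in for learned Q/K/V projections.
w_q, w_k, w_v = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
kernel = rng.standard_normal((3, 3, C)) * 0.1   # static, unconstrained weights

tokens = x.reshape(H * W, C)
q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v

# Attention branch: weights depend on the input and each row sums to 1.
attn = softmax(q @ k.T / np.sqrt(C))
attn_out = attn @ v

# Convolution branch: fixed weights applied to V on its 2D layout.
conv_out = depthwise_conv3x3(v.reshape(H, W, C), kernel).reshape(H * W, C)

# The two branches are summed.
out = attn_out + conv_out
print(attn.shape, out.shape)
```

The contrast between the branches is visible in the weights themselves: every row of `attn` is non-negative and sums to 1, while `kernel` can hold any real values and does not change with the input.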
For the experiments, refer to the original paper; they are not covered here.