TensorFlow Introductory Tutorial (38): U2-Net
2022-07-24 17:33:00 【51CTO】
Today we look at U2-Net, an improved variant of U-Net. The model comes from the 2020 paper "U2-Net: Going Deeper with Nested U-Structure for Salient Object Detection". Once the idea behind the model is clear, the same improvement can be applied on top of VNet.
I. Advantages of U2-Net
1. U2-Net is a simple yet powerful deep network architecture for salient object detection. It is built as a two-level nested U-structure. The design has two advantages: (1) the residual U-block (RSU) mixes receptive fields of different sizes, so it captures more contextual information at different scales; (2) because the RSU blocks use pooling operations, the whole network can be made deeper without significantly increasing the computational cost.
2. U2-Net was designed to answer two questions: (1) Can a new network be trained from scratch and still match or beat models built on existing pre-trained backbones? (2) Can the network go deeper while keeping high-resolution feature maps and low memory and compute cost?
II. U2-Net network structure
1. Residual U-block (RSU)
The paper's figure compares several common convolution blocks. Blocks (a) to (c) capture only local features: their convolution kernels are small, so they cannot capture global features. The most direct way to obtain global information from a high-resolution feature map is to enlarge the receptive field. Block (d) does this with dilated (atrous) convolutions, extracting both local and non-local features, but at the cost of more computation and memory.

Following the idea of U-Net, the paper proposes the residual U-block (RSU), which has three parts:
a. An input convolution layer: a conventional convolution layer that extracts local features.
b. A U-Net-like encoder-decoder of depth L, which takes the output of the input convolution layer and extracts and encodes multi-scale context information. The larger L is, the larger the receptive field and the richer the local and global features. The input feature map is progressively downsampled by pooling and convolved to extract multi-scale features; these are then progressively upsampled, concatenated, and convolved back into a high-resolution feature map.
c. A residual connection that fuses the local features with the multi-scale features by summation.
The design idea is to extract multi-scale features directly inside each residual block. Because the U-structure is small and most of its operations run on downsampled feature maps, the block is computationally efficient.
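The claim that a larger L gives a larger receptive field can be checked with simple arithmetic. The sketch below uses plain Python rather than TensorFlow so that it runs without dependencies. The helper names (`receptive_field`, `rsu_encoder_layers`) and the exact layer layout are assumptions made for illustration, based on the description above, and are not the authors' code.

```python
# Receptive-field arithmetic for stacked convolution / pooling layers.
# Each layer is described as (kernel_size, stride, dilation).

def receptive_field(layers):
    """Return the receptive field (in input pixels) after the last layer."""
    rf, jump = 1, 1  # jump = spacing in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


def rsu_encoder_layers(L):
    """Hypothetical layer list for the downsampling path of an RSU of depth L.

    Assumed layout: one input 3x3 conv, then L-1 3x3 convs with a 2x2
    stride-2 pool after each of them except the last, then one 3x3 conv
    with dilation 2 at the coarsest scale.
    """
    layers = [(3, 1, 1)]              # input convolution layer
    for i in range(L - 1):
        layers.append((3, 1, 1))      # 3x3 conv at the current scale
        if i < L - 2:
            layers.append((2, 2, 1))  # 2x2 pooling halves the resolution
    layers.append((3, 1, 2))          # dilated 3x3 conv at the bottom
    return layers


plain = [(3, 1, 1)] * 3               # three plain 3x3 convs, no pooling
print("plain x3:", receptive_field(plain))
for L in (4, 5, 6, 7):
    print(f"RSU-like, L={L}:", receptive_field(rsu_encoder_layers(L)))
```

Under these assumptions the receptive field grows quickly with L, while each extra level works on a feature map with a quarter of the pixels of the previous one, which is why the added depth stays cheap.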

2. U2-Net structure
U2-Net is a U-shaped network made up of 11 stages, each of which is a residual U-block (RSU) with its own configuration. The nested U-structure therefore extracts multi-scale features within each stage more effectively and aggregates multi-scale features across stages. U2-Net has three parts:
a. A six-stage encoder built from RSU blocks, where L is chosen according to the resolution of the input feature map. The first four stages use the pooling version of the RSU; the last two use the dilated-convolution version, because the feature maps are already low-resolution at that depth and further pooling would lose information.
b. A five-stage decoder, where each stage has the same structure as the encoder stage at the same level. The input to each stage is the upsampled output of the previous stage concatenated with the output of the encoder stage at the same level.
c. A saliency map fusion module. Six saliency probability maps, taken from the five decoder stages and the last encoder stage, are produced by a 3x3 convolution and a sigmoid function. They are resized to the size of the original image and concatenated, and a 1x1 convolution followed by a sigmoid function generates the fused probability map.
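The stage layout described above can be written down as a small table and walked through for a given input size. This is a sketch in plain Python, not the authors' implementation. The stage names (`En_1`, `De_1`, ...), the concrete depths L (7, 6, 5, 4), and the 2x downsampling between consecutive encoder stages are assumptions recalled from the paper, and the helper functions are hypothetical.

```python
# Stage plan for a U2-Net-style network (illustrative sketch only).
# "pooled"  = RSU that pools internally
# "dilated" = RSU that uses dilated convolutions and no pooling

ENCODER = [  # (stage name, assumed RSU depth L, variant)
    ("En_1", 7, "pooled"),
    ("En_2", 6, "pooled"),
    ("En_3", 5, "pooled"),
    ("En_4", 4, "pooled"),
    ("En_5", 4, "dilated"),
    ("En_6", 4, "dilated"),
]
# Each decoder stage mirrors the encoder stage at the same level.
DECODER = [("De_" + name[3:], L, variant)
           for name, L, variant in reversed(ENCODER[:-1])]


def stage_resolutions(input_size):
    """Map each stage name to the side length of its feature map.

    Assumes 2x downsampling between consecutive encoder stages and an
    input_size divisible by 32.
    """
    sizes = {}
    size = input_size
    for name, _, _ in ENCODER:
        sizes[name] = size
        size //= 2
    for name, _, _ in DECODER:
        sizes[name] = sizes["En_" + name[3:]]
    return sizes


def side_output_scales(input_size):
    """Upsampling factor that brings each side output back to input size."""
    sizes = stage_resolutions(input_size)
    side_stages = ["En_6"] + [name for name, _, _ in DECODER]
    return {name: input_size // sizes[name] for name in side_stages}


sizes = stage_resolutions(288)
for name, L, variant in ENCODER + DECODER:
    print(f"{name}: RSU depth {L} ({variant}), {sizes[name]}x{sizes[name]}")
print(side_output_scales(288))
```

Printing the plan for a 288x288 input makes it easy to see which side outputs need the largest resize before the six maps are concatenated and fused.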

III. Experimental details and results
1. Evaluation metrics: the PR curve, maximum F-measure, mean absolute error (MAE), weighted F-measure, structure measure, and boundary-related F-measure.
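Two of these metrics, MAE and the maximum F-measure, are simple enough to sketch in a few lines. The code below is a minimal plain-Python illustration on flattened maps. The function names are hypothetical, and the default `beta_sq=0.3` is the value commonly used in salient object detection, which is assumed here rather than taken from this article.

```python
def mean_absolute_error(pred, gt):
    """MAE between a predicted probability map and a binary mask.

    Both inputs are flat lists of equal length with values in [0, 1].
    """
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold, beta_sq=0.3):
    """F-measure of the prediction binarized at `threshold`.

    beta_sq=0.3 is an assumed default (common in saliency benchmarks).
    """
    tp = sum(1 for p, g in zip(pred, gt) if p >= threshold and g == 1)
    fp = sum(1 for p, g in zip(pred, gt) if p >= threshold and g == 0)
    fn = sum(1 for p, g in zip(pred, gt) if p < threshold and g == 1)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


def max_f_measure(pred, gt, steps=255):
    """Maximum F-measure over evenly spaced thresholds in (0, 1]."""
    return max(f_measure(pred, gt, t / steps) for t in range(1, steps + 1))


pred = [0.9, 0.8, 0.4, 0.1]   # toy 4-pixel prediction
gt = [1, 1, 0, 0]             # toy 4-pixel ground truth
print("MAE:", mean_absolute_error(pred, gt))
print("max F:", max_f_measure(pred, gt))
```

In practice both metrics would be averaged over every image in a dataset; the toy lists only show the per-image computation.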
2. Training: each image is first resized to 320x320, then randomly flipped and cropped to 288x288. All convolution layer weights are initialized with Xavier initialization. The model is trained with deep supervision: a cross-entropy loss is computed between the ground truth and each side output as well as the final fused output, and the weighted sum of these terms is the loss function. In the paper, the authors set all loss weights to 1 and use the Adam optimizer with default parameters. Images are resized with bilinear interpolation.
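The deep supervision loss described here is a weighted sum of per-output cross-entropy terms. The following plain-Python sketch shows that computation on flattened probability maps. The function names and argument layout are hypothetical, and a real training script would use the framework's own binary cross-entropy on tensors instead.

```python
import math


def bce(pred, gt, eps=1e-7):
    """Mean binary cross-entropy between probabilities and a binary mask."""
    total = 0.0
    for p, g in zip(pred, gt):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(g * math.log(p) + (1 - g) * math.log(1.0 - p))
    return total / len(pred)


def deep_supervision_loss(side_outputs, fused_output, gt,
                          side_weights=None, fuse_weight=1.0):
    """Weighted sum of BCE over every side output plus the fused output."""
    if side_weights is None:
        side_weights = [1.0] * len(side_outputs)   # all weights 1, as in the text
    loss = fuse_weight * bce(fused_output, gt)
    for weight, side in zip(side_weights, side_outputs):
        loss += weight * bce(side, gt)
    return loss


gt = [1, 0]                          # toy 2-pixel ground truth
sides = [[0.5, 0.5]] * 6             # six uninformative side outputs
fused = [0.5, 0.5]                   # uninformative fused output
print(deep_supervision_loss(sides, fused, gt))
```

With every weight set to 1, the six side outputs and the fused output contribute equally, so each of the seven terms is supervised directly by the ground truth.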
3. Comparison of results
In the paper, U2-Net is compared with 20 other methods on six datasets and achieves the best results both qualitatively and quantitatively.