[medical segmentation] u2net
2022-07-01 05:38:00 【Coke Daniel】
Summary
U2-Net builds on U-Net and is a very effective model for salient object detection.
Salient object detection: segmenting the main subject out of an image.
The model is motivated by two problems in earlier work:
1. The backbones used for segmentation are mostly pre-trained models, because backbones built specifically for segmentation have not matched their performance. Problem: the pre-trained models were not designed for segmentation and pay too little attention to the local detail and global contrast information that the task depends on. Adapting them to segmentation means adding dedicated structures on top to make better use of the extracted features, and this increases computational complexity.
2. The models are generally deep. To limit GPU memory and compute overhead, the image is downsampled in the first few layers to reduce its resolution. Problem: a segmentation task needs to make good use of the rich spatial information in high-resolution images.
U2-Net addresses both problems with a two-level nested U-shaped structure. The network can be trained from scratch without relying on a pre-trained model, its feature extraction is as good as that of a pre-trained model, and no additional structure is needed to process the features. The network can also be made deeper while maintaining high resolution, and the GPU memory and compute overhead stay modest because of the many pooling operations inside the RSU blocks.
Details
Network structure
Overall, U2-Net keeps the U-shaped structure of U-Net, but every stage replaces U-Net's plain convolution block with an RSU (Residual U-block). The side output of each decoder stage is supervised by the ground truth (gt). All side outputs are then concatenated, and a convolution over the result produces the final network output.
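The fusion step can be sketched in plain Python. This is only an illustration under stated assumptions: each side output is a single-channel square map of logits, upsampling is nearest-neighbour, and the fusing convolution is 1x1, so on concatenated single-channel maps it reduces to a per-pixel weighted sum. The function names `upsample_nearest` and `fuse_side_outputs` and the weights are hypothetical, not taken from the original implementation.

```python
import math


def upsample_nearest(fmap, factor):
    """Repeat every pixel `factor` times along both axes of a 2-D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_side_outputs(side_logits, weights, bias=0.0):
    """Bring all side outputs to the finest size, then mix them per pixel.

    Assumption: a 1x1 convolution over single-channel maps stacked along
    the channel axis is a weighted sum across the maps plus a bias.
    """
    target = max(len(m) for m in side_logits)
    resized = [upsample_nearest(m, target // len(m)) for m in side_logits]
    fused = []
    for y in range(target):
        row = []
        for x in range(target):
            z = bias + sum(w * m[y][x] for w, m in zip(weights, resized))
            row.append(1.0 / (1.0 + math.exp(-z)))  # sigmoid gives a probability
        fused.append(row)
    return fused


# Two toy side outputs: a 4x4 map from a shallow stage, a 2x2 map from a deep one.
fine = [[2.0, -2.0, 2.0, -2.0] for _ in range(4)]
coarse = [[1.0, -1.0], [1.0, -1.0]]
fused = fuse_side_outputs([fine, coarse], weights=[0.5, 0.5])
```

In a real network the weights of the fusing convolution are learned, and each side output also gets its own loss before fusion.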
Earlier work also stacked or cascaded U-Nets to obtain $u \times n$-net. The author instead nests U-Nets, so the count goes into the exponent. In principle the nesting depth can be very large, giving $u^n$-net, but for practical reasons only one level of nesting is used, which gives $u^2$-net.
The number of layers in each RSU decreases as the encoder gets deeper: En_1, En_2, En_3, En_4 use RSU-7, RSU-6, RSU-5, RSU-4 respectively. The deeper blocks go to the early stages because feature extraction on high-resolution feature maps matters most, and pooling is applied along the way to reduce the size. En_5 and En_6 use RSU-4F, where F means the feature-map size does not change inside the block, so the block only performs feature extraction.
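A small sketch makes the choice of RSU depths easier to see. It assumes a 2x pooling between consecutive encoder stages and that a plain RSU-L pools L-2 times internally while an RSU-4F never pools; these counts are assumptions of the sketch, not something stated above. `ENCODER_STAGES`, `inner_min_size`, and `trace_encoder` are hypothetical names.

```python
# Hypothetical stage table: (stage name, RSU depth L, is it the "F" variant?)
ENCODER_STAGES = [
    ("En_1", 7, False),
    ("En_2", 6, False),
    ("En_3", 5, False),
    ("En_4", 4, False),
    ("En_5", 4, True),
    ("En_6", 4, True),
]


def inner_min_size(size, depth, is_f_variant):
    """Smallest feature-map side reached inside one RSU block.

    Assumption: a plain RSU-L halves the map L-2 times on its way down,
    while the F variant keeps the input size throughout.
    """
    pools = 0 if is_f_variant else depth - 2
    for _ in range(pools):
        size = max(1, size // 2)
    return size


def trace_encoder(input_size):
    """Return (stage, input side, smallest inner side) for every stage."""
    rows = []
    size = input_size
    for index, (name, depth, is_f_variant) in enumerate(ENCODER_STAGES):
        rows.append((name, size, inner_min_size(size, depth, is_f_variant)))
        if index < len(ENCODER_STAGES) - 1:
            size //= 2  # assumed 2x pooling between consecutive stages
    return rows


for name, side, inner in trace_encoder(320):
    print(f"{name}: input {side}x{side}, smallest inner map {inner}x{inner}")
```

Under these assumptions, every plain RSU bottoms out at roughly the same small inner resolution, which is one way to read why shallower blocks are enough for the later, smaller stages.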
RSU(Residual U-blocks)
RSU replaces the plain convolution block in U-Net and captures both global and local information better. Conventional 1x1 and 3x3 convolutions have small receptive fields, so they are good at capturing local information but poor at capturing global information, which segmentation often needs. Through its U-shaped structure, RSU mixes feature maps of different scales and different receptive fields, so it can capture global information at more scales.
RSU also uses the idea of residual connections. In ResNet the skip connection has to span at least two layers: if it spanned a single layer, the block would amount to a linear transformation, and composing n linear transformations has the same effect as a single one. In this paper the U-block already contains several layers, so the skip connection only needs to span one block.
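The contrast between the two kinds of skip connection can be written compactly. The notation here is an assumption introduced for illustration: $\mathcal{F}_1$ and $\mathcal{F}_2$ stand for weight layers, and $\mathcal{U}$ stands for the multi-layer U-shaped sub-network inside the block.

```latex
\begin{aligned}
\text{ResNet block:}\quad & H(x) = \mathcal{F}_2\big(\mathcal{F}_1(x)\big) + x \\
\text{RSU block:}\quad & H_{RSU}(x) = \mathcal{U}\big(\mathcal{F}_1(x)\big) + \mathcal{F}_1(x)
\end{aligned}
```

In the second line the skip connection carries the local feature $\mathcal{F}_1(x)$ around the whole U-shaped sub-network, so the multi-scale feature $\mathcal{U}(\mathcal{F}_1(x))$ is added to it rather than to the raw input.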
As for the amount of computation, the author compares RSU with several mainstream block structures. Although the cost of RSU grows linearly with depth, the coefficient is very small, so the total amount of computation stays modest and the blocks can be stacked very deep.
Loss
The loss has two parts: the loss on the side-output feature maps, and the loss on the final output feature map obtained by fusing those side outputs.
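A minimal sketch of such a two-part loss follows. It assumes binary cross-entropy for every term and unit weights by default, neither of which is specified above. `bce` and `total_loss` are hypothetical names, and predictions are flat lists of probabilities to keep the example dependency-free.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between probabilities and 0/1 labels."""
    total = 0.0
    for p, g in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= g * math.log(p) + (1 - g) * math.log(1.0 - p)
    return total / len(pred)


def total_loss(side_preds, fused_pred, target, side_weights=None, fuse_weight=1.0):
    """Weighted sum of the side-output losses plus the fused-output loss."""
    if side_weights is None:
        side_weights = [1.0] * len(side_preds)  # assumed default: equal weights
    side_term = sum(w * bce(p, target) for w, p in zip(side_weights, side_preds))
    return side_term + fuse_weight * bce(fused_pred, target)


# Toy example: two side outputs and one fused output for a two-pixel mask.
target = [1, 0]
side_preds = [[0.5, 0.5], [0.5, 0.5]]
fused_pred = [0.5, 0.5]
loss = total_loss(side_preds, fused_pred, target)
```

Supervising every side output gives each decoder stage its own training signal, while the fused term trains the map that is actually used at inference time.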
