[Paper notes] TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
2022-07-06 18:52:00 【come from γ Saiya of stars】
Statement
Paper notes are updated from time to time and written to be easy to follow, so that beginners can understand them too.
Scope: deep learning, including CV, NLP, Data Fusion, and Digital Twin.
Paper title:
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Paper link: https://arxiv.org/abs/2102.04306
Paper code: https://github.com/Beckschen/TransUNet
Published: February 2021
Innovation points
1. Combines the Transformer with the U-Net architecture to build the TransUNet network.
Abstract
Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning. Across medical image segmentation tasks, the U-shaped architecture (also known as U-Net) has become the de facto standard and has been highly successful. However, because convolution is inherently local, U-Net is usually limited in explicitly modeling long-range dependencies. Transformers, designed for sequence-to-sequence prediction, have emerged as an alternative architecture with an innate global self-attention mechanism, but their lack of low-level detail can limit their localization ability.
The paper proposes TransUNet as a strong alternative for medical image segmentation that has the advantages of both Transformers and U-Net. On one hand, the Transformer encodes tokenized image patches from a convolutional neural network (CNN) feature map as the input sequence, which is used to extract global context. On the other hand, the decoder upsamples the encoded features and then combines them with the high-resolution CNN feature maps for precise localization.
The authors argue that Transformers can serve as strong encoders for medical image segmentation tasks, and that combining them with U-Net enhances finer details by recovering local spatial information. TransUNet achieves better performance than various competing methods in medical applications such as multi-organ segmentation and cardiac segmentation. Code and models are available at https://github.com/beckschen/transunet.
Method
First, the input image is downsampled through three convolutional stages, and the resulting feature map is flattened into a sequence of tokens.
Next, the flattened features pass through a 12-layer Transformer. Each layer consists of MSA (multi-head self-attention) followed by an MLP (fully connected layers), and then produces its output.
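The flatten step and the attention step can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the helper names `flatten_feature_map` and `self_attention` are hypothetical, the feature map is a toy nested list, and the attention uses a single head with identity projections (an assumption made for brevity; a real MSA layer learns separate query, key, and value projections per head and is followed by an MLP).

```python
import math

def flatten_feature_map(fmap):
    """Turn a C x H x W nested list into H*W tokens, each of length C."""
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), which keeps the sketch short.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for query in tokens:
        scores = [scale * sum(a * b for a, b in zip(query, key)) for key in tokens]
        weights.append(softmax(scores))
    # Every output token is a weighted mix of all tokens (global context).
    outputs = [
        [sum(w * value[d] for w, value in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights

# Toy 2-channel, 2x2 feature map standing in for the CNN output.
feature_map = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
tokens = flatten_feature_map(feature_map)
attended, attn_weights = self_attention(tokens)
```

Each spatial location of the feature map becomes one token, and every row of `attn_weights` spans all tokens, which is where the global view comes from.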
Why apply convolution first and the Transformer second?
The Transformer's drawbacks are its heavy computation and its lack of local spatial information; its advantage is access to global information.
Convolution's drawback is that it cannot integrate global information; its advantages are that it has fewer parameters and preserves local spatial information, with different kernels giving different receptive fields.
The authors therefore place the convolutions in front of the Transformer so that each offsets the other's weaknesses: the parameter count drops, and the features carry both spatial and global information.
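A rough calculation shows why running the convolutions first eases the Transformer's computation: the self-attention score matrix has one entry per pair of tokens, so its size grows with the square of the sequence length. The numbers below are assumptions for illustration only: a 224×224 input, and a total downsampling factor of 16, chosen to be consistent with the four 2× upsampling steps described for the decoder. The function names are hypothetical.

```python
def token_count(image_size, downsample_factor):
    """Number of tokens when each feature-map location becomes one token."""
    side = image_size // downsample_factor
    return side * side

def attention_pairs(num_tokens):
    """Entries in the self-attention score matrix (one per token pair)."""
    return num_tokens * num_tokens

IMAGE_SIZE = 224         # hypothetical input resolution
DOWNSAMPLE_FACTOR = 16   # hypothetical total CNN downsampling

# One token per raw pixel versus one token per feature-map location.
pixel_tokens = token_count(IMAGE_SIZE, 1)
feature_tokens = token_count(IMAGE_SIZE, DOWNSAMPLE_FACTOR)

# How many times smaller the score matrix becomes after the CNN.
reduction = attention_pairs(pixel_tokens) // attention_pairs(feature_tokens)
```

Under these assumptions the sequence shrinks from 50,176 tokens to 196, and the attention score matrix becomes 65,536 times smaller.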
Finally, the decoder is the same as in U-Net: after a reshape, the features are upsampled four times and concatenated with the features from the encoder's three downsampling stages, and the segmentation map is output at the end.
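One decoder step, upsampling followed by concatenation with an encoder skip feature, can be sketched as below. This is a simplified illustration under stated assumptions: the helpers `upsample2x`, `concat_channels`, and `shape` are hypothetical, the upsampling is nearest-neighbour (the real model may use a different interpolation), and the convolutions that normally follow the concatenation are omitted.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a C x H x W nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row
        out.append(rows)
    return out

def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis."""
    if (len(a[0]), len(a[0][0])) != (len(b[0]), len(b[0][0])):
        raise ValueError("spatial sizes must match before concatenation")
    return a + b

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

# Toy tensors: a 1-channel 1x2 decoder feature and a 1-channel 2x4 skip feature.
decoder_feature = [[[1.0, 2.0]]]
skip_feature = [[[0.1, 0.2, 0.3, 0.4],
                 [0.5, 0.6, 0.7, 0.8]]]

upsampled = upsample2x(decoder_feature)            # now 1 x 2 x 4
fused = concat_channels(upsampled, skip_feature)   # now 2 x 2 x 4
```

The upsampled feature carries the Transformer's global context, while the skip feature restores the high-resolution detail lost during downsampling; concatenating them lets later layers use both.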
Experiments
Experiment goal: compare different encoder and decoder structures on different datasets.
Result: TransUNet performs best.
Experiment goal: visualize the segmentation results.
Experiment goal: compare different frameworks.
Result: TransUNet has a clear advantage.
Final thoughts
Transformers require large datasets, but medical datasets are hard to collect. This may be one of the problems limiting the development of Transformers in the medical field!