[Paper notes] TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
2022-07-06 18:52:00 【come from γ Saiya of stars】
Statement
These paper notes are updated from time to time and are written to be easy to follow, so that beginners can understand them too.
Scope: deep learning, including CV, NLP, Data Fusion, and Digital Twin.
Paper title:
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation
Paper link: https://arxiv.org/abs/2102.04306
Paper code: https://github.com/Beckschen/TransUNet
Publication date: February 2021
Innovation points
1. Combines the Transformer with the U-Net architecture to build the TransUNet network.
Abstract
Medical image segmentation is an essential prerequisite for developing healthcare systems, especially for disease diagnosis and treatment planning. On various medical image segmentation tasks, the U-shaped architecture (also known as U-Net) has become the de facto standard and has achieved tremendous success. However, due to the intrinsic locality of convolution operations, U-Net generally shows limitations in explicitly modeling long-range dependencies. Transformers, designed for sequence-to-sequence prediction, have emerged as alternative architectures with an innate global self-attention mechanism, but they can have limited localization ability because they lack low-level details.
This paper proposes TransUNet, which has the merits of both Transformers and U-Net, as a strong alternative for medical image segmentation. On the one hand, the Transformer encodes tokenized image patches from a convolutional neural network (CNN) feature map as the input sequence for extracting global context. On the other hand, the decoder upsamples the encoded features, which are then combined with the high-resolution CNN feature maps to enable precise localization.
The authors argue that Transformers can serve as strong encoders for medical image segmentation tasks, with U-Net enhancing finer details by recovering localized spatial information. TransUNet achieves better performance than various competing methods on medical applications including multi-organ segmentation and cardiac segmentation. Code and models are available at https://github.com/beckschen/transunet.
Method
First, the input image passes through three stages of convolution with downsampling, and the resulting feature map is flattened into a sequence of tokens.
Then, the flattened features enter a 12-layer Transformer. Each Transformer layer consists of MSA (multi-head self-attention) followed by an MLP (fully connected layers), and the output of the last layer is the encoded feature sequence.
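As a rough illustration of the flatten step, the sketch below turns a feature map into a token sequence using plain Python lists. The function name `flatten_feature_map` and the tiny 2-channel, 2x2 example are hypothetical and chosen only for readability; the official repository works on framework tensors, and the 12 Transformer layers that follow are not modeled here.

```python
def flatten_feature_map(fmap):
    """Turn a [C][H][W] feature map into H*W tokens of length C (row-major order)."""
    channels = len(fmap)
    height = len(fmap[0])
    width = len(fmap[0][0])
    # Each spatial position (y, x) becomes one token holding its C channel values.
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


# Hypothetical toy feature map: 2 channels, 2x2 spatial grid.
fmap = [
    [[1, 2], [3, 4]],
    [[10, 20], [30, 40]],
]
tokens = flatten_feature_map(fmap)
print(tokens)  # [[1, 10], [2, 20], [3, 30], [4, 40]]
```

Each token is what the Transformer treats as one element of its input sequence, so the sequence length equals the number of spatial positions left after downsampling.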
Why apply convolution first and the Transformer second?
The Transformer's drawbacks are its heavy computation and its lack of built-in spatial information; its advantage is that it captures global information.
Convolution's drawback is that it cannot aggregate global information; its advantages are that it leaves fewer parameters after convolution, it retains local spatial information, and different convolution kernels give different receptive fields.
Therefore, the authors place the convolutions in front of the Transformer to combine the strengths of both and offset their weaknesses: the parameter count is reduced, and the features carry both spatial and global information.
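To make the cost argument concrete, here is a back-of-the-envelope sketch that counts how many pairwise attention scores one self-attention layer computes. The 224x224 input size and the overall downsampling factor of 16 are assumptions made for illustration, and `attention_pairs` is a hypothetical helper; the count is only a proxy for compute, not a measurement of the actual model.

```python
def attention_pairs(height, width, stride):
    """Return (number of tokens, number of pairwise attention scores) for one layer."""
    tokens = (height // stride) * (width // stride)
    # Self-attention compares every token with every token, so the score matrix is N x N.
    return tokens, tokens * tokens


# Assumed example: tokens taken per pixel versus per position of a 16x-downsampled map.
pixel_tokens, pixel_pairs = attention_pairs(224, 224, 1)
feat_tokens, feat_pairs = attention_pairs(224, 224, 16)

print(pixel_tokens, pixel_pairs)   # 50176 2517630976
print(feat_tokens, feat_pairs)     # 196 38416
print(pixel_pairs // feat_pairs)   # 65536
```

Under these assumptions, shrinking the spatial grid before the Transformer cuts the attention score matrix by a factor of 16^4, which is why running convolutions first makes the global attention affordable.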
Finally, the decoder is the same as in U-Net: after a reshape back into a feature map, the features are upsampled four times and concatenated with the features from the encoder's three downsampling stages, and the segmentation map is output at the end.
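One decoder step of this kind (upsample, then concatenate with an encoder feature map) can be sketched as follows. This is a simplified illustration, not the repository's implementation: it assumes each step doubles the resolution, uses nearest-neighbour upsampling on nested lists, and omits the convolutions that would normally follow the concatenation. The names `upsample2x`, `concat_channels`, and `decoder_step` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested list."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat every column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat every row twice
        out.append(rows)
    return out


def concat_channels(a, b):
    """Concatenate two [C][H][W] maps along the channel axis."""
    same_size = len(a[0]) == len(b[0]) and len(a[0][0]) == len(b[0][0])
    assert same_size, "spatial sizes must match before concatenation"
    return a + b


def decoder_step(decoded, skip):
    """Upsample the decoder features and attach the encoder (skip) features."""
    return concat_channels(upsample2x(decoded), skip)


# Hypothetical toy inputs: a 1-channel 2x2 decoder map and a 1-channel 4x4 skip map.
decoded = [[[1, 2], [3, 4]]]
skip = [[[0] * 4 for _ in range(4)]]
out = decoder_step(decoded, skip)
print(len(out), len(out[0]), len(out[0][0]))  # 2 4 4
print(out[0])  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

The concatenation is what lets high-resolution encoder features supply the local detail that the Transformer output lacks.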
Experiments
Experiment goal: compare different encoder-decoder structures on different datasets.
Result: TransUNet achieves the best results.
Experiment goal: visualize the segmentation results.
Experiment goal: compare different frameworks.
Result: TransUNet has a clear advantage.
Closing remarks
Transformers require large datasets, but medical datasets are hard to collect. This may be one of the factors limiting the adoption of Transformers in the medical field.