14、Transformer--VIT TNT DETR
2022-07-05 20:29:00 【C--G】
ViT–Vision Transformer
ViT architecture diagram
ViT is used for image classification and relies only on the Transformer encoder. The image is divided into nine patches; each patch is flattened into a one-dimensional vector, position encoding is added, and the resulting sequence is fed into the encoder. Besides the 9 patch tokens, the encoder input contains token 0. Inside the encoder, token 0 interacts with the other 9 position tokens and integrates their feature information, so only token 0 is needed for the final prediction. It is followed by the MLP Head, which performs the classification.
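The input construction described above can be sketched in plain Python. This is a minimal illustration rather than the code from the linked repository: the names `split_into_patches` and `build_token_sequence` are hypothetical, the learned linear projection of each patch is omitted, and the position encodings are made-up numbers instead of learned parameters.

```python
def split_into_patches(image, grid=3):
    """Split a square image (list of rows) into grid*grid flattened patches."""
    size = len(image)
    assert size % grid == 0, "image side must be divisible by the grid size"
    p = size // grid  # side length of one patch
    patches = []
    for gy in range(grid):
        for gx in range(grid):
            # Flatten the p x p patch row by row into a one-dimensional list.
            patch = [image[gy * p + y][gx * p + x]
                     for y in range(p) for x in range(p)]
            patches.append(patch)
    return patches


def build_token_sequence(patches, cls_token, pos_embed):
    """Prepend token 0 and add one position encoding per token."""
    tokens = [cls_token] + patches
    assert len(tokens) == len(pos_embed)
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(tokens, pos_embed)]


# Toy 6x6 single-channel image, so each of the 9 patches is 2x2 (4 values).
image = [[r * 6 + c for c in range(6)] for r in range(6)]
patches = split_into_patches(image)
cls_token = [0.0] * 4                            # assumption: zeros stand in for a learned vector
pos_embed = [[0.1 * i] * 4 for i in range(10)]   # assumption: stand-in position encodings
tokens = build_token_sequence(patches, cls_token, pos_embed)
print(len(patches), len(tokens))  # 9 10
```

The encoder would then process all 10 tokens, and only the output at index 0 would be passed to the MLP Head.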
- Problems with CNNs
- Advantages of the Transformer
- The formula
- ViT pattern
- Position encoding
- Results analysis
- Code link
https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_classification/vision_transformer
TNT-Transformer in Transformer
- Basic composition
- Sequence construction
- Basic calculation
- Position encoding
- PatchEmbedding visualization
DETR
Object detection
The basic idea
Predict 100 bounding boxes in parallel; a box that contains no object is treated as background.
Network architecture
A CNN extracts the feature map, which is flattened into one dimension, and positional encoding supplies the position information. Unlike ViT, DETR has no token 0. Unlike the traditional Transformer decoder, DETR uses object queries, which determine how many boxes are generated at once. Each query is matched against the encoder output in parallel, and the prediction heads then determine whether it is a target box.
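The last step, where the prediction heads decide whether each query is a target box or background, can be sketched as follows. This is a simplified illustration under stated assumptions: `select_detections` is a hypothetical helper, the logits and boxes are made-up values, and the background ("no object") class is assumed to be the last class index.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_detections(class_logits, boxes, background_index):
    """Keep the queries whose most likely class is not background."""
    detections = []
    for logits, box in zip(class_logits, boxes):
        probs = softmax(logits)
        label = max(range(len(probs)), key=probs.__getitem__)
        if label != background_index:
            detections.append((label, probs[label], box))
    return detections


# Three queries instead of 100, two object classes plus background (index 2).
class_logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0], [0.0, 4.0, 0.0]]
boxes = [[0.2, 0.2, 0.1, 0.1], [0.5, 0.5, 0.4, 0.4], [0.7, 0.3, 0.2, 0.2]]
detections = select_detections(class_logits, boxes, background_index=2)
for label, score, box in detections:
    print(label, round(score, 3), box)
```

In this toy input the second query is dropped because its highest score is the background class; the other two are kept as detections.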
The task of the encoder
The attention the encoder places on the targets gives a better result than the CNN feature map alone, which helps the decoder identify targets quickly. As shown in the figure, the encoder can also distinguish objects well under occlusion.
Network architecture
Output matching
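As a rough illustration of output matching, the sketch below assigns predicted boxes to ground-truth boxes one-to-one so that the total cost is minimal. It rests on several assumptions: DETR solves this assignment with the Hungarian algorithm and a cost that also includes class probability and GIoU, whereas this sketch uses brute-force search and an L1 box distance only, so it is suitable for tiny examples. The names `l1_cost` and `match_predictions` are hypothetical.

```python
from itertools import permutations


def l1_cost(pred_box, gt_box):
    """L1 distance between two boxes given as equal-length coordinate lists."""
    return sum(abs(p - g) for p, g in zip(pred_box, gt_box))


def match_predictions(pred_boxes, gt_boxes):
    """Return (pred_index, gt_index) pairs with minimal total L1 cost.

    Brute force over all assignments; assumes len(pred_boxes) >= len(gt_boxes).
    """
    best_cost, best_pairs = float("inf"), []
    for pred_indices in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(l1_cost(pred_boxes[p], gt_boxes[g])
                   for g, p in enumerate(pred_indices))
        if cost < best_cost:
            best_cost = cost
            best_pairs = [(p, g) for g, p in enumerate(pred_indices)]
    return best_pairs


# Three predictions, two ground-truth boxes (made-up values).
pred_boxes = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.2, 0.2], [0.5, 0.5, 0.3, 0.3]]
gt_boxes = [[0.1, 0.1, 0.2, 0.2], [0.5, 0.5, 0.3, 0.3]]
pairs = match_predictions(pred_boxes, gt_boxes)
print(pairs)  # [(1, 0), (2, 1)]
```

Prediction 0 is left unmatched here, so during training it would be treated as background.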
The role of attention
Google source code
https://github.com/google-research/bert
Reference material – an expert's blog
https://blog.csdn.net/qq_37774399/article/details/121748163