14. Transformer: ViT, TNT, DETR
2022-07-05 20:29:00 【C--G】
VIT–Vision Transformer
ViT architecture diagram
ViT is used for image classification and uses only the Transformer encoder. The image is divided into nine patches, each patch is flattened into a one-dimensional vector, and position encoding is added before the sequence enters the encoder. The encoder therefore receives the 9 patch tokens plus an extra token 0. Through attention, token 0 interacts with the 9 patch tokens and aggregates their feature information, so only the output for token 0 is needed. It is passed to the MLP head, which performs the classification.
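The token construction described above can be sketched in plain Python. This is a minimal illustration of the shapes only, not the linked implementation: the function names `image_to_patches` and `build_tokens` are hypothetical, the 6x6 image and patch size 2 are chosen so that exactly nine patches result, and the linear projection of each patch is omitted. In a real model the class token and the position vectors are learned; toy values stand in for them here.

```python
def image_to_patches(image, patch_size):
    """Split a square 2-D image (list of rows) into flattened patches, row-major."""
    size = len(image)
    patches = []
    for top in range(0, size, patch_size):
        for left in range(0, size, patch_size):
            patch = [image[r][c]
                     for r in range(top, top + patch_size)
                     for c in range(left, left + patch_size)]
            patches.append(patch)
    return patches


def build_tokens(patches, cls_token, pos_embed):
    """Prepend the class token (token 0) and add one position vector per token."""
    tokens = [cls_token] + patches
    return [[t + p for t, p in zip(tok, pos)]
            for tok, pos in zip(tokens, pos_embed)]


# 6x6 single-channel image, patch size 2 -> 3x3 = 9 patches of 4 values each
image = [[r * 6 + c for c in range(6)] for r in range(6)]
patches = image_to_patches(image, 2)

cls_token = [0.0] * 4                            # assumption: zeros instead of a learned vector
pos_embed = [[0.1 * i] * 4 for i in range(10)]   # assumption: toy values instead of learned ones

tokens = build_tokens(patches, cls_token, pos_embed)
print(len(patches), len(tokens), len(tokens[0]))
```

The encoder would consume `tokens`, and the classifier would read only the encoder output at index 0.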
- Problems with CNNs
- Advantages of the Transformer
- Formulas
- ViT pattern
- Position encoding
- Analysis of results
- Code link
https://github.com/WZMIAOMIAO/deep-learning-for-image-processing/tree/master/pytorch_classification/vision_transformer
TNT-Transformer in Transformer
- Basic composition
- Sequence construction
- Basic calculation
- Position encoding
- PatchEmbedding visualization
DETR
Object detection
Basic idea
Predict 100 bounding boxes in parallel. A box that contains no object is treated as background.
Network architecture
A CNN extracts a feature map, which is flattened into a one-dimensional sequence, and positional encoding supplies the position information. Unlike ViT, DETR has no token 0. Unlike the traditional Transformer decoder, which produces its outputs one at a time, DETR generates a fixed number of boxes at once, one per object query. Each query attends to the encoder output in parallel, and the prediction heads then decide whether each box is a target.
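The last step, where every predicted box is either a target or background, can be sketched as follows. This is a hedged illustration rather than the model's actual head: `keep_objects`, the label list, and the three hand-written score rows are all hypothetical, and three boxes stand in for the 100 predicted in parallel. The only idea taken from the text is that "no object" is treated as one more class.

```python
NO_OBJECT = "no object"
LABELS = ["cat", "dog", NO_OBJECT]  # assumption: two toy classes plus background


def keep_objects(class_scores, boxes, labels=LABELS):
    """Keep the boxes whose highest-scoring class is not the background class."""
    kept = []
    for scores, box in zip(class_scores, boxes):
        best = max(range(len(scores)), key=lambda k: scores[k])
        if labels[best] != NO_OBJECT:
            kept.append((labels[best], box))
    return kept


# One score row and one (cx, cy, w, h) box per object query
class_scores = [
    [0.90, 0.05, 0.05],
    [0.10, 0.10, 0.80],   # background wins, so this box is dropped
    [0.20, 0.70, 0.10],
]
boxes = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.1, 0.1, 0.1), (0.8, 0.8, 0.3, 0.3)]

detections = keep_objects(class_scores, boxes)
print(detections)
```

Because every query is scored independently, the loop body has no dependency between iterations, which is what makes the prediction parallel.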
The task of the encoder
The attention produced by the encoder localizes targets better than the CNN feature map does, which helps the decoder identify targets quickly. As shown in the figure, the encoder can also distinguish objects well when they occlude one another.
Network architecture
Output matching
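The post gives no details for this step, so the following is only a sketch of what a one-to-one matching between predicted boxes and ground-truth boxes can look like during training. It assumes a cost made of the L1 distance between boxes alone and searches all assignments by brute force, which is feasible only for tiny inputs. Detection transformers of this kind typically combine a class term with box terms and solve the assignment with the Hungarian algorithm. The names `l1_cost` and `match` are hypothetical.

```python
from itertools import permutations


def l1_cost(box_a, box_b):
    """Sum of absolute coordinate differences between two boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def match(pred_boxes, gt_boxes):
    """Return a tuple where entry j is the index of the prediction matched to ground-truth box j."""
    best, best_cost = None, float("inf")
    # Try every way of assigning a distinct prediction to each ground-truth box
    for perm in permutations(range(len(pred_boxes)), len(gt_boxes)):
        cost = sum(l1_cost(pred_boxes[i], gt) for i, gt in zip(perm, gt_boxes))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best


pred_boxes = [(0.5, 0.5, 0.2, 0.2), (0.1, 0.1, 0.1, 0.1), (0.8, 0.8, 0.3, 0.3)]
gt_boxes = [(0.8, 0.8, 0.3, 0.3), (0.5, 0.5, 0.2, 0.2)]

assignment = match(pred_boxes, gt_boxes)
unmatched = [i for i in range(len(pred_boxes)) if i not in assignment]
print(assignment, unmatched)
```

Predictions left in `unmatched` are the ones that would be trained toward the background class.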
The role of attention
Google source code
https://github.com/google-research/bert
Reference: blog post
https://blog.csdn.net/qq_37774399/article/details/121748163