Introduction to DETR
2022-07-07 11:07:00 【算法之名】
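DETR is a framework for end-to-end object detection with Transformers, published by Facebook at ECCV 2020.
DETR only needs a CNN to extract image features, followed by a Transformer, to predict object bounding boxes and classes. It requires neither non-maximum suppression (NMS) nor an anchor mechanism.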
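The figure above shows the DETR network architecture. DETR extracts image features with a CNN and then uses a Transformer to predict the object bounding boxes. Pairing the predicted boxes with the ground truth is treated as a set prediction problem, solved as a bipartite matching between predictions and ground-truth objects. Any prediction that is not matched to a ground-truth object is assigned to the "no object" class.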
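The matching step described above can be sketched as follows. This is a minimal illustration, not DETR's actual implementation: the cost function (negative class probability plus a weighted L1 box distance), the `box_weight` value, and the dictionary layout of predictions and targets are all assumptions made for this example, and a brute-force search over assignments stands in for the Hungarian algorithm that is normally used for this kind of problem.

```python
from itertools import permutations

NO_OBJECT = 2  # hypothetical class index for "no object" (0 = cat, 1 = dog)

def match_cost(pred, target, box_weight=5.0):
    """Cost of assigning one prediction to one ground-truth object (assumed form)."""
    class_cost = -pred["probs"][target["label"]]
    box_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    return class_cost + box_weight * box_cost

def bipartite_match(preds, targets):
    """Return {prediction index: target index} with minimal total cost.

    Brute force over all one-to-one assignments; only practical for tiny inputs.
    """
    best_cost, best = float("inf"), ()
    for pred_ids in permutations(range(len(preds)), len(targets)):
        cost = sum(match_cost(preds[p], targets[t]) for t, p in enumerate(pred_ids))
        if cost < best_cost:
            best_cost, best = cost, pred_ids
    return {p: t for t, p in enumerate(best)}

def assign_labels(preds, targets):
    """Training label per prediction; unmatched predictions get NO_OBJECT."""
    matching = bipartite_match(preds, targets)
    return [targets[matching[i]]["label"] if i in matching else NO_OBJECT
            for i in range(len(preds))]

# Three predictions, two ground-truth objects; boxes are (cx, cy, w, h).
preds = [
    {"probs": [0.10, 0.80, 0.10], "box": [0.5, 0.5, 0.2, 0.2]},
    {"probs": [0.05, 0.05, 0.90], "box": [0.1, 0.1, 0.1, 0.1]},
    {"probs": [0.70, 0.20, 0.10], "box": [0.3, 0.3, 0.4, 0.4]},
]
targets = [
    {"label": 0, "box": [0.3, 0.3, 0.4, 0.4]},
    {"label": 1, "box": [0.5, 0.5, 0.2, 0.2]},
]
matching = bipartite_match(preds, targets)
labels = assign_labels(preds, targets)
```

Here prediction 2 is paired with the first target and prediction 0 with the second, so prediction 1 is left over and is trained toward the "no object" class. Because every ground-truth object is matched to exactly one prediction, duplicate detections are penalized during training, which is why no NMS step is needed afterwards.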
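The figure above describes the DETR network structure in more detail. The image passes through a CNN to obtain features, a positional encoding is added, and the result is flattened and fed into the transformer encoder. The encoder output is then fed into the transformer decoder, which also takes the object queries as input. The decoder output goes to the prediction heads, where a feed-forward network (FFN) predicts the object class and the bounding box.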
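The "add positional encoding, then flatten" step can be sketched in plain Python. This is a simplified illustration under stated assumptions: the function names are hypothetical, the feature map is a nested list indexed as `features[channel][row][col]`, the channel count is assumed divisible by 4, and a basic 2D sinusoidal encoding (half the channels for the row, half for the column) is used without any coordinate normalization.

```python
import math

def sine_encoding_1d(pos, dim, temperature=10000.0):
    """Sinusoidal encoding of one coordinate into `dim` values (dim must be even)."""
    enc = []
    for i in range(dim // 2):
        freq = temperature ** (2 * i / dim)
        enc.append(math.sin(pos / freq))
        enc.append(math.cos(pos / freq))
    return enc

def positional_encoding_2d(h, w, channels):
    """One vector per (row, col): the first half encodes y, the second half x."""
    half = channels // 2
    return [[sine_encoding_1d(y, half) + sine_encoding_1d(x, half)
             for x in range(w)] for y in range(h)]

def flatten_with_position(features):
    """Turn a C x H x W feature map into H*W tokens of length C, position added."""
    c, h, w = len(features), len(features[0]), len(features[0][0])
    pos = positional_encoding_2d(h, w, c)
    tokens = []
    for y in range(h):          # row-major order: token index = y * w + x
        for x in range(w):
            tokens.append([features[ch][y][x] + pos[y][x][ch] for ch in range(c)])
    return tokens

# A zero-valued 4-channel, 2 x 3 feature map makes the encoding itself visible.
features = [[[0.0] * 3 for _ in range(2)] for _ in range(4)]
tokens = flatten_with_position(features)
```

The result is a sequence of 6 tokens (one per spatial location), each a 4-dimensional vector, which is the sequence form a transformer encoder expects. Flattening discards the 2D layout, so the positional encoding is the only way the encoder can tell where each token came from.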
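The figure above shows the Transformer architecture used in DETR, which consists of an encoder and a decoder. The encoder input is the image features extracted by the CNN plus the positional encoding. It passes through a multi-head self-attention module and then a feed-forward network module, and several such encoder layers can be stacked. The encoder output is then fed into the decoder. The decoder takes object queries, which are learnable positional embeddings, as its input. They pass through a multi-head self-attention module, then through a multi-head cross-attention module between the encoder and decoder, and then through a feed-forward network. Decoder layers can also be stacked. Finally, the decoder output is fed into the FFN prediction heads, which predict the object class and the bounding box.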
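The difference between the decoder's self-attention and its cross-attention can be illustrated with a single-head scaled dot-product attention in plain Python. This is only a sketch: it omits the learned projections, multiple heads, residual connections, layer normalization, and the feed-forward module that a real decoder layer has, and the `memory` and `object_queries` values are made up for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Single-head scaled dot-product attention; returns one vector per query."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Hypothetical encoder output ("memory"): 3 image tokens of dimension 2.
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical object queries: 2 embeddings (learned in the real model).
object_queries = [[0.0, 0.0], [5.0, 0.0]]

# Self-attention: queries, keys and values all come from the object queries.
self_attended = attention(object_queries, object_queries, object_queries)
# Cross-attention: queries come from the decoder, keys and values from the encoder.
decoded = attention(self_attended, memory, memory)
```

The output always has one vector per object query, regardless of how many image tokens the encoder produced. That is why the number of object queries fixes the number of predictions the model emits, each of which the prediction heads then turn into a class and a box.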