Efficient Video Instance Segmentation via Tracklet Query and Proposal
2022-07-26 03:19:00 【Early lunar month in Pingqiu】
Abstract
Video instance segmentation (VIS) aims to simultaneously classify, segment, and track multiple object instances in a video. Current clip-level VIS methods take a short video clip as input and, because they exploit temporal context across multiple frames, clearly outperform frame-level VIS. However, most clip-level methods are neither end-to-end learnable nor real-time. The VIS transformer (VisTR) addresses both problems, but its frame-wise dense attention makes training very slow. In addition, VisTR is not end-to-end learnable across multiple video clips: it needs hand-crafted data association to link instance tracklets between successive clips. This paper proposes EfficientVIS, which is efficient in both training and inference and is end-to-end learnable. The core idea is "tracklet query and tracklet proposal that associate and segment RoIs across space and time by an iterative query-video interaction". The paper further proposes correspondence learning, which makes the linking of tracklets between adjacent clips learnable.
Tracklet Query and Proposal
EfficientVIS uses tracklet queries $\{q_i\}_{i=1}^N$ and tracklet proposals $\{b_i\}_{i=1}^N$ to jointly represent each object instance in a video. A tracklet query $q_i \in \mathbb{R}^{T \times C}$ is a set of embedding vectors with $C$ channels, one per frame of the $T$-frame clip. A tracklet proposal $b_i \in \mathbb{R}^{T \times 4}$ is a space-time box: one rectangular bounding box per frame.
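To make the shapes concrete, here is a minimal sketch of the two representations in plain Python. It only illustrates the data layout ($N$ instances, each with a $T \times C$ query and a $T \times 4$ proposal). The names `Tracklet`, `init_tracklets`, and `shape` are hypothetical, and the initialization (random queries, whole-frame boxes in normalized coordinates) is an assumption for illustration, not necessarily what the paper does.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Tracklet:
    """One object instance across a clip (hypothetical container)."""
    query: List[List[float]]     # T x C: one C-channel embedding per frame
    proposal: List[List[float]]  # T x 4: one (x1, y1, x2, y2) box per frame


def init_tracklets(n: int, t: int, c: int, seed: int = 0) -> List[Tracklet]:
    """Create n tracklets for a clip of t frames with c-channel queries.

    Assumption: queries start as random vectors and proposals start as the
    whole frame in normalized coordinates. This is illustrative only.
    """
    rng = random.Random(seed)
    tracklets = []
    for _ in range(n):
        query = [[rng.gauss(0.0, 1.0) for _ in range(c)] for _ in range(t)]
        proposal = [[0.0, 0.0, 1.0, 1.0] for _ in range(t)]
        tracklets.append(Tracklet(query=query, proposal=proposal))
    return tracklets


def shape(matrix: List[List[float]]) -> Tuple[int, int]:
    """Return (rows, columns) of a rectangular nested list."""
    return (len(matrix), len(matrix[0]))


tracklets = init_tracklets(n=3, t=5, c=8)
print(len(tracklets), shape(tracklets[0].query), shape(tracklets[0].proposal))
# prints: 3 (5, 8) (5, 4)
```

Because query $q_i$ and proposal $b_i$ share the index $i$, the same slot describes the same instance in every frame of the clip, which is why the pair can be stored together.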