Graph Attention Tracking

2022-06-11 06:51:00 A Xuan is going to graduate~

1. Abstract

Siamese-network trackers formulate visual tracking as a similarity-matching problem. Almost all popular Siamese trackers learn similarity through cross-correlation of convolutional features between the target branch and the search branch. However, because the size of the target feature region has to be fixed in advance, these cross-correlation-based methods either retain a lot of harmful background information or lose a lot of foreground information. In addition, global matching between the target and the search region largely ignores the target's structure and part-level information.
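As a rough illustration of the cross-correlation step these trackers share, the sketch below slides a template feature kernel over a search feature map and reads the target position off the peak of the response map. The function name `cross_correlate`, the toy sizes and the planted pattern are assumptions made for this example; real trackers run this as a batched convolution over learned features.

```python
import numpy as np

def cross_correlate(search, kernel):
    """Slide `kernel` (C, m, m) over `search` (C, H, W).

    Returns a response map of shape (H - m + 1, W - m + 1) whose entries are
    the inner products between the kernel and each search window.
    """
    _, kh, kw = kernel.shape
    _, h, w = search.shape
    response = np.empty((h - kh + 1, w - kw + 1))
    for y in range(response.shape[0]):
        for x in range(response.shape[1]):
            response[y, x] = np.sum(search[:, y:y + kh, x:x + kw] * kernel)
    return response

# Toy data (hypothetical): a 2-channel 3x3 pattern planted in an empty 9x9 map.
pattern = np.arange(1, 19, dtype=float).reshape(2, 3, 3)
search = np.zeros((2, 9, 9))
search[:, 3:6, 2:5] = pattern  # target sits at row 3, column 2

response = cross_correlate(search, pattern)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
print(response.shape, peak)  # (7, 7) (3, 2)
```

The whole template is compared with each window as a single block, which is the global matching that the abstract criticizes.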

To address these problems, the paper proposes a simple target-aware Siamese graph attention network for general object tracking. It builds a complete bipartite graph to establish part-to-part correspondence between the target and the search region, and applies a graph attention mechanism to propagate target information from the template features to the search features. The authors also study a target-aware region selection mechanism that adapts to the size and aspect ratio of different objects, instead of selecting the template feature region with a pre-fixed crop. In experiments on challenging benchmarks including GOT-10k, UAV123, OTB-100 and LaSOT, the proposed SiamGAT outperforms many state-of-the-art trackers and achieves leading performance. The code is available at https://git.io/SiamGAT.

2. Problem statement

(1) The question:

How to embed the information of the two branches to obtain the response map is a key issue, because the information passed from the template to the search region is critical for localizing the object precisely.

(2) Problems with current cross-correlation methods:

1) The size of the convolution kernel is fixed in advance. The common practice is to crop an m×m region from the center of the template feature map and use it as the target feature, which then serves as the convolution kernel. However, when the tracker has to handle targets with different scales or aspect ratios, this pre-fixed feature region may retain a large amount of background information or lose a large amount of foreground information, which makes the information embedding inaccurate.

2) The similarity between the target feature and the search region is computed over the target as a whole. However, large rotations, pose changes and severe occlusion occur frequently during tracking, and global matching is not robust for such deformable targets.

3) As a consequence of 2), information embedding between the template and the search region is a global propagation process. The target information that the template passes to the search region is limited because it is compressed too heavily.

The paper's main observations are as follows:

1) Information embedding should be target-aware, that is, it should adapt to changes in the target's size and aspect ratio during tracking.

2) Information embedding should be achieved by learning part-level relations instead of global matching, because part features are invariant to changes in shape and pose and are therefore more robust. (My reading of "part-level relations": information embedding should match parts against parts rather than match the target globally.)
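Problem 1) can be made concrete with a small calculation. The sketch below measures, for a fixed m×m center crop, how much of the target's foreground the crop keeps and how much of the crop is background. The 15×15 feature map, the 7×7 crop and the two example boxes are assumptions chosen for illustration, not values taken from the post.

```python
import numpy as np

def crop_stats(box, fmap=15, m=7):
    """Foreground kept and background share for a fixed m x m center crop.

    `box` is (x0, y0, x1, y1) in feature-map cells, half-open, and marks the
    target inside an fmap x fmap template feature map.
    """
    x0, y0, x1, y1 = box
    foreground = np.zeros((fmap, fmap), dtype=bool)
    foreground[y0:y1, x0:x1] = True
    start = (fmap - m) // 2
    crop = foreground[start:start + m, start:start + m]
    kept = crop.sum() / foreground.sum()       # share of the target inside the crop
    background = 1 - crop.sum() / crop.size    # share of the crop that is not target
    return float(kept), float(background)

wide_kept, wide_bg = crop_stats((1, 6, 14, 9))    # wide 13x3 target
small_kept, small_bg = crop_stats((6, 6, 9, 9))   # small 3x3 target
print(f"wide:  kept {wide_kept:.2f}, background {wide_bg:.2f}")    # kept 0.54, background 0.57
print(f"small: kept {small_kept:.2f}, background {small_bg:.2f}")  # kept 1.00, background 0.82
```

With these toy boxes, the wide target loses almost half of its foreground, while the small target is kept whole but the kernel is mostly background.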

(3) The proposed solution:

To solve the problems above, the authors use a graph attention network to design a part-to-part information embedding network for object tracking.
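A minimal sketch of what part-to-part information embedding could look like is given below. Each feature cell is treated as a graph node, every search node is connected to every template node (a complete bipartite graph), and each search node aggregates template features with softmax attention weights. The dot-product scoring, the shared value projection `w_v`, the concatenate-then-ReLU fusion and all sizes are assumptions for this example, since the post does not give the equations, and the weights here are random rather than learned.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_attention(search_nodes, template_nodes, w_s, w_t, w_v):
    """Propagate template information to search nodes.

    search_nodes: (Ns, C), template_nodes: (Nt, C), weights: (C, D).
    Returns fused search features (Ns, 2 * D) and attention weights (Ns, Nt).
    """
    queries = search_nodes @ w_s               # (Ns, D)
    keys = template_nodes @ w_t                # (Nt, D)
    scores = queries @ keys.T                  # one score per bipartite edge
    attn = softmax(scores, axis=1)             # normalize over template nodes
    aggregated = attn @ (template_nodes @ w_v) # (Ns, D) target information
    fused = np.concatenate([aggregated, search_nodes @ w_v], axis=1)
    return np.maximum(fused, 0.0), attn        # ReLU on the fused features

rng = np.random.default_rng(0)
C, D = 8, 4                                     # hypothetical channel sizes
search_nodes = rng.standard_normal((25, C))     # e.g. a 5x5 search feature map
template_nodes = rng.standard_normal((6, C))    # e.g. 6 target cells
w_s, w_t, w_v = (rng.standard_normal((C, D)) for _ in range(3))

fused, attn = graph_attention(search_nodes, template_nodes, w_s, w_t, w_v)
print(fused.shape, attn.shape)  # (25, 8) (25, 6)
```

Because the attention weights are computed per node pair, the number of template nodes does not have to be fixed, unlike a convolution kernel of size m×m.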

3. Main contributions

(1) A graph attention module (GAM) is proposed to perform part-to-part matching for information embedding. Compared with traditional cross-correlation-based methods, it largely removes their shortcomings and effectively transfers target information from the template to the search region.

(2) A target-aware Siamese Graph Attention Tracking (SiamGAT) network is proposed, which uses the GAM for general object tracking. The framework is simple and effective. Compared with previous work that uses pre-fixed global feature matching, the model adapts to the size and aspect ratio of different objects.
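One way to picture the target-aware region selection in contribution (2) is to keep only the template feature cells that fall inside the target's bounding box, so the number of template nodes follows the object's size and aspect ratio. The sketch below is an assumption-laden illustration: the function name `select_target_nodes`, the cell-center test, the stride of 8 and the 13×13 feature map are all hypothetical, and the paper's actual mechanism may differ in detail.

```python
import numpy as np

def select_target_nodes(template_feat, box, stride):
    """Pick template feature cells whose centers lie inside the target box.

    template_feat: (C, H, W) feature map of the template image.
    box: (x0, y0, x1, y1) in image pixels, half-open.
    stride: image pixels per feature cell (padding effects are ignored).
    Returns node features (N, C) and the boolean selection mask (H, W).
    """
    _, h, w = template_feat.shape
    ys = (np.arange(h) + 0.5) * stride          # cell center rows in pixels
    xs = (np.arange(w) + 0.5) * stride          # cell center columns in pixels
    x0, y0, x1, y1 = box
    inside = ((ys[:, None] >= y0) & (ys[:, None] < y1) &
              (xs[None, :] >= x0) & (xs[None, :] < x1))
    return template_feat[:, inside].T, inside

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 13, 13))         # hypothetical template features

wide_nodes, _ = select_target_nodes(feat, (16, 40, 88, 64), stride=8)
small_nodes, _ = select_target_nodes(feat, (40, 40, 64, 64), stride=8)
print(wide_nodes.shape, small_nodes.shape)      # (27, 8) (9, 8)
```

The wide box yields 27 nodes and the small box 9, whereas a pre-fixed m×m crop would always yield m² cells regardless of the object.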

Copyright notice
This article was written by [A Xuan is going to graduate~]. Please include the original link when reposting. Thanks.
https://yzsam.com/2022/03/202203020525262294.html