
SFOD: source-free domain adaptation upgraded and optimized, making detection models easier to adapt to new data

2022-06-23 01:03:00 Zhiyuan community

Paper: https://openaccess.thecvf.com/content/CVPR2022/papers/Li_Source-Free_Object_Detection_by_Learning_To_Overlook_Domain_Style_CVPR_2022_paper.pdf


01

Technical brief

Source-free object detection (SFOD) adapts a detector pre-trained on a labeled source domain to a target domain, using only unlabeled training data from the target domain. Existing SFOD methods usually follow the pseudo-label paradigm, in which model adaptation alternates between predicting pseudo-labels and fine-tuning the model.

Because of domain shift and the limited use of target-domain training data, the accuracy of the pseudo-labels produced this way is unsatisfactory, as shown in the figure above.

Today's post covers a novel method proposed by the researchers, Learning to Overlook Domain Style (LODS), which addresses these limitations in a principled way. Their idea is to reduce the effect of domain shift by forcing the model to overlook the target-domain style, which simplifies model adaptation.

To this end, the style of each target-domain image is enhanced, and the style difference between the original image and the enhanced image serves as a self-supervised signal for model adaptation. Treating the enhanced image as an auxiliary view, a student-teacher architecture learns to overlook the style difference from the original image. The method also features a novel style enhancement algorithm and graph alignment constraints. Extensive experiments show that LODS achieves new state-of-the-art performance on four benchmarks.

 

02

Background Overview

So far there has been little research on the SFOD problem; the community has paid more attention to source-free domain adaptation (SFDA). SFDA methods fall roughly into two categories.

The first is based on sample generation:

Because the source data is inaccessible, traditional domain adaptation techniques do not apply. These methods instead generate labeled images in the source-domain or target-domain style, or labeled features that follow the source distribution. Their success hinges on satisfactory sample generation, which is itself challenging and not yet well solved.

The second uses self-training with pseudo-labels:

Reliable labels are hard to obtain, especially when the domain gap is large, so only samples labeled with high confidence are used during self-training.
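The high-confidence selection step described above can be sketched as a simple score filter. This is only an illustration: the `Detection` type, the `select_pseudo_labels` helper, and the 0.8 threshold are assumptions made for the example and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detector output."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    label: int
    score: float  # detector confidence in [0, 1]


def select_pseudo_labels(detections: List[Detection],
                         threshold: float = 0.8) -> List[Detection]:
    """Keep only detections confident enough to serve as pseudo-labels."""
    return [d for d in detections if d.score >= threshold]


# Toy predictions on one unlabeled target image.
preds = [
    Detection((10, 20, 50, 80), label=1, score=0.95),
    Detection((5, 5, 30, 30), label=2, score=0.40),
    Detection((60, 10, 90, 70), label=1, score=0.81),
]
pseudo = select_pseudo_labels(preds, threshold=0.8)
print([(d.label, d.score) for d in pseudo])
```

A higher threshold gives cleaner but fewer pseudo-labels, which is the trade-off that makes this paradigm fragile under a large domain gap.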

Clearly, the target-domain style (for example, imaging characteristics) contributes greatly to the shift of the target domain relative to the source domain. Minimizing the influence of the target-domain style on the model's behavior therefore reduces domain shift directly and effectively. On this basis, a new domain adaptation method, LODS, is proposed, as shown in the figure at the beginning.

It first enhances the target-domain style of each target image while retaining the image in its original style. This yields an auxiliary view built from the style-enhanced images. With this auxiliary view, the method trains the object detector to overlook the target-domain style, using a student-teacher framework.
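Student-teacher frameworks of this kind commonly keep the teacher as an exponential moving average (EMA) of the student, as in Mean Teacher. The article does not state the update rule LODS uses, so the sketch below rests on that assumption; the `ema_update` helper, the dictionary-of-floats parameters, and the momentum value are all hypothetical.

```python
from typing import Dict


def ema_update(teacher: Dict[str, float],
               student: Dict[str, float],
               momentum: float = 0.99) -> Dict[str, float]:
    """Move each teacher parameter a small step toward the student's value."""
    return {
        name: momentum * teacher[name] + (1.0 - momentum) * student[name]
        for name in teacher
    }


# Toy scalar "parameters"; a real model would hold tensors instead.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 2.0, "b": 1.0}
teacher = ema_update(teacher, student, momentum=0.9)
print(teacher)
```

With a momentum close to 1 the teacher changes slowly, which tends to make its predictions more stable than the student's.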

 

03

New framework analysis

The proposed LODS method consists of two parts, as shown in the figure below: a style enhancement module and an overlooking style module.

The style enhancement module (figure (a) above) first extracts the style of each image, namely its channel-wise mean and variance. For a given image, the enhanced target-domain style is computed as a nonlinear combination of its own style and the style of an arbitrary target image. The image is then enhanced by replacing its original style with the enhanced one. By treating the style-enhanced images as another domain, the Mean-Teacher framework can exploit the style difference for model adaptation (figure (b)). The target image and its style-enhanced version are fed to the teacher and student models, respectively. Both models are based on Faster-RCNN and are initialized from the pre-trained source model. Class-wise instance-level alignment and image-level alignment based on graph matching are designed to help the teacher and student learn from each other. Pseudo-labels are also used to increase the discriminability of the student model.
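The style extraction and replacement steps can be sketched on a toy feature map as follows. This is a minimal sketch, not the paper's implementation: features are plain Python lists, and `mix_styles` uses a simple linear interpolation as a stand-in for the learned nonlinear combination. All function names and the `alpha` parameter are assumptions.

```python
import math
from typing import List, Tuple

Channel = List[float]        # one feature channel, flattened over H*W
FeatureMap = List[Channel]   # C channels
Style = Tuple[List[float], List[float]]  # per-channel (means, stds)


def channel_stats(feat: FeatureMap, eps: float = 1e-5) -> Style:
    """Return the per-channel mean and standard deviation (the 'style')."""
    means, stds = [], []
    for ch in feat:
        m = sum(ch) / len(ch)
        var = sum((v - m) ** 2 for v in ch) / len(ch)
        means.append(m)
        stds.append(math.sqrt(var + eps))
    return means, stds


def mix_styles(own: Style, other: Style, alpha: float = 0.5) -> Style:
    """Stand-in: linear interpolation, NOT the paper's learned combination."""
    means = [(1 - alpha) * a + alpha * b for a, b in zip(own[0], other[0])]
    stds = [(1 - alpha) * a + alpha * b for a, b in zip(own[1], other[1])]
    return means, stds


def replace_style(feat: FeatureMap, new_style: Style,
                  eps: float = 1e-5) -> FeatureMap:
    """Normalize each channel, then rescale it to the new statistics."""
    means, stds = channel_stats(feat, eps)
    new_means, new_stds = new_style
    return [
        [(v - m) / s * ns + nm for v in ch]
        for ch, m, s, nm, ns in zip(feat, means, stds, new_means, new_stds)
    ]


# Two toy feature maps with 2 channels of 4 values each.
image_feat = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0]]
other_feat = [[0.0, 0.0, 4.0, 4.0], [1.0, 3.0, 5.0, 7.0]]

new_means, new_stds = mix_styles(channel_stats(image_feat),
                                 channel_stats(other_feat), alpha=0.5)
enhanced = replace_style(image_feat, (new_means, new_stds))
print(channel_stats(enhanced))
```

After replacement, the channel statistics of `enhanced` match the mixed style up to the small `eps` term, while the spatial pattern within each channel is unchanged.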

Based on the above inference, the researchers designed the style enhancement module shown in the figure above. Two networks, F1 and F2, are designed to approximate δ1 and δ2, respectively. Each consists of two fully connected layers and one ReLU layer, making it nonlinear with a minimal number of parameters. The feature encoder E is taken from a pre-trained VGG-16 model and is fixed during training and testing. The decoder D is the inverse of the encoder. Because style consistency is constrained by low-level features, the encoder E = E2 ∘ E1 is further divided into two parts, E1 and E2, where ∘ is the function composition operator. The same holds for the decoder D = D2 ∘ D1, with parts D1 and D2. Specifically, the first ReLU layer after the first downsampling is the dividing line for splitting E, and D is split symmetrically to E.
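As a rough illustration of the two ideas in this paragraph, the sketch below builds a network with two fully connected layers and one ReLU, and composes two stages as E = E2 ∘ E1. The article gives neither the layer sizes nor the inputs of F1 and F2, so the dimensions, the random initialization, and the stand-in functions for E1 and E2 are assumptions made purely for the example.

```python
import random
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def linear(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x: Vector) -> Vector:
    return [max(0.0, v) for v in x]


def make_two_layer_net(in_dim: int, hidden_dim: int, out_dim: int,
                       seed: int = 0) -> Layer:
    """Two fully connected layers with a single ReLU in between."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
          for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
          for _ in range(out_dim)]
    b2 = [0.0] * out_dim

    def forward(x: Vector) -> Vector:
        return linear(relu(linear(x, w1, b1)), w2, b2)

    return forward


def compose(outer: Layer, inner: Layer) -> Layer:
    """Function composition: (outer ∘ inner)(x) = outer(inner(x))."""
    return lambda x: outer(inner(x))


# Hypothetical sizes; the article does not specify them.
F1 = make_two_layer_net(in_dim=4, hidden_dim=8, out_dim=4, seed=1)
F2 = make_two_layer_net(in_dim=4, hidden_dim=8, out_dim=4, seed=2)
out1 = F1([0.5, -0.2, 1.0, 0.3])

# Stand-ins for the two encoder stages (not real VGG-16 layers).
E1: Layer = lambda x: [2.0 * v for v in x]
E2: Layer = lambda x: [v + 1.0 for v in x]
E = compose(E2, E1)
encoded = E([1.0, 2.0])
print(len(out1), encoded)
```

Splitting the encoder this way makes the intermediate output of E1 available, which is where the low-level features sit.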


Copyright notice
This article was written by [Zhiyuan community]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/174/202206230033184486.html