SFOD: source-free domain adaptation, upgraded and optimized, making detection models easier to adapt to new data
2022-06-23 01:03:00 【Zhiyuan community】

Paper: https://openaccess.thecvf.com/content/CVPR2022/papers/Li_Source-Free_Object_Detection_by_Learning_To_Overlook_Domain_Style_CVPR_2022_paper.pdf
01
Technical brief
Source-free object detection (SFOD) adapts a detector pre-trained on a labeled source domain to a target domain, using only unlabeled training data from the target domain. Existing SFOD methods usually follow the pseudo-label paradigm, in which model adaptation alternates between predicting pseudo-labels and fine-tuning the model.

Because of domain shift and the limited use of target-domain training data, the pseudo-labels these methods produce are not accurate enough, as shown in the figure above.
In today's post, the researchers propose a novel Learning to Overlook Domain Style (LODS) method, which addresses these limitations in a principled way. Their idea is to reduce the effect of domain shift by forcing the model to overlook the target-domain style, which simplifies model adaptation.

To this end, the style of each target-domain image is enhanced, and the style difference between the original image and the enhanced image serves as a self-supervised signal for model adaptation. Treating the enhanced image as an auxiliary view, a student-teacher architecture learns to overlook the style difference from the original image. The method also features a novel style enhancement algorithm and graph alignment constraints. Extensive experiments show that LODS achieves new state-of-the-art performance on four benchmarks.
02
Background Overview
At present, there is little research on the SFOD problem. The community has paid more attention to source-free domain adaptation (SFDA). SFDA methods fall roughly into two categories.
The first is based on sample generation:
Because the source data is inaccessible, traditional domain adaptation techniques do not apply. These methods generate labeled images in the source-domain or target-domain style, or labeled features that follow the source distribution. Success hinges on satisfactory sample generation, which is itself challenging and not yet well solved.
The second uses pseudo-labels based on self-training:
Reliable labels are hard to obtain, especially when the domain gap is large, so only samples labeled with high confidence are used during self-training.
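The high-confidence selection step in self-training can be sketched as a simple score filter. This is a minimal illustration, not the procedure of any specific method: the detection dictionaries, the `select_pseudo_labels` helper, and the 0.8 threshold are all assumptions made for the example.

```python
from typing import Dict, List


def select_pseudo_labels(detections: List[Dict], threshold: float = 0.8) -> List[Dict]:
    """Keep only detections whose confidence score is at least `threshold`.

    The threshold value is a hypothetical choice for illustration.
    """
    return [d for d in detections if d["score"] >= threshold]


# Hypothetical detector outputs on one unlabeled target-domain image.
detections = [
    {"box": (10.0, 20.0, 110.0, 220.0), "label": "car", "score": 0.95},
    {"box": (30.0, 40.0, 60.0, 90.0), "label": "person", "score": 0.42},
    {"box": (5.0, 5.0, 50.0, 50.0), "label": "bicycle", "score": 0.81},
]

# Only the confident detections survive as pseudo-labels for fine-tuning.
pseudo_labels = select_pseudo_labels(detections)
print([d["label"] for d in pseudo_labels])
```

A higher threshold gives cleaner but fewer pseudo-labels, which is one reason the accuracy of this paradigm suffers when the domain gap is large.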

Clearly, the target-domain style (for example, imaging characteristics) contributes greatly to the shift of the target domain relative to the source domain. Minimizing the influence of target-domain style on the model's behavior therefore reduces domain shift directly. Building on this, a new domain adaptation method, LODS, is proposed, as shown in the figure at the beginning.
It first enhances the target-domain style of each target image, while the target image in its original style is retained. This builds an auxiliary view from the style-enhanced image. With this auxiliary view, the new method makes the object detector learn to overlook the target-domain style. A student-teacher framework is used to accomplish this.
03
New framework analysis
The proposed LODS method consists of two parts, as shown in the figure below: a style enhancement module and an overlooking style module.

The style enhancement module (figure (a) above) first extracts the style of each image, namely the channel-wise mean and variance. For an image, the enhanced target-domain style is computed as a non-linear combination of its own style and the style of an arbitrary target image. The image's style is then replaced by the enhanced style. Treating the style-enhanced images as another domain, the Mean-Teacher framework can exploit the style difference for model adaptation (figure (b)). The target image and its style-enhanced version are fed to the teacher and student models respectively. Both models are based on Faster-RCNN and are initialized from the pre-trained source model. Class-wise instance-level alignment and image-level alignment based on graph matching are designed to help the teacher and student learn from each other. Pseudo-labels are also used to increase the discriminability of the student model.
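In the Mean-Teacher framework, the teacher is conventionally kept as an exponential moving average (EMA) of the student's weights rather than trained by gradient descent. The article does not give the update rule or decay used in LODS, so the following is only a minimal sketch of the generic EMA step; the parameter dictionaries, the `ema_update` helper, and the decay value are hypothetical.

```python
from typing import Dict


def ema_update(teacher: Dict[str, float], student: Dict[str, float], decay: float = 0.99) -> None:
    """Move each teacher parameter toward the matching student parameter, in place.

    `decay` close to 1.0 makes the teacher change slowly; the value is an assumption.
    """
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value


# Toy scalar "weights" standing in for full network parameter tensors.
teacher = {"w": 1.0, "b": 0.0}
student = {"w": 2.0, "b": 1.0}

# After one student optimization step, the teacher drifts slightly toward it.
ema_update(teacher, student, decay=0.9)
print(teacher)
```

Because the teacher averages many student states, its predictions tend to be more stable, which is why its outputs on the original image are a reasonable target for the student that sees the style-enhanced view.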

Based on the above reasoning, the researchers designed a style enhancement module, as shown in the figure above. Two networks, F1 and F2, are designed to approximate δ1 and δ2 respectively. Each consists of two fully connected layers and one ReLU layer, which provides non-linearity with a minimal number of parameters. The feature encoder E is taken from a pre-trained VGG-16 model and stays fixed during training and testing. The decoder D is the inverse of the encoder. Because style consistency is constrained on low-level features, the encoder E = E2 ◦ E1 is further split into E1 and E2, where ◦ denotes function composition; the decoder D = D2 ◦ D1 is split into D1 and D2 in the same way. Specifically, E is split at the first ReLU layer after the first downsampling, and D is split symmetrically to E.
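The style extraction and replacement steps of the style enhancement module can be sketched on toy data. This is a simplified illustration under several assumptions: features are plain nested lists (one flattened list per channel) instead of encoder outputs, a fixed linear interpolation with weight `alpha` stands in for the learned non-linear combination produced by F1 and F2, and the helper names `channel_style`, `mix_styles`, and `replace_style` are hypothetical.

```python
import math
from typing import List, Tuple

Channel = List[float]               # flattened H*W values of one channel
Style = List[Tuple[float, float]]   # per-channel (mean, standard deviation)


def channel_style(feat: List[Channel], eps: float = 1e-5) -> Style:
    """Extract the style of a feature map as per-channel mean and std."""
    style = []
    for ch in feat:
        mean = sum(ch) / len(ch)
        var = sum((v - mean) ** 2 for v in ch) / len(ch)
        style.append((mean, math.sqrt(var + eps)))
    return style


def mix_styles(own: Style, other: Style, alpha: float = 0.5) -> Style:
    """Stand-in for the learned combination: plain linear interpolation."""
    return [((1 - alpha) * m1 + alpha * m2, (1 - alpha) * s1 + alpha * s2)
            for (m1, s1), (m2, s2) in zip(own, other)]


def replace_style(feat: List[Channel], new_style: Style, eps: float = 1e-5) -> List[Channel]:
    """Normalize each channel with its own statistics, then apply the new ones."""
    old_style = channel_style(feat, eps)
    return [[(v - m_old) / s_old * s_new + m_new for v in ch]
            for ch, (m_old, s_old), (m_new, s_new) in zip(feat, old_style, new_style)]


# Two toy "target images", each with two channels of four values.
feat_a = [[0.0, 2.0, 4.0, 6.0], [1.0, 1.0, 3.0, 3.0]]
feat_b = [[10.0, 10.0, 14.0, 14.0], [0.0, 4.0, 8.0, 12.0]]

enhanced_style = mix_styles(channel_style(feat_a), channel_style(feat_b), alpha=0.5)
feat_enhanced = replace_style(feat_a, enhanced_style)
print(channel_style(feat_enhanced))
```

The content of `feat_a` (the relative pattern of values within each channel) is preserved, while its channel statistics now match the enhanced style. In the method described above, this replacement would act on encoder features and the decoder would map the result back to an image.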