ASGNet paper and code interpretation 2
2022-07-01 03:29:00 【It's seventh uncle】
Paper: Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
Code: ASGNet
Abstract
Prototype learning is widely used in few-shot segmentation. Typically, a single prototype is obtained from the support feature by averaging the global object information. However, using one prototype to represent all the information may lead to ambiguity. In this paper, we propose two new modules, superpixel-guided clustering (SGC) and guided prototype allocation (GPA), for extracting and allocating multiple prototypes. Specifically, SGC is a parameter-free and training-free method that extracts more representative prototypes by aggregating similar feature vectors, while GPA selects matched prototypes to provide more accurate guidance. By combining SGC and GPA, we propose the Adaptive Superpixel-guided Network (ASGNet), a lightweight model that adapts to variations in object size and shape. In addition, our network extends easily to k-shot segmentation, with significant improvement and no additional computational cost. In particular, our evaluation on the COCO dataset shows that ASGNet surpasses the state-of-the-art method by 5% in 5-shot segmentation.
Introduction: existing problems and solutions
Current few-shot segmentation networks usually extract features from both the query image and the support image, and then propose different methods for feature matching and for transferring the object mask from the support image to the query image. Feature matching and mask transfer usually rely on prototype feature learning. Prototype learning compresses the masked object features of the support image into one or a few prototype feature vectors. The network then searches the query image for pixel locations with similar features in order to segment the target.
- One of the main advantages of prototype learning is that prototype features are more robust to noise than pixel features. However, prototype features inevitably lose spatial information, which matters when the appearance of the object differs considerably between the support image and the query image. In addition, most prototype learning networks generate only a single prototype through masked average pooling, and so lose discriminative information.
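To make the single-prototype limitation concrete, here is a minimal sketch of masked average pooling in plain Python. The function name `masked_average_pooling` and the toy feature map are assumptions made for illustration and are not taken from the ASGNet code; real implementations operate on framework tensors.

```python
def masked_average_pooling(features, mask):
    """Average the feature vectors at the locations selected by the mask.

    features: H x W x C nested lists (hypothetical toy layout).
    mask:     H x W nested lists of 0/1 marking the support object.
    Returns one C-dimensional prototype.
    """
    channels = len(features[0][0])
    total = [0.0] * channels
    count = 0
    for feat_row, mask_row in zip(features, mask):
        for vec, selected in zip(feat_row, mask_row):
            if selected:
                count += 1
                for c in range(channels):
                    total[c] += vec[c]
    if count == 0:
        raise ValueError("mask selects no pixels")
    return [t / count for t in total]


# Toy 2x2 feature map with 2 channels. The object contains two visually
# different parts: one active in channel 0, one active in channel 1.
features = [[[1.0, 0.0], [3.0, 0.0]],
            [[0.0, 5.0], [9.0, 9.0]]]
mask = [[1, 1],
        [1, 0]]

prototype = masked_average_pooling(features, mask)
print(prototype)
```

The single prototype blends both object parts into one vector and keeps no record of where each part was located, which is the loss of spatial and discriminative information described above.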
In this work, we propose a new prototype learning technique that addresses some of these major shortcomings. In particular, we want to adaptively change the number of prototypes and their spatial extent according to the image content, so that the prototypes are both content-adaptive and spatially aware. This adaptive multi-prototype strategy is important for handling the large variations in object size and shape across images. Intuitively, when an object occupies a large part of the image, it carries more information, so more prototypes are needed to represent it. Conversely, if the object is small and the background takes up a relatively large share of the image, one or a few prototypes are enough. We also want the support region (spatial extent) of each prototype to adapt to the object information present in the support image. Concretely, our goal is to divide the support features into several representative regions according to feature similarity. Finally, we want to adaptively select the more important prototypes when searching for similar features in the query image. Different object parts may appear in different image regions and in different query images, so we allocate prototypes dynamically across the query image for feature matching. For example, if some parts of the object are occluded in the query image, we want to select the prototypes that correspond to the parts that remain visible.
We achieve this adaptive multi-prototype learning and allocation with the Adaptive Superpixel-guided Network (ASGNet), which uses superpixels to adapt both the number of prototypes and their support regions. Specifically, we propose two modules that form the core of ASGNet: superpixel-guided clustering (SGC) and guided prototype allocation (GPA).
- The SGC module performs fast feature-based superpixel extraction on the support image and uses the resulting superpixel centroids as prototype features. The shape and number of superpixels adapt to the image content, so the generated prototypes are adaptive as well.
- The GPA module uses an attention-like mechanism to allocate the most relevant support prototype features to the query features.
In summary, the SGC module provides adaptive prototype learning in terms of both the number of prototypes and their spatial extent, and the GPA module provides adaptive allocation of the learned prototypes when processing query features. Together, these two modules make ASGNet highly flexible and adaptive to varying object shapes and sizes, allowing it to generalize better to unseen object classes.
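The allocation idea can be sketched as follows: for every query location, compare its feature vector with each prototype and keep the index of the best match. Cosine similarity followed by an argmax is an assumption chosen here for illustration; the names `cosine` and `allocate_prototypes` are hypothetical, and the real GPA module may differ in its details.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def allocate_prototypes(query_features, prototypes):
    """Return, for each query feature, the index of its most similar prototype.

    query_features: list of C-dimensional vectors (flattened query locations).
    prototypes:     list of C-dimensional prototype vectors.
    """
    guide = []
    for q in query_features:
        sims = [cosine(q, p) for p in prototypes]
        guide.append(max(range(len(prototypes)), key=sims.__getitem__))
    return guide


# Two toy prototypes and three query locations.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
query = [[0.9, 0.1], [0.2, 0.8], [2.0, 0.1]]

guide = allocate_prototypes(query, prototypes)
print(guide)  # [0, 1, 0]
```

Because the choice is made per location, a prototype describing an occluded object part is simply never selected for the locations where that part is not visible.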
Proposed Method
In this section, we first introduce the two prototype generation and matching modules, namely superpixel-guided clustering (SGC) and guided prototype allocation (GPA). We then discuss the adaptive ability of these two modules. After that, we present the overall network architecture, the Adaptive Superpixel-guided Network (ASGNet), which integrates the SGC and GPA modules into one model. The overall structure is shown in Figure 2. Finally, we explain the k-shot setting in ASGNet.
Superpixel-guided Clustering (SGC)
The core idea of SGC is inspired by the superpixel sampling network (SSN) [13] and MaskSLIC [12]. SSN was the first end-to-end trainable deep network for superpixel segmentation. Its key contribution is to turn the nearest-neighbor operation in SLIC [1] into a differentiable one. The conventional SLIC superpixel algorithm uses iterative k-means clustering with two steps: pixel-superpixel association and superpixel centroid update. Pixels are assigned to superpixel centroids based on color similarity and spatial proximity. Specifically, the input image I ∈ R^{n×5} is usually represented as n pixels in a five-dimensional space (labxy), where lab is the pixel vector in the CIELAB color space and xy is the pixel position. After iterative clustering, the algorithm outputs an association map in which each of the n pixels is assigned to one of m superpixels.
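The two SLIC steps can be illustrated with a simplified sketch on labxy vectors. This is not a full SLIC implementation: it uses a plain Euclidean distance over all five dimensions and searches every centroid, whereas real SLIC balances color against spatial distance with a compactness weight and limits the search to a local window. The function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def slic_iteration(pixels, centroids):
    """Run one simplified iteration: association, then centroid update.

    pixels:    list of 5-D [l, a, b, x, y] vectors.
    centroids: list of 5-D superpixel centroids.
    Returns (assignment, new_centroids).
    """
    # Step 1: pixel-superpixel association (hard nearest-centroid assignment).
    assignment = [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in pixels
    ]
    # Step 2: centroid update (mean of the pixels assigned to each centroid).
    new_centroids = []
    for i, old in enumerate(centroids):
        members = [p for p, a in zip(pixels, assignment) if a == i]
        if not members:
            new_centroids.append(list(old))  # keep an empty cluster unchanged
            continue
        new_centroids.append(
            [sum(m[d] for m in members) / len(members) for d in range(len(old))]
        )
    return assignment, new_centroids


# Four toy pixels on one image row: two dark ones, two bright ones.
pixels = [
    [10.0, 0.0, 0.0, 0.0, 0.0],
    [12.0, 0.0, 0.0, 1.0, 0.0],
    [80.0, 5.0, 5.0, 2.0, 0.0],
    [82.0, 5.0, 5.0, 3.0, 0.0],
]
centroids = [[10.0, 0.0, 0.0, 0.0, 0.0], [82.0, 5.0, 5.0, 3.0, 0.0]]

assignment, new_centroids = slic_iteration(pixels, centroids)
print(assignment)      # [0, 0, 1, 1]
print(new_centroids)   # [[11.0, 0.0, 0.0, 0.5, 0.0], [81.0, 5.0, 5.0, 2.5, 0.0]]
```

The hard `min` in step 1 is the nearest-neighbor operation that SSN replaces with a differentiable soft association.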
This simple method suggests an insightful idea: aggregate the feature map into multiple superpixel centroids by clustering, and use those centroids as prototypes. Therefore, instead of computing superpixel centroids in image space, we estimate them in feature space by clustering similar feature vectors. Algorithm 1 describes the whole SGC process.
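Algorithm 1 is not reproduced in this post, so the following is only a rough sketch of clustering masked feature vectors into prototype centroids, not the authors' algorithm. It assumes an SSN-style soft association, where the weight of feature f for centroid c is exp(-||f - c||^2), followed by a weighted centroid update. The function name `sgc_prototypes`, the seed initialization, and the iteration count are all hypothetical.

```python
import math


def sgc_prototypes(features, mask, seeds, iterations=5):
    """Cluster masked feature vectors into centroids used as prototypes.

    features:   list of C-dimensional vectors (flattened support feature map).
    mask:       list of 0/1 flags; only flagged features take part.
    seeds:      initial centroids (hypothetical initialization).
    iterations: number of association/update rounds (assumed value).
    """
    masked = [f for f, m in zip(features, mask) if m]
    centroids = [list(s) for s in seeds]
    for _ in range(iterations):
        sums = [[0.0] * len(c) for c in centroids]
        weights = [0.0] * len(centroids)
        for f in masked:
            # Soft association of this feature with every centroid.
            q = [
                math.exp(-sum((a - b) ** 2 for a, b in zip(f, c)))
                for c in centroids
            ]
            for i, qi in enumerate(q):
                weights[i] += qi
                for d, value in enumerate(f):
                    sums[i][d] += qi * value
        # Weighted centroid update; a centroid with zero weight is kept as is.
        centroids = [
            [s / w for s in row] if w > 0.0 else old
            for row, w, old in zip(sums, weights, centroids)
        ]
    return centroids


# Two groups of object features plus one background feature that is masked out.
features = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0], [100.0, 100.0]]
mask = [1, 1, 1, 1, 0]
seeds = [[0.0, 0.0], [5.0, 5.0]]

prototypes = sgc_prototypes(features, mask, seeds)
print(prototypes)
```

In this toy run the two centroids settle near the means of the two object parts, and the background feature never influences them because the mask excludes it before clustering.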