Paper reading (59): Keyword-based diverse image retrieval with variational multiple instance graph
2022-06-28 11:01:00 【Inge】
1 Overview
1.1 Title
1.2 Background
Cross-modal image retrieval has recently attracted extensive research attention. In real-world use, the keyword-based queries that users issue are usually very short and semantically broad. In such user-oriented services, semantic diversity is therefore as important as retrieval accuracy for a good user experience. However, most cross-modal image retrieval methods rely on a single-point query embedding and achieve low semantic diversity, while diversified retrieval methods suffer from low accuracy because they lack cross-modal understanding.
1.3 Strategy
An end-to-end variational multiple instance graph (VMIG) is proposed:
1) Learn a continuous semantic space to capture diverse query semantics;
2) Formulate the retrieval task as a multiple instance learning problem that connects diverse features across modalities.
Specifically, a query-guided variational autoencoder (VAE) models the continuous semantic space instead of learning a single-point embedding. Multiple instances of the image and the query are then obtained by sampling in the continuous semantic space and by applying multi-head attention, respectively. Next, an instance graph is built to remove noisy instances and align cross-modal semantics. Finally, the heterogeneous modalities are fused robustly under multiple losses.
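The sampling step described above can be illustrated with the standard VAE reparameterization trick. This is only a minimal sketch: the note does not say how the paper's encoder produces its distribution parameters, so the mean `mu`, the log-variance `log_var`, the function name `sample_instances`, and the instance count are all hypothetical stand-ins.

```python
import math
import random


def sample_instances(mu, log_var, num_instances, seed=0):
    """Draw several latent instances z = mu + sigma * eps, with eps ~ N(0, 1).

    mu and log_var are assumed to come from a (hypothetical) query-guided
    encoder; here they are plain lists of floats of equal length.
    """
    rng = random.Random(seed)
    # Standard deviation recovered from the log-variance.
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(num_instances)
    ]


# Illustrative values only: a 2-dimensional semantic space, 5 instances.
instances = sample_instances(mu=[0.0, 1.0], log_var=[0.0, 0.0], num_instances=5)
for z in instances:
    print(z)
```

Each sampled vector is one candidate semantic instance. Because the samples come from a distribution and not from a single point, a short query can map to several different semantic readings.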
1.4 Bib
@article{Zeng:2022:110,
author = {Zeng, Yawen and Wang, Yiru and Liao, Dongliang and Li, Gongfu and Huang, Weijie and Xu, Jin and Cao, Da and Man, Hong},
title = {Keyword-based diverse image retrieval with variational multiple instance graph},
journal = {{IEEE} Transactions on Neural Networks and Learning Systems},
pages = {1--10},
year = {2022},
doi = {10.1109/TNNLS.2022.3168431},
url = {https://ieeexplore.ieee.org/abstract/document/9764824}
}
2 Framework
Figure 2 shows the overall framework of VMIG, which consists of three parts:
1) Semantic feature projection: extract the features of the image and the query, and project them into their respective semantic spaces;
2) Cross-modal diversity generator: learn a one-to-many semantic distribution to generate multiple instances, and build a cross-modal multiple instance graph. The multiple instances of images and queries are obtained with the query-guided VAE and multi-head attention; the cross-modal multiple instance graph is used to explore intra-modal semantic correlation and cross-modal alignment;
3) Semantic space constraints: multiple losses are used to constrain the cross-modal semantic space.

2.1 Semantic feature projection
Let $v$ and $t$ denote an image and a keyword-based query, respectively. Given a query $t$, the goal is to retrieve appropriate images while ensuring both relevance and diversity. To learn better features, ResNet is first used to extract the image feature $\mathbf{f}_v$, and Doc2Vec is used to obtain the query feature $\mathbf{f}_t$. These features are then projected separately into the semantic space:
$$\left\{ \begin{array}{l} \tilde{\mathbf{f}}_v = o_v(\mathbf{f}_v)\\ \tilde{\mathbf{f}}_t = o_t(\mathbf{f}_t) \end{array} \right. \tag{1}$$
where $o_v$ and $o_t$ are projection functions approximated by fully connected networks.
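As a rough illustration of Equation (1), the sketch below implements $o_v$ and $o_t$ as single fully connected layers in plain Python. The layer depth, the tanh activation, the random initialization, the dimensions, and the helper names `make_linear` and `project` are all assumptions for illustration; the note only states that the projections are approximated by fully connected networks. The input vectors stand in for real ResNet and Doc2Vec features.

```python
import math
import random


def make_linear(in_dim, out_dim, seed):
    """Create (weights, bias) for one fully connected layer (hypothetical init)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(in_dim)
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def project(features, layer):
    """Apply the layer: tanh(W f + b). The activation is an assumption."""
    weights, bias = layer
    return [math.tanh(sum(w * x for w, x in zip(row, features)) + b)
            for row, b in zip(weights, bias)]


# Hypothetical sizes; real ResNet and Doc2Vec features are much larger.
IMAGE_DIM, QUERY_DIM, SEMANTIC_DIM = 8, 5, 4

o_v = make_linear(IMAGE_DIM, SEMANTIC_DIM, seed=0)  # image projection
o_t = make_linear(QUERY_DIM, SEMANTIC_DIM, seed=1)  # query projection

f_v = [0.1 * i for i in range(IMAGE_DIM)]  # stand-in for a ResNet feature
f_t = [0.2 * i for i in range(QUERY_DIM)]  # stand-in for a Doc2Vec feature

f_v_tilde = project(f_v, o_v)
f_t_tilde = project(f_t, o_t)
print(f_v_tilde)
print(f_t_tilde)
```

The two modalities use separate weights, matching the separate functions $o_v$ and $o_t$. Giving both outputs the same dimension is a simplifying choice made here, not something the note specifies.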
2.2 Cross-modal diversity generator