
Image retrieval method based on deep learning!

2022-08-02 13:42:00 InfoQ

1. Overview

Image retrieval is generally divided into text-based retrieval and content-based retrieval. Text-based (semantic) retrieval requires annotating a huge number of images with semantic attributes before any search can be run. The annotation is subjectively biased and very time-consuming, and semantic attributes cannot fully express the rich information an image contains, so retrieval quality is limited. Content-based image retrieval (CBIR), or "search by image", has unique advantages. (Taobao, JD.com, Baidu, and Google all support search by image.)

2. Methods based on deep learning

The pipeline is: extract features from the image to obtain a representation, measure similarity (optionally with metric learning), rank by similarity, and return the retrieval results.

(1) Same-object image retrieval

Same-object image retrieval means: given a query image of an object, find the images in the library that contain that object. The user is interested in a particular object or target in the image, and the retrieved images should contain that very object. As shown in Figure 1.3, given a portrait of the "Mona Lisa", the goal is to retrieve from the image library those images that contain the "Mona Lisa" figure and, after ranking by similarity, to place them as near the top of the results as possible. In the English literature this is generally called Object Retrieval. Near-duplicate search or detection (Duplicate Search or Detection) can also be classed as same-object retrieval, and same-object retrieval methods can be applied to it directly. Same-object retrieval is of great value both in research and in commercial image search, for example shopping search for clothes and shoes, and face retrieval.

Same-object retrieval is vulnerable to the capture environment: illumination changes, scale changes, viewpoint changes, occlusion, and background clutter all strongly affect the results. The left side of Figure 1.3 gives examples of these variations. In addition, for non-rigid objects, deformation of the object also has a large impact on the results.

(2) Same-category image retrieval

Same-category image retrieval aims to find the images in the library that belong to the same category as a given query image. Here the user is interested in the category of the object or scene; that is, the user wants pictures of objects or scenes with the same category attributes. To distinguish the two modes, take again the "Mona Lisa" on the left of Figure 1.3. If the user is interested in the "Mona Lisa" painting itself, the system should work in same-object mode. If the user is interested not in that painting but in "portraits" as a kind of picture, that is, in a category concept abstracted from the concrete painting, the system should work in same-category mode. Same-category image retrieval is widely used in image search engines, medical image retrieval, and other fields.

The main difficulty in same-category retrieval is that images of the same category vary greatly within the class, while images of different classes may differ only slightly. As shown on the right of Figure 1.3, images in the "lake" category differ greatly in appearance. The two images at the lower right of Figure 1.3, one of class "dog" and one of class "woman", belong to different classes, but when they are described with low-level features such as color, texture, and shape, the difference between them is very small, and these features alone can hardly separate them. Feature description for same-category retrieval therefore faces two challenges: large intra-class variation and small inter-class difference.

[Figure: //img.inotgo.com/imagesLocal/202208/02/202208021323163833_0.png]

3. Some ideas for improving retrieval performance

1. Cluttered background around the search target

(1) In instance retrieval, complex background noise directly affects final search performance. Many teams therefore first use object detection (for example the RPN of Faster R-CNN) to localize the region of interest, then learn features on that region and compare similarity. In addition, when the training data has no bounding boxes, weakly supervised object localization is also an effective approach.

(2) Preprocessing: automatically localizing the item the user is interested in removes the influence of factors such as the background and multiple subjects, and also helps extract semantically aligned features. Common semantic alignment operations include alignment to the item's detection box, rotation alignment, and local key-point alignment. (In Taobao's search by image, users can manually adjust the detection box.)

2. Intra-class difference and inter-class similarity (fusing high-level semantics with low-level features)

Many methods retrieve with features from the last convolutional layer or the fully connected layer, but high-level features have lost much detail (the deeper the network, the more serious the loss), so fusing high-level semantics with low-level features is very important. Fusing feature maps from different layers uses the semantic information of high-level features while also taking into account the detail and texture information of low-level features, which makes the search more accurate. With the 22-layer GoogLeNet, for example, the feature maps of the last eight layers (Inception 3b to Inception 5b) are first sub-sampled separately with max pooling, which converts them to feature maps of the same size, and the sampled results are further processed with convolutions. These feature maps are then linearly weighted (implemented by a convolution), and finally sum pooling is applied to obtain the final image feature. In training, the features are learned end to end by optimizing a triplet ranking loss based on cosine distance on the provided training data. At test time, the cosine distance between features can therefore be used directly to measure image similarity.

(For example, two garments may be identical in color and texture, but one has a round collar and the other a V-neck. Collar shape is high-level semantics; color and texture are low-level features. Good retrieval is best achieved by considering both, similar to a Feature Pyramid Network (FPN).)

3. Feature dimension reduction

The extracted feature representation is usually a vector with many components. High dimensionality makes later analysis inconvenient, and the components may be correlated with one another, so feature dimension reduction is needed. Low-dimensional features with good discriminative power ensure both the performance and the efficiency of retrieval. The data used to learn the dimension reduction is generally data of identical products. Common dimension reduction methods include linear discriminant analysis (LDA), image classification, and metric learning (an unsupervised Mahalanobis metric, principal component analysis (PCA), and so on). PCA feature extraction proceeds as follows: remove the mean (compute the mean of all the data and subtract it from each sample); compute the covariance matrix S; compute the eigenvalues and eigenvectors of S and sort the eigenvalues in descending order; take the eigenvectors corresponding to the first K eigenvalues (K is the dimension after reduction) as the transformation matrix; transform the original data with the transformation matrix.

4. Speed

(1) Feature clustering. If the database is small and the feature dimension is low, exhaustive search can be used directly. When the data volume is large and the feature dimension is high, exhaustive search is very slow, and clustering can be used to narrow the search. K-means clustering: choose k points as the initial centroids; assign each point to its nearest centroid, forming k clusters; recompute the centroid of each cluster; repeat until the clusters no longer change or the maximum number of iterations is reached. Its advantage is that it is easy to implement; its drawbacks are that it may converge to a local minimum and that it converges slowly on large-scale data. At query time, find the cluster center closest to the query image, compute the distance between the query image and every image in that cluster, and return those with the smallest distances as the result.

(2) Identify the category of the target first and search only within that category's library at retrieval time, which improves both the quality and the efficiency of retrieval.

5. Combining image-related text with low-level image features in one CBIR system. (Baidu and Taobao let the user enter an image first and then enter text.)

6. Feedback techniques. The end user of image retrieval is a person, and interaction can be used to capture people's perception of image content. This reflects people and the system working together: scoring, online learning, and retrieval performance evaluation metrics. (Reinforcement learning: a reward for the search results.)

7. Fusing first-order and second-order pooling features

Second-order pooling captures second-order statistics of the image, such as covariance, and can often achieve better retrieval accuracy.

8. Jointly learning features and predicting attributes

As in DeepFashion, learn features and predict image attributes at the same time (multi-task training) to obtain more discriminative features. (The loss function is the sum of the Softmax loss and the metric learning loss.)
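The triplet ranking loss on cosine distance mentioned in the discussion of feature fusion can be sketched as follows. This is a minimal illustration under stated assumptions, not the training code of any particular system: the function names, the margin value of 0.2, and the plain-list feature vectors are all hypothetical.

```python
import math


def cosine_distance(a, b):
    """Cosine distance 1 - cos(a, b) between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_ranking_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that is zero once the positive is closer to the anchor
    than the negative by at least `margin` (assumed value, for illustration).
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def rank_by_cosine(query, gallery):
    """At test time, rank gallery indices by cosine distance to the query."""
    return sorted(range(len(gallery)),
                  key=lambda i: cosine_distance(query, gallery[i]))
```

Because the loss is defined on the same cosine distance that is used for ranking, a feature extractor trained this way can be queried with `rank_by_cosine` without any extra similarity model.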
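The PCA steps listed under feature dimension reduction (remove the mean, compute the covariance matrix S, take the eigenvectors of the K largest eigenvalues, transform the data) can be sketched in dependency-free Python. One substitution should be noted plainly: instead of a library eigen-solver, this sketch uses power iteration with deflation to find the leading eigenpairs, which is an assumption made only to keep the example self-contained. The function names and parameters are hypothetical.

```python
import math
import random


def mat_vec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def pca_fit(data, k, iters=500, seed=0):
    """Return (mean, eigenvalues, components) for the top-k principal axes."""
    n, dim = len(data), len(data[0])
    # Step 1: remove the mean from every sample.
    mean = [sum(row[j] for row in data) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in data]
    # Step 2: covariance matrix S (sample covariance, divisor n - 1).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dim)]
           for a in range(dim)]
    # Steps 3-4: leading eigenpairs in descending order, found by
    # power iteration; each found axis is removed from S (deflation).
    rng = random.Random(seed)
    eigenvalues, components = [], []
    for _ in range(k):
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            nxt = mat_vec(cov, vec)
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm < 1e-12:  # nothing left to extract
                break
            vec = [x / norm for x in nxt]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        value = sum(v * w for v, w in zip(vec, mat_vec(cov, vec)))
        eigenvalues.append(value)
        components.append(vec)
        cov = [[cov[a][b] - value * vec[a] * vec[b] for b in range(dim)]
               for a in range(dim)]
    return mean, eigenvalues, components


def pca_transform(data, mean, components):
    """Step 5: project mean-centered samples onto the kept components."""
    return [[sum((x - m) * c for x, m, c in zip(row, mean, comp))
             for comp in components] for row in data]
```

In practice a numerical library would normally be used for the eigen-decomposition; the sketch only makes the order of the steps explicit. Note that the sign of each component is arbitrary, so projected coordinates may be mirrored between runs with different seeds.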
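The clustering-based speed-up described under "Speed" (run K-means over the database features, then search only inside the cluster whose center is nearest to the query) can be sketched as below. This is a toy illustration, assuming features are short tuples of floats and using squared Euclidean distance; the function names, the seed, and the `top_n` parameter are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iters=100, seed=0):
    """Plain K-means: returns (centroids, assignment of each point)."""
    rng = random.Random(seed)
    # Choose k database points as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    assignment = None
    for _ in range(max_iters):
        # Assign every point to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:  # clusters no longer change
            break
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, assignment


def search(query, points, centroids, assignment, top_n=3):
    """Search only the cluster whose centroid is closest to the query."""
    nearest = min(range(len(centroids)),
                  key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for i, a in enumerate(assignment) if a == nearest]
    candidates.sort(key=lambda i: sq_dist(query, points[i]))
    return candidates[:top_n]
```

The trade-off is the one the text names: only one cluster is scanned, so a true nearest neighbor that falls just across a cluster boundary can be missed, and the result depends on which local minimum K-means converged to.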

Copyright notice
This article was created by [InfoQ]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/214/202208021323163833.html