Richpedia: A Large-Scale, Comprehensive Multi-Modal Knowledge Graph
2022-07-28 20:06:00 【Libra's mystiy】
Background:
With the development of the Semantic Web, many knowledge graphs have been published on the Web using the Resource Description Framework (RDF), and RDF links between entities in different datasets weave them into one large heterogeneous graph. Publicly accessible collections of visual resources have also grown enormously, yet knowledge graph research has so far made limited use of them. General-purpose knowledge graphs focus only on textual facts, and academia lacks a complete multi-modal knowledge graph, which hinders future research on multi-modal fusion. Richpedia is proposed to fill this gap.
Richpedia provides a comprehensive multi-modal knowledge graph by attaching sufficiently diverse images to the textual entities in Wikidata and setting visual semantic relations between image entities according to the hyperlinks and descriptions in Wikipedia. Richpedia can be accessed on the Web through a faceted query endpoint. Its main contributions are: (1) it injects comprehensive visual resources into a general knowledge graph, building a large, high-quality multi-modal knowledge graph dataset; (2) it proposes a new framework for constructing a multi-modal knowledge graph, which first collects entities and images from Wikidata, Wikipedia, and search engines, filters the images with a dedicated model, and then assigns RDF links between image entities according to the hyperlinks and entity descriptions in Wikipedia; (3) Richpedia is published as an open resource that can answer richer visual queries and support multi-relational link prediction.
The construction of Richpedia can be divided into three stages: data collection, image processing, and relation mining.
Richpedia Data Collection:
Unlike traditional knowledge graphs, the goal here is to build a multi-modal dataset containing rich image entities and their relations. Richpedia is populated from the following sources: KG entities are collected from the Wikidata knowledge graph; some image entities are collected from Wikipedia, together with the relations between KG entities and image entities, while the hyperlinks and related descriptions in Wikipedia are used to discover latent relations between image entities; and a web crawler is designed to collect a sufficient number of image entities for each KG entity.
By definition, two types of entities (knowledge graph entities and image entities) must be collected to generate Richpedia triples, and an International Resource Identifier (IRI) must be created for each entity. Wikidata already contains an IRI for every entity, so these IRIs are adopted for the knowledge graph entities.
For image entities, images are collected directly from Wikipedia and matching IRIs are created in Richpedia; enough additional images are gathered from open resources and filtered, and each resulting image entity is given its own IRI.
Triple generation: three types of triples are created in Richpedia: image of, attribute, and relation, and every IRI is unique. The image of and attribute triples can be generated during data collection; the relation triples are found afterwards using the hyperlinks and text in Wikipedia.
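As a minimal sketch (not the authors' actual pipeline code), the three triple types could be materialized with rdflib; the namespace URIs, entity IDs, and property names below are hypothetical placeholders rather than the real Richpedia vocabulary.

```python
from rdflib import Graph, Namespace, Literal

# Hypothetical namespaces; the real Richpedia IRIs may differ.
RP = Namespace("http://richpedia.example.org/resource/")
RPO = Namespace("http://richpedia.example.org/ontology/")

g = Graph()
g.bind("rp", RP)
g.bind("rpo", RPO)

kg_entity = RP["Q84"]           # a KG entity (e.g. the city London)
image_entity = RP["img_00001"]  # an image entity with its own IRI

# 1) "image of": links an image entity to the KG entity it depicts.
g.add((image_entity, RPO.imageOf, kg_entity))

# 2) "attribute": visual-level metadata of the image itself.
g.add((image_entity, RPO.height, Literal(768)))
g.add((image_entity, RPO.width, Literal(1024)))

# 3) "relation": a visual semantic link between two image entities,
#    mined later from Wikipedia hyperlinks and descriptions.
other_image = RP["img_00002"]
g.add((image_entity, RPO.nearBy, other_image))

print(g.serialize(format="turtle"))
```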
Richpedia Image Processing:
After the image entities are collected, they must be processed so that only high-quality images remain. The image data come from open resources; an ideal image entity should be highly relevant to its knowledge graph entity and also diverse, and duplicate image entities are unavoidable. A K-means-based clustering filter is therefore applied: a VGG-16 deep neural network extracts a visual feature vector for each image, and the value of K is chosen by the sum of squared errors (elbow criterion). For each image cluster, the top-20 images are collected: the image with the highest visual score is ranked first, the second image is the one farthest from the first, the third is the one farthest from the first two, and so on. After the images are obtained, several visual descriptors (gray-level histogram, color layout descriptor, color moment descriptor, GLCM descriptor, and histogram of oriented gradients) are computed and used to measure the similarity between images.
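A rough sketch of this filtering stage, assuming torchvision's pretrained VGG-16 as the feature extractor and scikit-learn's KMeans; the file list, the value of K, and the ranking of the first image are placeholders (the paper picks K from the SSE elbow curve and ranks by a visual score).

```python
import numpy as np
import torch
from PIL import Image
from sklearn.cluster import KMeans
from torchvision import models, transforms

# Pretrained VGG-16 (torchvision >= 0.13 weights API); drop the final
# classifier layer to obtain a 4096-d visual feature vector per image.
vgg = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
vgg.classifier = torch.nn.Sequential(*list(vgg.classifier.children())[:-1])
vgg.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def extract_feature(path: str) -> np.ndarray:
    """Extract one visual feature vector for the image at `path`."""
    img = Image.open(path).convert("RGB")
    with torch.no_grad():
        return vgg(preprocess(img).unsqueeze(0)).squeeze(0).numpy()

paths = ["img_00001.jpg", "img_00002.jpg", "img_00003.jpg"]  # placeholders
feats = np.stack([extract_feature(p) for p in paths])

# Cluster the feature vectors; K would be chosen via the elbow curve.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(feats)

def diverse_top_k(cluster_feats: np.ndarray, k: int) -> list:
    """Greedy diversity selection within one cluster: start from the
    top-ranked image (index 0 here, by assumption), then repeatedly
    take the image farthest, in summed distance, from those chosen."""
    chosen = [0]
    while len(chosen) < min(k, len(cluster_feats)):
        dists = [sum(np.linalg.norm(cluster_feats[i] - cluster_feats[j])
                     for j in chosen)
                 for i in range(len(cluster_feats))]
        for j in chosen:
            dists[j] = -1.0  # never re-pick an already chosen image
        chosen.append(int(np.argmax(dists)))
    return chosen
```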
Richpedia Relation Mining:
The relevant hyperlinks and text in Wikipedia are used to discover visual semantic relations between image entities. The final relations are extracted by three rules (a toy sketch of rule 1 follows this list):
(1) The description contains one hyperlink: Stanford CoreNLP detects keywords in the description, and a string-matching algorithm between the keywords and a predefined relation ontology yields the relation. For example, the word "left" in the text between two entities produces the relation "near by".
(2) The description contains multiple hyperlinks: using a parser and the syntax tree, the core entity is taken as input and the case is reduced to rule 1.
(3) The description contains no hyperlinks: Stanford CoreNLP is used to look up the related KG entities mentioned in the Wikipedia article, reducing the case to rules 1 and 2. Rule 3 relies on NER results, whose quality is lower than that of annotated hyperlinks, so it has lower priority than the first two rules.
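A toy illustration of rule (1): here the keyword-to-relation mapping is a plain dictionary with substring matching rather than the Stanford CoreNLP pipeline the paper uses, and both the keyword list and the relation names are illustrative assumptions.

```python
from typing import Optional

# Illustrative keyword -> relation table (not the paper's ontology).
KEYWORD_TO_RELATION = {
    "left": "nearBy",
    "right": "nearBy",
    "next to": "nearBy",
    "in front of": "nearBy",
    "inside": "contains",
}

def mine_relation(description: str) -> Optional[str]:
    """Return a relation name if the description contains a mapped keyword."""
    text = description.lower()
    for keyword, relation in KEYWORD_TO_RELATION.items():
        if keyword in text:
            return relation
    return None

# "The Eiffel Tower stands on the left of the Seine." -> "nearBy"
print(mine_relation("The Eiffel Tower stands on the left of the Seine."))
```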
Construction Process:
1. Collection of city entities: SPARQL structured queries extract city entities from Wikidata, selecting entities whose type attribute is "city" and retrieving the name of every entity together with its Wikidata identifier. Each city entity and its Wikidata identifier are stored in a dedicated JSON file; for every city KG entity, other information such as its country, total area, total population, and time zone is stored in the corresponding JSON file as well (a SPARQL sketch of this step follows the list).
2. Collection of sight entities: information about city attractions is collected from the Ctrip website. Starting from each city KG entity, the top-30 famous sights of that city are obtained, and the location, opening hours, and a brief introduction of each sight are crawled.
3. Collection of celebrity entities: a list of celebrity entities is collected from Wikidata. SPARQL structured queries select entities whose type is "human" to filter the candidates; unqualified celebrity KG entities are removed, a final list of celebrity KG entities is obtained, and some attribute information of each entity is stored in JSON files.
4. Collection of image entities corresponding to KG entities: images are collected from the image search engines Google, Yahoo, and Bing, as well as from Wikipedia. The three search engines complement one another to improve the completeness of the knowledge graph, while Wikipedia contributes images of KG entities together with abundant hyperlinks and descriptive information between entities.
5. Filtering of noisy image entities: so that every KG entity keeps as many genuinely related image entities as possible, an image clustering algorithm filters out noisy images. VGG-16 extracts the structural features of each image, the three-dimensional feature map of an image entity is flattened into a vector, and K-means is chosen for clustering-based filtering (the images carry no label information, so labeling them for popular supervised deep networks is difficult and training would work poorly; see the sketch after the Image Processing paragraph above).
6. Diversity mining: image entities that are too similar are filtered out to guarantee the diversity of images in Richpedia; the top-ranked image entities with the lowest mutual similarity are selected.
7. Relation mining: natural language processing techniques extract and infer latent semantic relations between image entities. The first kind of relation, between KG entities and image entities, is established mainly by the Richpedia file structure (each image entity is stored under its corresponding textual entity); the second kind consists of attribute values of image entities carrying visual-level information (such as height and width); the third kind is the visual semantic relation between image entities, built from image descriptions and hyperlink information.
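Step 1 could be sketched as follows with SPARQLWrapper against the public Wikidata endpoint. The query pattern (instance of wd:Q515, "city") is standard Wikidata usage; the LIMIT, user agent, and JSON layout are illustrative choices, not the paper's exact code.

```python
import json
from SPARQLWrapper import SPARQLWrapper, JSON

# Public Wikidata SPARQL endpoint (a polite user agent is recommended).
sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="RichpediaSketch/0.1")
sparql.setQuery("""
SELECT ?city ?cityLabel WHERE {
  ?city wdt:P31 wd:Q515 .   # instance of: city
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()

# Store each city entity with its Wikidata identifier in a JSON file,
# mirroring the per-entity JSON files described in step 1.
cities = [
    {
        "wikidata_id": b["city"]["value"].rsplit("/", 1)[-1],
        "label": b["cityLabel"]["value"],
    }
    for b in results["results"]["bindings"]
]
with open("cities.json", "w", encoding="utf-8") as f:
    json.dump(cities, f, ensure_ascii=False, indent=2)
```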
Results:
An online access platform is provided, where entity information in Richpedia can be queried, as well as the visual semantic relations between image entities.
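A query against such an endpoint might look like the sketch below; the endpoint URL and the rpo: namespace are hypothetical placeholders, and the real Richpedia endpoint and vocabulary should be taken from the project's site.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical endpoint and ontology prefix; substitute the real ones.
sparql = SPARQLWrapper("http://richpedia.example.org/sparql")
sparql.setQuery("""
PREFIX rpo: <http://richpedia.example.org/ontology/>
SELECT ?img1 ?rel ?img2 WHERE {
  ?img1 rpo:imageOf ?e1 .
  ?img2 rpo:imageOf ?e2 .
  ?img1 ?rel ?img2 .        # visual semantic relation between images
}
LIMIT 10
""")
sparql.setReturnFormat(JSON)
for b in sparql.query().convert()["results"]["bindings"]:
    print(b["img1"]["value"], b["rel"]["value"], b["img2"]["value"])
```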
Summary:
The construction process of the multi-modal knowledge graph Richpedia: images are collected from the Web according to a textual knowledge graph, filtered by a diversity-aware retrieval model, and RDF links between image entities are set according to the hyperlinks and descriptions in Wikipedia. The result is a large, high-quality multi-modal knowledge graph dataset, published as an open resource with a query endpoint.
Advantages:
It builds a large, high-quality multi-modal knowledge graph while also accounting for the diversity among image entities.
Shortcomings:
It depends on an existing knowledge graph: the entities in Richpedia are extracted from Wikidata (whose entities are in turn created from Wikipedia, including the links and descriptive information between entities).