Raki's reading notes: Neighborhood Matching Network for Entity Alignment
2022-06-30 02:39:00 【Sleepy Raki】
Abstract & Introduction & Related Work
If you are not familiar with knowledge graphs, please read this primer on entity alignment first: Introduction to Entity Alignment
Research task
- Entity alignment
Existing methods and related work
- Embedding-based methods
Challenges
- Entity alignment is not easy: real-world knowledge graphs are usually incomplete, and different knowledge graphs often have heterogeneous schemas. As a result, equivalent entities from two knowledge graphs may have different surface forms or different neighborhood structures.
- Differing neighborhood structures between equivalent entities are ubiquitous.
- Differences in neighborhood size and topology pose a major challenge to entity alignment methods.

Innovations
- A graph sampling method that extracts a discriminative neighborhood for each entity
- A cross-graph neighborhood matching module that jointly encodes neighborhood differences for a given entity pair
Experimental conclusion
- State-of-the-art (SOTA) results
Our Approach
Given two graphs and a set of pre-aligned equivalent entity pairs, the goal is to find the remaining equivalent entity pairs across the two graphs.
Overview of NMN
- NMN first uses GCNs to model neighborhood topology information
- Neighborhood sampling then selects the more informative neighbors
- A cross-graph matching module captures the differences between neighborhoods

KG Structure Embedding
A GCN first aggregates higher-order structural information from entity neighborhoods; pre-trained word embeddings are used to initialize the GCN.
The two KGs are fed into the GCN as one large graph. Each GCN layer takes a set of node features as input and updates the node representations:
where h_i^{(l)} is the output feature of node i at layer l.
To limit accumulated noise, highway gates are inserted between GCN layers to control noise propagation.
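As a minimal sketch of the two ideas above (a standard GCN propagation Â·H·W followed by a sigmoid highway gate between layers), the following function illustrates the computation; the function and parameter names are hypothetical, not the paper's:

```python
import numpy as np

def gcn_layer_with_highway(H, A_hat, W, W_gate, b_gate):
    """One GCN layer followed by a highway gate (illustrative sketch).

    H:      (n, d) node features from the previous layer
    A_hat:  (n, n) normalized adjacency D^{-1/2}(A+I)D^{-1/2}
    W:      (d, d) GCN weight matrix
    W_gate, b_gate: trainable highway-gate parameters
    """
    H_new = np.maximum(A_hat @ H @ W, 0.0)            # ReLU(Â H W): GCN propagation
    T = 1.0 / (1.0 + np.exp(-(H @ W_gate + b_gate)))  # transform gate T = sigmoid(H W_gate + b_gate)
    return T * H_new + (1.0 - T) * H                  # highway mix: keep old features where T is small
```

When the gate T is close to zero, the layer passes the previous representation through almost unchanged, which is how noisy propagation is suppressed.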
Neighborhood Sampling
An entity's one-hop neighbors are key to deciding whether it should be aligned with another entity, but not all one-hop neighbors contribute positively to alignment. A down-sampling step therefore selects the neighbors that carry the most information about the central entity.
The entity embeddings learned by the GCN carry rich contextual information about both neighborhood structure and entity semantics; the more contextually relevant a neighbor is to the central entity, the more likely it is to be sampled.

In essence, a discriminative neighborhood subgraph is constructed for each entity, which enables more accurate alignment through neighborhood matching.
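One plausible reading of this sampling step, sketched in Python: neighbors are kept with probability given by a softmax over their embedding similarity to the central entity. The exact scoring function is an assumption, and the names are hypothetical:

```python
import numpy as np

def neighbor_sampling_probs(h_center, h_neighbors):
    """Probability of keeping each one-hop neighbor.

    h_center:    (d,)  GCN embedding of the central entity
    h_neighbors: (k, d) GCN embeddings of its one-hop neighbors
    Returns a softmax over inner-product relevance scores.
    """
    scores = h_neighbors @ h_center   # contextual relevance of each neighbor
    scores -= scores.max()            # subtract max for numerical stability
    p = np.exp(scores)
    return p / p.sum()                # normalized sampling distribution
```

Sampling the top-probability neighbors from this distribution yields the discriminative neighborhood subgraph described above.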
Neighborhood Matching
The neighborhood subgraphs produced by the sampling step determine which neighbors of a target entity the later stages consider; in other words, the later stages of the NMN pipeline operate only on the neighbors in the subgraph. In the neighborhood matching stage, for each candidate entity in the other KG, we want to find which of its neighbors is closely related to a neighbor node in the target entity's subgraph. This information is crucial for deciding whether two entities (one from each KG) should be aligned.
Candidate selection
To reduce computational overhead, NMN first samples, for each entity e_i in E_1, a set of alignment candidates C_i = {c_i1, c_i2, ..., c_it | c_ik ∈ E_2}, and only then computes the subgraph similarity between e_i and these candidates. This is based on the observation that entities of E_2 that lie closer to e_i in the embedding space are more likely to be aligned with e_i. Therefore, for an entity e_j in E_2, the probability of sampling it as a candidate for e_i can be computed as
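A hedged sketch of such a distance-based candidate distribution (the exact normalization used in the paper is an assumption; the names are hypothetical):

```python
import numpy as np

def candidate_probs(h_i, H2):
    """Probability of sampling each entity of E_2 as an alignment candidate
    for e_i: closer entities in embedding space are more likely.

    h_i: (d,)  embedding of e_i from E_1
    H2:  (m, d) embeddings of all entities in E_2
    """
    d = np.linalg.norm(H2 - h_i, axis=1)  # embedding distance to each E_2 entity
    s = np.exp(-d)                        # closer entities get larger scores
    return s / s.sum()                    # normalized candidate distribution
```

Drawing t candidates from this distribution gives the set C_i on which the (much more expensive) subgraph matching is then run.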
Cross-graph neighborhood matching
For neighbors p and q of the given nodes in E_1 and E_2 respectively, an attention weight is computed,
and the attended representations are then concatenated.
For each target neighbor in the neighborhood subgraph, the attention mechanism in the matching module can pinpoint which neighbor in the other KG's subgraph is most likely to match it. Intuitively, the matching vector m_p captures the difference between the two closest neighbors. When the two neighbors' representations are similar, the matching vector tends toward the zero vector, so their representations stay similar; when they differ, the matching vector is amplified through propagation.
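A minimal sketch of this cross-graph attention and the matching vector m_p, assuming dot-product attention (the paper's exact attention form may differ; all names are hypothetical):

```python
import numpy as np

def cross_graph_match(Hp, Hq):
    """Cross-graph neighborhood matching (illustrative sketch).

    Hp: (np, d) sampled-neighbor embeddings of the target entity's subgraph
    Hq: (nq, d) sampled-neighbor embeddings of the candidate's subgraph
    Returns matching vectors m_p = h_p - sum_q a_pq h_q for every neighbor p.
    """
    scores = Hp @ Hq.T                           # attention logits for every (p, q) pair
    scores -= scores.max(axis=1, keepdims=True)  # stabilize the softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over q for each p
    Hq_att = attn @ Hq                           # attention-weighted counterpart of each p
    return Hp - Hq_att                           # m_p: near zero when p has a close match
```

As the notes describe: when a neighbor p has a near-identical counterpart q, attention concentrates on q and m_p collapses toward the zero vector; mismatched neighbors leave a large m_p that signals the difference.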
Neighborhood Aggregation

My question: could each parameter be explained clearly? What exactly is W_gate?
Experimental Setup


Experimental Results




Conclusion
NMN addresses the neighborhood-heterogeneity problem that is ubiquitous in KGs. It does so with a new sampling-based method that selects the most informative neighbors for each entity.
NMN estimates the similarity of two entities by jointly considering topological structure and neighborhood similarity. Extensive experiments on real-world datasets compare NMN with 12 recent embedding-based methods; the results show that NMN achieves the best and most robust performance, outperforming the competing methods across datasets and evaluation metrics.
Remark
Could the notation be made clearer? At the very least, spell out which parameters are trainable.
Reading a paper that is rather sparing with words.