
An Overview of Graph Neural Networks

2022-06-11 23:00:00 Hua Weiyun

GNN

What is a graph?

Common graph structures

  • Nodes (points) and edges

How to represent an image as a graph

  • Pixels are nodes
  • Relationships between pixels form the edges
  • Each pixel connects to its surrounding pixels

How to represent text as a graph

  • Words are nodes
  • The before/after (adjacency) relationship between words forms the edges

Molecules

  • Chemical molecules

    • Prediction of molecular properties
    • Molecular generation
    • Chemical reactions
  • Protein networks

    • Protein–protein interactions, drug–protein interactions

Social networks

Citation networks of papers

Properties of graph data

  • Nodes are unevenly distributed (node degrees vary widely)
  • Edges may carry additional attributes
  • Permutation invariance

The development history

Goal

  • Use artificial neural networks to map the graph, and points on the graph, into a low-dimensional space

Spectral-domain neural networks

  • To handle the irregularity of spatial neighborhoods, spectral networks on graphs were proposed

  • Take the graph's Laplacian matrix and perform a spectral (eigen) decomposition; see the sketch below

    • The convolution operation is carried out using the resulting eigenvalues and eigenvectors
  • The Chebyshev network (ChebNet) defines the convolution kernel in polynomial form and computes an approximate kernel, improving efficiency

    • Using only a first-order approximation of the convolution kernel
    • Achieves fast localization and low-complexity computation
  • Shortcomings

    • High computational complexity

      • The Laplacian matrix must be computed and its eigenvalues and eigenvectors obtained
      • The whole graph must be kept in memory, which is costly
    • Cannot be extended to other graphs

      • Because the convolution kernel depends on each graph's Laplacian matrix, it cannot be transferred to other graphs

        • Parameters cannot be shared across different graphs
        • The basis of the convolution computation differs from graph to graph
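To make the spectral view concrete, here is a minimal NumPy sketch of filtering a node signal through the Laplacian eigenbasis. The toy graph, the one-hot signal, and the low-pass filter g(λ) are illustrative assumptions, not from the original notes.

```python
import numpy as np

# Toy undirected graph: a path of 4 nodes (illustrative assumption)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

# Normalized graph Laplacian: L = I - D^{-1/2} A D^{-1/2}
d = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L = np.eye(len(A)) - D_inv_sqrt @ A @ D_inv_sqrt

# Spectral decomposition: L = U diag(lam) U^T
lam, U = np.linalg.eigh(L)

# Spectral "convolution": transform the signal to the spectral domain,
# scale by a filter g(lam), and transform back
x = np.array([1.0, 0.0, 0.0, 0.0])  # a one-hot node signal (assumed)
g = np.exp(-2.0 * lam)              # an example low-pass filter
x_filtered = U @ (g * (U.T @ x))
print(x_filtered)
```

Note how everything hinges on the eigendecomposition of this particular graph's Laplacian, which is exactly the cost and transferability problem listed above.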

Spatial-domain neural networks

  • GGNN (gated graph neural networks)

  • MPNN (message passing neural networks), 2017

    • GraphSAGE, 2017

      • GIN (graph isomorphism network), 2019
      • Key idea: moving from transductive learning to inductive learning
    • GAT, 2018

      • Uses the attention mechanism to define graph convolution

Tasks

Graph level

  • Graph classification, generation, matching, etc.
  • Graph neural networks as encoders, e.g. Graph2Seq

Node level

  • Classification, regression, and clustering

Edge level

  • Link prediction, edge classification

Information stored in a graph

nodes

  • Vertex properties

edges

  • Edge attributes

global-context

  • Global information

connectivity

GNN (Graph Neural Networks)

Transform the attributes without changing the structure of the graph

  • Three fully connected layers were used

Passing messages between parts of the graph

  • First take each node's vector, then sum up the vectors of its neighbors (analogous to convolution on images); see the sketch below
  • Then feed the result into an MLP
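A minimal sketch of one such message-passing step (the toy adjacency, feature sizes, and two-layer MLP are assumptions): each node sums its neighbors' vectors, and the result goes through an MLP.

```python
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)   # toy adjacency (assumed)
H = rng.normal(size=(3, 4))              # node features: 3 nodes x 4 dims

W1 = rng.normal(size=(4, 8))             # MLP weights (assumed sizes)
W2 = rng.normal(size=(8, 4))

def mlp(x):
    return np.maximum(x @ W1, 0.0) @ W2  # two-layer ReLU MLP

messages = A @ H          # row v = sum of v's neighbors' vectors
H_next = mlp(messages)    # update the node states
print(H_next.shape)       # (3, 4)
```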

Experiments

  • Tunable hyperparameters

    • Number of layers

      • 2–4
    • Aggregation method

      • sum
      • mean
      • max
    • Embedding method

      • size
    • How information is passed

    • Evaluation metric

      • AUC
  • Message-passing method

Adding global information

  • A master node or context vector

    • Connected to all vertices

GraphSAGE: inductive graph representation learning

Main idea

  • First aggregate the information of the neighbor nodes with an aggregation function (aggregate)
  • Then merge it with the node itself and update the node's state
  • Use the embedding vectors of all nodes as input to downstream tasks

Key points

  • SAMPLE

    • Enables batch processing

      • First take the K-hop neighbor nodes

        • Usually K = 2
    • Reduces computational complexity

      • Sample a fixed number of neighbor nodes
  • AGGREGATE (see the sketch below)

    • With the weight parameter W: concat the node's own vector with the aggregated neighbor vector, then multiply the concat result by W

    • Mean aggregation

    • LSTM

      • More expressive than mean aggregation, but not permutation invariant
      • So the neighbor nodes are randomly shuffled and regrouped
    • Pooling aggregation

      • Pass all neighbor nodes through a fully connected layer
      • Then take the maximum (max pooling)

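A minimal sketch of a GraphSAGE-style step following the description above: sample a fixed number of neighbors, mean-aggregate them, concat with the node itself, and multiply by W. The toy graph, dimensions, and weights are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

H = rng.normal(size=(5, 8))          # node embeddings (assumed)
neighbors = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1, 4], 4: [3]}
W = rng.normal(size=(16, 8))         # weight applied after the concat

def sage_step(v, num_samples=2):
    # SAMPLE: a fixed number of neighbors, for batching and lower cost
    nbrs = rng.choice(neighbors[v], size=num_samples, replace=True)
    # AGGREGATE: mean of the sampled neighbor embeddings
    h_nbr = H[nbrs].mean(axis=0)
    # Concat self with aggregated neighbors, multiply by W, nonlinearity
    h = np.concatenate([H[v], h_nbr]) @ W
    return np.maximum(h, 0.0)

print(sage_step(0).shape)  # (8,)
```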

MPNN: Message Passing Neural Networks

Definition

  • A formal framework for spatial graph convolutions

Formulas (spelled out below)

  • Message passing

    • M

      • Gathers the information of node v's adjacent nodes, together with the edge information
  • State update

    • Update

      • After receiving the messages from each neighbor, combine them with the node's state h_v at the current time step and update the state
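In symbols, the MPNN framework is usually written as follows (a standard rendering consistent with the M and Update functions above, with e_vw the feature of edge (v, w)):

```latex
m_v^{t+1} = \sum_{w \in N(v)} M_t\left(h_v^t,\, h_w^t,\, e_{vw}\right), \qquad
h_v^{t+1} = U_t\left(h_v^t,\, m_v^{t+1}\right)
```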

Characteristics

  • Incorporates edge information

Variants

  • Graph convolutional networks
  • Neural FPs
  • Gated graph neural networks
  • Spectral GNN

Graph attention network (GAT)

Self-attention between a vertex and its adjacent nodes is computed through the attention mechanism

Idea

  • Transform the original node features from F dimensions to F' dimensions

    • Then, through an attention function, map each node pair to an attention weight

      • e_ij indicates the importance of node j relative to node i

Method (see the formulas below)

  • Concatenate the transformed node features

    • Multiply by a parameter vector

  • Usually a single-layer feedforward neural network with LeakyReLU as the nonlinear activation produces e_ij

  • softmax activation and normalization

    • Yields the attention weights
  • Based on the attention weights, update the nodes
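Written out, the steps above correspond to the standard GAT formulas, with W the F-to-F' transform, a the attention parameter vector, and || denoting concatenation:

```latex
e_{ij} = \mathrm{LeakyReLU}\left(\vec{a}^{\top}\left[W h_i \,\Vert\, W h_j\right]\right), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in N(i)} \exp(e_{ik})}, \qquad
h_i' = \sigma\Big(\sum_{j \in N(i)} \alpha_{ij} W h_j\Big)
```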

GCN as subgraph function approximators

GCN

MPNN

Mainly matrix multiplication (see the sketch below)
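As a sketch of that matrix form, here is the widely used renormalized GCN propagation rule H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), with D the degree matrix of A + I; the toy inputs are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

A = np.array([[0, 1], [1, 0]], dtype=float)   # toy adjacency (assumed)
H = rng.normal(size=(2, 4))                   # node features
W = rng.normal(size=(4, 4))                   # layer weights

A_hat = A + np.eye(len(A))                    # add self-loops
d_hat = A_hat.sum(axis=1)
D_inv_sqrt = np.diag(d_hat ** -0.5)

# One GCN layer as pure matrix multiplication plus a ReLU
H_next = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)
print(H_next.shape)  # (2, 4)
```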

Graph explanations

Criticism of graph neural networks

Hard to optimize: the architecture is sparse and also dynamic

  • Speeding it up is difficult

Very sensitive to hyperparameters

  • How to sample?
  • Which hyperparameters to use?

Few applications in industry so far

Distill

Determining whether two graphs are isomorphic

Weisfeiler-Lehman subtree (the WL test); see the sketch below
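A minimal sketch of one Weisfeiler-Lehman relabeling iteration (pure Python; the toy graph is an assumption). Two graphs can be isomorphic only if their label multisets stay equal across iterations.

```python
def wl_iteration(adj, labels):
    """One Weisfeiler-Lehman relabeling step.
    adj: dict node -> list of neighbors; labels: dict node -> label."""
    new_labels = {}
    for v, nbrs in adj.items():
        # New signature = own label + sorted multiset of neighbor labels
        new_labels[v] = (labels[v], tuple(sorted(labels[u] for u in nbrs)))
    # Compress signatures back into small integer labels
    lut = {sig: i for i, sig in enumerate(sorted(set(new_labels.values()), key=repr))}
    return {v: lut[sig] for v, sig in new_labels.items()}

# Toy path graph 0-1-2 (assumed), all nodes starting with the same label
adj = {0: [1], 1: [0, 2], 2: [1]}
labels = {v: 0 for v in adj}
for _ in range(2):
    labels = wl_iteration(adj, labels)
print(sorted(labels.values()))  # the label multiset after 2 iterations
```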

Pooling of graphs

Clustering and pooling

  • Conventional pooling idea

    • Define a neighborhood, then take the maximum or average value within it
  • Based on the graph structure, cluster the nodes and pool each cluster

DiffPool

  • Learnable pooling, with its own parameters

TopK pooling (see the sketch below)

  • Project the node attributes onto one dimension, sort the nodes by this importance score, and keep the top-K most important nodes

  • From the selected nodes, determine the induced subgraph

  • Then combine the subgraph and its attributes for the downstream computation

  • Shortcoming

    • The result is much sparser than the original graph, so a lot of information is lost
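A sketch of TopK pooling following the steps above. The learnable projection vector p and the tanh gating follow the common formulation; the toy graph and sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

X = rng.normal(size=(6, 4))      # node attributes (assumed)
A = (rng.random((6, 6)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T   # symmetric toy adjacency (assumed)
p = rng.normal(size=4)           # learnable projection vector

k = 3
score = X @ p / np.linalg.norm(p)        # map each node to a 1-D score
idx = np.argsort(score)[-k:]             # indices of the top-k nodes
X_pool = X[idx] * np.tanh(score[idx])[:, None]  # gate the kept features
A_pool = A[np.ix_(idx, idx)]             # induced subgraph of kept nodes
print(X_pool.shape, A_pool.shape)        # (3, 4) (3, 3)
```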

Graph embedding methods

Graph embedding based on random walks

  • DeepWalk (see the sketch below)

    • Randomly sample a large number of fixed-length walks

      • Each walk is treated as a sentence

        • A node is treated as a word
    • Use the skip-gram model to maximize the co-occurrence probability of the center node and its preceding and following neighbors on the walk

      • This yields the embedding representation of each node
    • Negative sampling can be used to improve training efficiency

      • The model learns to distinguish the target node from other noise

        • The noise samples are called negative samples
  • node2vec

    • Biases the walk, e.g. according to the probability of a node being revisited
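A minimal DeepWalk-style walk generator (the toy graph and walk settings are assumptions; the skip-gram training itself, e.g. with a word2vec implementation, is omitted):

```python
import random

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}  # toy graph (assumed)

def random_walk(start, length, rng=random):
    """Sample one fixed-length walk starting from `start`."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk

# Sample many fixed-length walks; each walk plays the role of a sentence
# and each node the role of a word for skip-gram training.
corpus = [random_walk(v, length=5) for v in adj for _ in range(10)]
print(corpus[0])
```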

Unsupervised learning of graphs

Graph autoencoders

  • Encoder-Decoder

    • Goal: minimize the reconstruction error
  • Graph AE

  • Graph VAE (see the formulas below)

    • The posterior probability is parameterized by a neural network

      • The objective is then approximated with Monte Carlo sampling
    • q(z|A, X) is the encoder

      • It produces the latent variables
      • A graph convolutional network is used to parameterize q(z|A, X)
      • N is a Gaussian; its parameters μ and σ are both obtained from two-layer graph convolutional networks
    • p(A | Z) is the decoder

      • Used to reconstruct the graph structure A
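In the standard Graph VAE (VGAE) form, consistent with the description above, with GCN_μ and GCN_σ denoting the two graph convolutional networks:

```latex
q(Z \mid X, A) = \prod_i \mathcal{N}\left(z_i \mid \mu_i, \operatorname{diag}(\sigma_i^2)\right),
\quad \mu = \mathrm{GCN}_\mu(X, A), \quad \log \sigma = \mathrm{GCN}_\sigma(X, A)
```

```latex
p(A_{ij} = 1 \mid z_i, z_j) = \operatorname{sigmoid}\left(z_i^{\top} z_j\right)
```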

Maximizing mutual information

  • Maximize the mutual information between representations (e.g. between local node representations and a global graph summary)

Pre-training graph neural networks

Node-level tasks

  • Context prediction

    • Obtain the vector representation of a node

    • Find the context graph around the node

      • The subgraph formed by all nodes whose distance from the center node is at least r1 and at most r2
    • Predict whether a given neighborhood and a given context graph belong to the same node

  • Attribute masking

    • Randomly mask some attributes of the nodes
    • Train, then predict the masked attributes

Graph-level tasks

  • Attribute prediction

  • Similarity prediction

  • Training procedure

    • First run the node-level self-supervised pre-training tasks
    • Then run supervised training at the graph level

Problems of large-scale learning

Training efficiency and scalability

Reasons

  • The networks are large, memory use is high, and training is expensive and slow
  • On a large graph, the neighborhood information explodes

Node sampling

  • PinSAGE

    • Based on GraphSAGE; during sampling it weights neighbors by importance, using random walks and judging by visit frequency

Layer sampling

  • FastGCN

    • Samples nodes per layer of the network

Graph sampling

  • Cluster-GCN

    • Uses a graph clustering algorithm to partition the graph into small blocks

    • In each training step, a few blocks are randomly selected and combined into a subgraph (see the sketch below)

    • The complete graph convolution computation is carried out on the subgraph

      • The loss function is obtained directly on it
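A rough sketch of the Cluster-GCN batching idea. The random partition here stands in for a real graph clustering such as METIS; the graph, sizes, and the single GCN layer are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)

n = 12
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T            # toy adjacency (assumed)
X = rng.normal(size=(n, 4))               # node features

# Stand-in partition into 4 blocks (a real system would use METIS)
blocks = np.array_split(rng.permutation(n), 4)

def gcn_layer(A_sub, X_sub, W):
    A_hat = A_sub + np.eye(len(A_sub))    # self-loops
    D_inv_sqrt = np.diag(A_hat.sum(axis=1) ** -0.5)
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X_sub @ W, 0.0)

W = rng.normal(size=(4, 4))
# Each step: pick a couple of blocks, form the induced subgraph,
# and run the full GCN computation (and loss) on that subgraph only
chosen = np.concatenate([blocks[i] for i in rng.choice(4, size=2, replace=False)])
H = gcn_layer(A[np.ix_(chosen, chosen)], X[chosen], W)
print(H.shape)
```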

How to address over-smoothing

Add a random-walk model so that information can propagate arbitrarily far

PageRank idea

  • Propagate information far while keeping the original information (see the formula below)
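One common way to realize this idea is personalized-PageRank-style propagation as in APPNP (a standard formula, added here for concreteness rather than taken from the original notes):

```latex
H^{(k+1)} = (1 - \alpha)\, \hat{A}\, H^{(k)} + \alpha\, H^{(0)}
```

The teleport probability α mixes the original representation H^{(0)} back in at every step, so information can spread far without washing out the starting features.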

Adding residual connections

  • The original approach

    • Following ResNet
  • Improved approach

    • A node's close neighbors have more influence on it

      • So give the earlier layers more weight

Jumping knowledge networks (JK-Net)

  • The output of each graph convolution layer is kept, and at the end they are all aggregated together

  • Aggregation methods

    • sum
    • mean
    • max
    • LSTM

DropEdge

  • An extension of dropout to graph neural networks
  • Randomly delete some edges of the adjacency matrix (see the sketch below)
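A minimal DropEdge sketch (the drop rate and the symmetric handling of undirected edges are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)

def drop_edge(A, p=0.2):
    """Randomly remove a fraction p of the undirected edges in A."""
    iu = np.triu_indices_from(A, k=1)
    keep = rng.random(len(iu[0])) >= p        # keep each edge with prob 1-p
    A_drop = np.zeros_like(A)
    A_drop[iu[0][keep], iu[1][keep]] = A[iu[0][keep], iu[1][keep]]
    return A_drop + A_drop.T                  # keep the matrix symmetric

A = np.ones((4, 4)) - np.eye(4)               # toy complete graph (assumed)
print(drop_edge(A))
```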


Reference link

https://www.cnblogs.com/siviltaram/p/graph_neural_network_2.html

Applications in biochemistry and medicine

Predicting molecular properties

Chemical reaction prediction

  • Given the molecular graphs of the reactants Gr, predict the corresponding product graph Gp after the chemical reaction

    • Gr contains many different molecules, which together form one disconnected graph
  • A dedicated graph neural network learns an embedded representation for each atom node

  • For each pair of atoms, predict the possible reaction score of the two atoms

  • Pick the K atom pairs with the highest scores, and enumerate the candidate compounds that obey chemical rules

  • Another neural network scores these candidate products and ranks them by probability

Graph generation model


Datasets

  • MUTAG dataset

    • Classifying whether compounds are mutagenic aromatic compounds
  • TOX21 dataset

    • Classifying different kinds of toxicity
  • NCI1 dataset

    • Classifying compounds by whether they inhibit cancer cell growth