
[Code] Generative Neurosymbolic Machines

2022-06-10 18:50:00 User 1908973

https://github.com/JindongJiang/GNM

Abstract

Reconciling symbolic and distributed representations is a crucial challenge that could potentially resolve the limitations of current deep learning. Remarkable advances in this direction have recently been achieved via generative object-centric representation models. While these models learn a recognition model that infers object-centric symbolic representations, such as bounding boxes, from raw images in an unsupervised way, no such model provides another important ability of generative models: generating (sampling) according to the structure of the learned world density. In this paper, we propose Generative Neurosymbolic Machines, a generative model that combines the benefits of distributed and symbolic representations to support both structured representations of symbolic components and density-based generation. These two crucial properties are achieved by a two-layer latent hierarchy: a global distributed latent for flexible density modeling and a structured symbolic latent map. To increase the flexibility of the model within this hierarchy, we also propose the StructDRAW prior. Experiments show that the proposed model significantly outperforms previous structured representation models as well as state-of-the-art unstructured generative models in terms of both structure accuracy and image generation quality. Our code, datasets, and trained models are available at https://github.com/JindongJiang/GNM

Introduction

Two core abilities of human and machine intelligence are learning abstract representations of the world and generating imagination in a way that reflects the causal structure of the world. Deep latent variable models such as variational autoencoders (VAEs) [31, 39] provide an elegant probabilistic framework for learning both abilities in an unsupervised, end-to-end trainable manner. However, the monolithic distributed vector representation used in most VAEs provides only the weak or implicit structure induced in practice by an independence prior. Consequently, when representing complex, high-dimensional, structured observations, such as scene images containing various objects, this representation struggles to express useful structural properties such as modularity, compositionality, and interpretability. These properties, however, are considered key to resolving the current limitations of deep learning in various System 2 abilities [29] such as reasoning [6], causal learning [40, 37], accountability [13], and systematic out-of-distribution generalization [3, 46]. Remarkable advances have recently been made on this challenge by learning to represent an observation as a composition of its entity representations, particularly in the modality of object-centric scene images [15, 32, 18, 45, 8, 17, 14, 12, 33, 11, 26, 48]. Equipped with more explicit inductive biases, such as the spatial locality of objects, symbolic representation, and compositional scene modeling, these models provide a way to recognize and generate, by composition, a given observation based on the representations of its interacting entities. However, most of these models do not support another key ability of generative models: generating hypothetical observations by learning the density of the observed data. This ability to imagine according to the density of possible worlds plays a crucial role, for example, in the world models required for planning and model-based reinforcement learning

[22, 21, 1, 36, 24, 38, 23]. Yet most previous entity-based models can only synthesize artificial images by manually configuring their representations, not by sampling from the underlying density of the observations. While VAEs do support this ability [31, 19], their representations lack explicit compositional structure, and in practice they tend to lose global structural consistency when generating complex images [44, 19]. In this paper, we propose Generative Neurosymbolic Machines (GNM), a probabilistic generative model that combines the best of both worlds by supporting both symbolic, entity-based representations and distributed representations. The model can therefore express an observation in terms of symbolic components while also generating observations according to the underlying density. We achieve these two key properties in GNM through a two-layer latent hierarchy: the top layer generates a global distributed latent representation for flexible density modeling, and the bottom layer generates, from the global latent, a structured map of entity-based symbolic latent representations. In addition, we propose StructDRAW, an autoregressive prior operating on structured feature maps, to improve the expressiveness of the structured latent map. In experiments, we find that the proposed model significantly outperforms previous structured representation models as well as highly expressive unstructured generative models in terms of both structure accuracy and image generation quality.
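To make the two-layer hierarchy concrete, below is a minimal, hypothetical PyTorch sketch of the generative path: a global distributed latent is sampled at the top, a simplified DRAW-style autoregressive loop refines a structured latent map conditioned on that global latent, and a decoder renders the map to pixels. All names, dimensions, and the refinement loop itself are illustrative assumptions, not the actual code from the GNM repository linked above.

import torch
import torch.nn as nn

class GNMSketch(nn.Module):
    """Illustrative two-layer hierarchy: global latent -> structured map -> image."""

    def __init__(self, z_g_dim=64, map_ch=32, map_hw=4, draw_steps=4):
        super().__init__()
        self.z_g_dim, self.map_ch, self.map_hw = z_g_dim, map_ch, map_hw
        self.draw_steps = draw_steps
        map_size = map_ch * map_hw * map_hw
        # StructDRAW-like autoregressive prior (simplified): each step refines
        # the structured map conditioned on the global latent and the map so far.
        self.step_rnn = nn.GRUCell(z_g_dim + map_size, 256)
        self.to_map_params = nn.Linear(256, 2 * map_size)
        # Placeholder decoder from the structured latent map to pixels.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(map_ch, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Sigmoid(),
        )

    @torch.no_grad()
    def sample(self, batch=1):
        # Top layer: global distributed latent, drawn from a standard
        # Gaussian prior, enabling density-based generation.
        z_g = torch.randn(batch, self.z_g_dim)
        h = torch.zeros(batch, 256)
        z_map = torch.zeros(batch, self.map_ch * self.map_hw ** 2)
        # Bottom layer: build the structured symbolic latent map step by step.
        for _ in range(self.draw_steps):
            h = self.step_rnn(torch.cat([z_g, z_map], dim=-1), h)
            mu, logvar = self.to_map_params(h).chunk(2, dim=-1)
            z_map = z_map + mu + (0.5 * logvar).exp() * torch.randn_like(mu)
        z_map = z_map.view(batch, self.map_ch, self.map_hw, self.map_hw)
        return self.decoder(z_map)

imgs = GNMSketch().sample(batch=2)  # -> tensor of shape (2, 3, 16, 16)

The design point mirrored in this sketch is that the structured map is generated from, rather than independently of, the global latent, which is what allows sampling from the top-level prior to yield globally coherent structured scenes.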

Please refer to the original text for more information .

Original site

Copyright notice
This article was created by [User 1908973]. When reposting, please include a link to the original. Thank you.
https://yzsam.com/2022/161/202206101749569413.html