Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting: paper notes
2022-07-24 05:00:00 【Magic__ Conch】
IEEE Conference Proceedings arXiv: Computer Vision and Pattern Recognition Jan 2019
Problems addressed and improvements
Existing methods cannot combine direct visual information with deep semantic information.
- Patch-search methods lack an understanding of high-level semantic consistency.
- Generative models built from stacked convolutions and poolings produce over-smooth results that lack visual realism.
Model structure
With U-Net as the backbone, the missing region is filled at both the image level and the feature level.
Pyramid-context encoder: uses a cross-layer attention transfer mechanism together with pyramid filling.

The reconstructed feature map $\psi$ at each level is obtained by passing that level's feature map $\phi$ and the $\psi$ of the level above through the ATN (written $f$ in the formula).
Attention Transfer Network (ATN) (the $f$ above)
1. Use the feature map reconstructed from high-level semantics, $\psi^L$, to fill the next-lower feature map $\phi^{L-1}$, which gives the reconstructed feature map of that layer, $\psi^{L-1}$.
First extract patches from $\psi^l$, then compute the cosine similarity between patches.

Then apply a softmax to the similarities to get the attention score of each patch.

Once the attention scores of the high-level semantic features are available (the $\alpha_{i,j}^l$ in the formula above), the missing region of the next-lower feature map can be filled with context weighted by those scores.

After all patches have been computed, we obtain $\psi^{l-1}$ (all of the computations above can be formulated as convolutions, which allows end-to-end training).
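The three steps above (cosine similarity, softmax, weighted fill) can be sketched in NumPy as follows. This is only a minimal illustration under simplifying assumptions that are not from the notes: each spatial location of the higher-level map is treated as a 1x1 patch, the lower-level map is exactly `scale` times larger, attention is computed from missing locations to known locations only, and the function and argument names are hypothetical.

```python
import numpy as np

def attention_transfer(psi_hi, phi_lo, mask_hi, scale=2):
    """Fill the missing region of phi_lo using attention computed on psi_hi.

    psi_hi : (C_hi, H, W) higher-level reconstructed feature map
    phi_lo : (C_lo, H*scale, W*scale) next-lower encoder feature map
    mask_hi: (H, W) boolean array, True where the region is missing
    Assumption: 1x1 patches on psi_hi, matching scale x scale blocks on phi_lo.
    """
    c_hi, h, w = psi_hi.shape
    c_lo = phi_lo.shape[0]

    # One patch per spatial location, normalized to unit length.
    patches_hi = psi_hi.reshape(c_hi, h * w).T                      # (H*W, C_hi)
    unit = patches_hi / (np.linalg.norm(patches_hi, axis=1, keepdims=True) + 1e-8)

    # The scale x scale block of phi_lo that sits under each location of psi_hi.
    blocks_lo = (phi_lo.reshape(c_lo, h, scale, w, scale)
                 .transpose(1, 3, 0, 2, 4)
                 .reshape(h * w, -1))                               # (H*W, C_lo*scale*scale)

    missing = mask_hi.reshape(-1)
    known = ~missing

    # Cosine similarity between missing patches i and known patches j.
    sim = unit[missing] @ unit[known].T

    # Softmax over j gives the attention scores alpha_ij.
    sim -= sim.max(axis=1, keepdims=True)
    alpha = np.exp(sim)
    alpha /= alpha.sum(axis=1, keepdims=True)

    # Fill each missing block with the attention-weighted sum of known blocks.
    filled = blocks_lo.copy()
    filled[missing] = alpha @ blocks_lo[known]

    psi_lo = (filled.reshape(h, w, c_lo, scale, scale)
              .transpose(2, 0, 3, 1, 4)
              .reshape(c_lo, h * scale, w * scale))
    return psi_lo, alpha
```

Because the attention weights come from the higher-level map but are applied to the lower-level map, known locations of `phi_lo` pass through unchanged and only the missing blocks are rewritten.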
2. Refinement
Multi-scale context information is aggregated by four groups of dilated convolutions with different rates. This design keeps the structure of the final reconstructed feature consistent with its context, which improves the inpainting results at test time.
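To make the idea of aggregating context at several dilation rates concrete, here is a small single-channel NumPy sketch. The rates `(1, 2, 4, 8)`, the 3x3 kernel size, and aggregation by stacking are assumptions for illustration; the notes only say that four groups with different rates are used.

```python
import numpy as np

def dilated_conv2d(x, kernel, rate):
    """Zero-padded, size-preserving dilated convolution of a map x (H, W)
    with a k x k kernel (k odd), in the cross-correlation form used by
    deep-learning libraries."""
    k = kernel.shape[0]
    pad = rate * (k // 2)
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros_like(x, dtype=float)
    for a in range(k):
        for b in range(k):
            # Tap (a, b) reads the input shifted by (a - k//2, b - k//2) * rate.
            out += kernel[a, b] * xp[a * rate:a * rate + h, b * rate:b * rate + w]
    return out

def multi_scale_context(x, kernels, rates=(1, 2, 4, 8)):
    """Apply one kernel per dilation rate and stack the responses.
    The rates are an assumed example, not values taken from the notes."""
    return np.stack([dilated_conv2d(x, k, r) for k, r in zip(kernels, rates)])
```

A larger rate spreads the same nine taps over a wider window, so each group sees context at a different scale without adding parameters.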
multi-scale decoder
- The multi-scale decoder takes the features reconstructed by the ATN and the encoder's latent feature as input.
- The decoder feature maps $\varphi^{L-1}$, $\varphi^{L-2}$, etc. are computed by the following formula.

Here, the reconstructed features generated by the ATN encode lower-level information for the missing region, which helps produce visually realistic results with fine-grained details. The compact latent features extracted by convolution can synthesize novel objects when no suitable object can be found outside the missing region.
Semantic consistency relies on the deep convolutional features, while texture consistency relies on the shallow features reconstructed by the ATN.
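One decoder step can be sketched as follows. The decoder formula itself is not reproduced in these notes, so this is one plausible reading rather than the paper's exact layer: the deeper feature is upsampled, concatenated channel-wise with the ATN-reconstructed feature of the same resolution, and fused by a learned layer. Nearest-neighbor upsampling and a 1x1 linear map with ReLU stand in for the learned upsampling layer, and all names are hypothetical.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map
    (a stand-in for a learned upsampling layer)."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def decoder_step(psi, deeper, weight):
    """Fuse an ATN-reconstructed feature with the upsampled deeper feature.

    psi   : (C_p, 2H, 2W) ATN-reconstructed feature at this level
    deeper: (C_d, H, W) decoder (or latent) feature from the level above
    weight: (C_out, C_p + C_d) weights of an assumed 1x1 fusion layer
    """
    fused = np.concatenate([psi, upsample2x(deeper)], axis=0)
    out = np.einsum("oc,chw->ohw", weight, fused)
    return np.maximum(out, 0.0)  # ReLU
```

Under this reading, the shallow `psi` input carries texture detail while the `deeper` input carries semantics, which matches the division of roles described above.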
- Pyramid L1 losses

Adversarial training loss
The overall loss has two parts: a generator loss and a discriminator loss.
- PatchGAN (from Image-to-Image Translation with Conditional Adversarial Networks) is used as the discriminator in this paper, and spectral normalization is used to stabilize training.
- In this paper, the pyramid-context encoder and the multi-scale decoder together form the generator.
Definition of the loss functions:
Define the generator's final prediction $z$:
$$z = G(x \odot (1 - M), M) \odot M + x \odot (1 - M)$$
The discriminator's adversarial loss can be expressed as:

The generator's adversarial loss is:

PEN-Net is optimized by minimizing the adversarial loss and the pyramid L1 loss (given at the end of the previous section). The overall objective function is:

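The composition of $z$ and the adversarial terms can be sketched in NumPy as below. The loss formulas are not reproduced in these notes, so the hinge form used here is an assumption (it is a common pairing with spectral normalization), and the weights `lambda_adv` and `lambda_pyr` are hypothetical parameters with no values taken from the source. The mask is 1 inside the missing region, following the definition of $z$.

```python
import numpy as np

def composite(generated, x, mask):
    """z = G(...) * M + x * (1 - M): keep generated pixels only inside the
    missing region (mask == 1) and copy the input image everywhere else."""
    return generated * mask + x * (1.0 - mask)

def hinge_d_loss(d_real, d_fake):
    """Discriminator loss in hinge form (an assumed form, see lead-in)."""
    return (np.mean(np.maximum(0.0, 1.0 - d_real))
            + np.mean(np.maximum(0.0, 1.0 + d_fake)))

def hinge_g_loss(d_fake):
    """Generator adversarial loss in hinge form (assumed)."""
    return -np.mean(d_fake)

def generator_objective(adv_loss, pyramid_loss, lambda_adv, lambda_pyr):
    """Weighted sum of the two generator terms; the weights are hypothetical."""
    return lambda_adv * adv_loss + lambda_pyr * pyramid_loss
```

Here `d_real` and `d_fake` stand for the patch-wise discriminator outputs on real images and on composited predictions `z`.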
Model analysis
This section analyzes the role of two network components: the pyramid L1 loss and the ATN.
Pyramid L1 Loss
The pyramid L1 loss progressively refines the prediction at each scale, which helps the compact features to be decoded layer by layer.
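A per-scale L1 loss of this kind can be sketched as follows. The assumptions here are not from the notes: each decoder level is taken to produce an image-space prediction at its own resolution, the ground truth is resized by average pooling, and each level's term is a mean absolute error (rather than a sum) so that levels are comparable. All function names are hypothetical.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a (C, H, W) image by an integer factor."""
    c, h, w = img.shape
    return img.reshape(c, h // factor, factor, w // factor, factor).mean(axis=(2, 4))

def pyramid_l1_loss(predictions, target):
    """Sum of L1 errors over all scales.

    predictions: list of (C, H_l, W_l) images, one per decoder level
    target     : (C, H, W) full-resolution ground truth; H must be an
                 integer multiple of every H_l
    """
    loss = 0.0
    for pred in predictions:
        factor = target.shape[1] // pred.shape[1]
        loss += np.abs(downsample(target, factor) - pred).mean()
    return loss
```

Supervising every level pushes the coarse predictions toward the right structure first, and the finer levels then only need to add detail.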
ATN
The cross-layer attention transfer mechanism improves on the U-Net backbone.
The first row is a plain U-Net without any attention mechanism, the second row is the CA method without guidance from deeper layers, and the third row is the result of applying the ATN to the U-Net architecture.