[Paper reading] Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
2022-07-25 20:24:00 【xiongxyowo】
[Paper] [Code] [ICCV 17]
Abstract
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data is not available. We present an approach for learning to translate an image from a source domain $X$ to a target domain $Y$ in the absence of paired examples. Our goal is to learn a mapping $G: X \to Y$ such that the distribution of images from $G(X)$ is indistinguishable from the distribution $Y$ using an adversarial loss. Because this mapping is highly under-constrained, we couple it with an inverse mapping $F: Y \to X$ and introduce a cycle consistency loss to enforce $F(G(X)) \approx X$ (and vice versa). Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, and so on. Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
Method
This paper is the well-known CycleGAN. The core idea of the method is as follows:
The model consists of two generators $(G, F)$ and two discriminators $(D_X, D_Y)$. An input source-domain image $X$ is fed to the first generator $G$, which produces a fake target-domain image $G(X)$. The discriminator $D_Y$ has to distinguish real target-domain images $Y$ from fake ones $G(X)$, which pushes the style of the generated $G(X)$ to be more realistic. Symmetrically, with the roles of the two domains swapped, a target-domain image $Y$ is fed to the generator $F$, which produces a fake source-domain image $F(Y)$. The discriminator $D_X$ has to distinguish real source-domain images $X$ from fake ones $F(Y)$, which pushes the style of the generated $F(Y)$ to be more realistic.
The advantage of this design is as follows. The style-translation task in this paper is "unsupervised": there are no matched "source domain - target domain" image pairs, so the adversarial loss can only constrain whether a generated image matches the new style. It cannot constrain whether the generated image keeps the same content. With the cycle structure, an input image first becomes $G(X)$ and then $F(G(X))$. Constraining $F(G(X))$ to be as similar to $X$ as possible forces the network to preserve as much detail as it can while learning to change the style, which amounts to a form of "self-supervision".
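The argument above can be illustrated with a toy sketch. It assumes hypothetical one-dimensional "images" (lists of floats) and hand-written mappings instead of trained networks; the names `G_shift`, `F_shift`, and `G_collapse` are invented for this example. `G_collapse` ignores its input and always returns the same target-style image, a failure that a style-only constraint could not detect, while the reconstruction error $\Vert F(G(x)) - x \Vert_1$ exposes it.

```python
from typing import Callable, List

Image = List[float]                  # toy stand-in for an image
Mapping = Callable[[Image], Image]   # toy stand-in for a generator


def l1(a: Image, b: Image) -> float:
    """L1 norm of the difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b))


def G_shift(x: Image) -> Image:
    """Hypothetical X -> Y mapping that changes 'style' but keeps content."""
    return [p + 0.25 for p in x]


def F_shift(y: Image) -> Image:
    """Hypothetical Y -> X mapping, the exact inverse of G_shift."""
    return [p - 0.25 for p in y]


def G_collapse(x: Image) -> Image:
    """Hypothetical X -> Y mapping that discards the input content."""
    return [0.75, 0.75, 0.75, 0.75]


def cycle_error(x: Image, G: Mapping, F: Mapping) -> float:
    """Reconstruction error of the forward cycle x -> G(x) -> F(G(x))."""
    return l1(F(G(x)), x)


x = [0.0, 0.25, 0.5, 0.25]
preserving_error = cycle_error(x, G_shift, F_shift)     # 0.0
collapsed_error = cycle_error(x, G_collapse, F_shift)   # 1.0
print(preserving_error, collapsed_error)
```

The content-preserving pair reconstructs the input exactly, whereas the collapsed generator cannot, so minimizing the cycle error steers training away from it.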
The loss function consists of two parts. The first is the adversarial loss, which constrains the image style so that the translation is completed:
$$\mathcal{L}_{\text{GAN}}(G, D_Y, X, Y) = \mathbb{E}_{y \sim p_{\text{data}}(y)}[\log D_Y(y)] + \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log(1 - D_Y(G(x)))]$$
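A minimal sketch of how the adversarial loss could be estimated from a batch of discriminator outputs, in plain Python. The function name `adversarial_loss` and the example probabilities are illustrative assumptions, not taken from the paper's code. $D_Y$ tries to maximize this value and $G$ tries to minimize it.

```python
import math
from typing import List


def adversarial_loss(d_real: List[float], d_fake: List[float]) -> float:
    """Batch estimate of L_GAN(G, D_Y, X, Y).

    d_real holds D_Y(y) for real target images y,
    d_fake holds D_Y(G(x)) for translated images G(x).
    Every entry is assumed to be a probability strictly between 0 and 1.
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# Hypothetical discriminator outputs: first a discriminator that separates
# real from fake well, then one that has been fooled (always outputs 0.5).
confident = adversarial_loss(d_real=[0.9, 0.9], d_fake=[0.1, 0.1])
fooled = adversarial_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident > fooled)
```

The value is higher when the discriminator separates real from fake images, and it drops to $2\log 0.5$ once the discriminator can no longer tell them apart.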
The adversarial loss is required for any style-translation task, so there is little more to say about it. The second part is the cycle consistency loss, which constrains the content to remain consistent:
$$\mathcal{L}_{\text{cyc}}(G, F) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\Vert F(G(x)) - x \Vert_1] + \mathbb{E}_{y \sim p_{\text{data}}(y)}[\Vert G(F(y)) - y \Vert_1]$$
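A minimal sketch of the cycle consistency loss over a batch, again with plain Python lists standing in for images. The mappings `double`, `halve`, and `halve_biased` are hypothetical placeholders for the generators. The helper `full_objective` reflects how the CycleGAN paper combines the terms, as two adversarial losses (one per direction) plus $\lambda \mathcal{L}_{\text{cyc}}$; the weight `lam` defaults to 10, the value the paper reports, and should be treated as a tunable assumption here.

```python
from typing import Callable, List, Sequence

Image = List[float]                  # toy stand-in for an image
Mapping = Callable[[Image], Image]   # toy stand-in for a generator


def l1(a: Image, b: Image) -> float:
    """L1 norm of the difference between two equally sized images."""
    return sum(abs(p - q) for p, q in zip(a, b))


def cycle_loss(xs: Sequence[Image], ys: Sequence[Image],
               G: Mapping, F: Mapping) -> float:
    """Batch estimate of L_cyc(G, F): forward plus backward cycle error."""
    forward = sum(l1(F(G(x)), x) for x in xs) / len(xs)
    backward = sum(l1(G(F(y)), y) for y in ys) / len(ys)
    return forward + backward


def full_objective(gan_xy: float, gan_yx: float, cyc: float,
                   lam: float = 10.0) -> float:
    """Combine both adversarial terms with the weighted cycle term."""
    return gan_xy + gan_yx + lam * cyc


def double(img: Image) -> Image:        # hypothetical G: X -> Y
    return [2.0 * p for p in img]


def halve(img: Image) -> Image:         # hypothetical F, exact inverse of G
    return [p / 2.0 for p in img]


def halve_biased(img: Image) -> Image:  # hypothetical F with a constant error
    return [p / 2.0 + 0.5 for p in img]


xs = [[1.0, 2.0]]
ys = [[4.0, 6.0]]
perfect = cycle_loss(xs, ys, double, halve)         # 0.0
biased = cycle_loss(xs, ys, double, halve_biased)   # 3.0
print(perfect, biased)
```

When $F$ inverts $G$ exactly, both cycle terms vanish; any mismatch between the two mappings is penalized in both directions.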
For this kind of "unsupervised" image style translation, the upper bound on quality is the "supervised" form, such as Pix2Pix. One of the main problems of CycleGAN is that it cannot handle geometric changes: the cycle consistency loss keeps the image content as unchanged as possible during translation to the target domain, so the model tends towards "cat => cat => cat" and struggles with "cat => dog => cat".