Reverse thinking: making cartoon photos real
2022-06-11 04:55:00 【Shenlan Shenyan AI】
Some time ago, interesting applications built on PaddleGAN mushroomed, and many of them were "turn X into anime" projects. A common question back then was why everyone was working on animation; it is probably due to the influence of ACG (anime) culture, or to the application and commercial value of animated imagery.
An idea suddenly struck me: why has nobody done the reverse and turned anime into real people? I searched the existing projects, and it seems no one had built this. At first I thought it would be hard to realize, but after discussing it with more experienced developers, the principle turned out to be simple: just swap the labels in the cartoon dataset. Take cartoon portraits as an example: the original task is A to B (A is the real person, B is the cartoon, and B is the label). The cartoon-to-real-person task of this project is therefore B to A (A is the real person, B is the cartoon, and A is the label).
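The label swap above can be sketched with plain numpy. This is a hypothetical stand-in for one stitched training image (the 256x512 photo-plus-cartoon layout matches the dataset used later); the marker values 1 and 2 are made up purely for illustration:

```python
import numpy as np

# Hypothetical stand-in for one stitched training image: a 256x512 strip
# with the real photo in the left half and the cartoon in the right half.
pair = np.zeros((256, 512, 3), dtype=np.uint8)
pair[:, :256] = 1   # mark the real-photo half
pair[:, 256:] = 2   # mark the cartoon half

# photo2cartoon (A to B): input = real photo, label = cartoon
input_a2b, label_a2b = pair[:, :256], pair[:, 256:]

# this project (B to A): the same data with the two slices swapped
input_b2a, label_b2a = pair[:, 256:], pair[:, :256]
```

No new data is needed: the same stitched pairs are reused, only the roles of the two halves change.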
Results

Original real-person photo:

Generated result:

Original real-person photo:

You can see the results are already quite realistic!
Import packages
import paddle
import paddle.nn as nn
from paddle.io import Dataset, DataLoader
import os
import cv2
import numpy as np
from tqdm import tqdm
import matplotlib.pyplot as plt
%matplotlib inline
Prepare the data

Data preparation:

The real-face data comes from seeprettyface.

Preprocess the data (see the photo2cartoon project for details).

Use the photo2cartoon project to generate the cartoon image paired with each real-face photo.
# Decompress data
!unzip -q data/data79149/cartoon_A2B.zip -d data/
Data visualization
(the dataset has already been split into train and test sets)
# Training data statistics
train_names = os.listdir('data/cartoon_A2B/train')
print(f'Training set size: {len(train_names)}')

# Test data statistics
test_names = os.listdir('data/cartoon_A2B/test')
print(f'Test set size: {len(test_names)}')

# Visualize three random training samples (BGR -> RGB for matplotlib)
imgs = []
for img_name in np.random.choice(train_names, 3, replace=False):
    imgs.append(cv2.imread('data/cartoon_A2B/train/' + img_name))

img_show = np.vstack(imgs)[:, :, ::-1]
plt.figure(figsize=(10, 10))
plt.imshow(img_show)
plt.show()
Note:
A stands for the real person and B for the cartoon. The reference code trains A to B; this project trains B to A.
Because each sample in the dataset is a real photo and its cartoon stitched side by side, the image width is used to separate input from label: the reference program takes columns [:256] as the real person (input) and [256:] as the cartoon (label).
To implement this project, the two halves simply have to be swapped.
class PairedData(Dataset):
    def __init__(self, phase):
        super(PairedData, self).__init__()
        self.img_path_list = self.load_A2B_data(phase)    # list of image paths
        self.num_samples = len(self.img_path_list)        # number of samples

    def __getitem__(self, idx):
        img_A2B = cv2.imread(self.img_path_list[idx])     # read the stitched pair
        img_A2B = img_A2B.astype('float32') / 127.5 - 1.  # normalize to [-1, 1]
        img_A2B = img_A2B.transpose(2, 0, 1)              # HWC -> CHW
        img_A = img_A2B[..., 256:]                        # cartoon (input)
        img_B = img_A2B[..., :256]                        # real person (label)
        return img_A, img_B

    def __len__(self):
        return self.num_samples

    @staticmethod
    def load_A2B_data(phase):
        assert phase in ['train', 'test'], "phase should be set within ['train', 'test']"
        # Each image in the dataset contains a photo and its corresponding cartoon.
        data_path = 'data/cartoon_A2B/' + phase
        return [os.path.join(data_path, x) for x in os.listdir(data_path)]

paired_dataset_train = PairedData('train')
paired_dataset_test = PairedData('test')
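As a quick numpy-only check of the normalization in `__getitem__` (a sketch that does not need the dataset files): `x / 127.5 - 1` maps [0, 255] into [-1, 1], matching the Tanh output range of the generator defined below, and `x * 127.5 + 127.5`, used later at test time, inverts it exactly:

```python
import numpy as np

# Round-trip check of the normalization used by the dataset and the
# denormalization used at inference time.
img = np.random.randint(0, 256, (256, 512, 3)).astype('float32')
norm = img / 127.5 - 1.                 # [0, 255] -> [-1, 1]
assert norm.min() >= -1. and norm.max() <= 1.
restored = norm * 127.5 + 127.5         # [-1, 1] -> [0, 255]
assert np.allclose(restored, img)
```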
Define the generator
class UnetGenerator(nn.Layer):
    def __init__(self, input_nc=3, output_nc=3, ngf=64):
        super(UnetGenerator, self).__init__()

        self.down1 = nn.Conv2D(input_nc, ngf, kernel_size=4, stride=2, padding=1)
        self.down2 = Downsample(ngf, ngf*2)
        self.down3 = Downsample(ngf*2, ngf*4)
        self.down4 = Downsample(ngf*4, ngf*8)
        self.down5 = Downsample(ngf*8, ngf*8)
        self.down6 = Downsample(ngf*8, ngf*8)
        self.down7 = Downsample(ngf*8, ngf*8)

        self.center = Downsample(ngf*8, ngf*8)

        self.up7 = Upsample(ngf*8, ngf*8, use_dropout=True)
        self.up6 = Upsample(ngf*8*2, ngf*8, use_dropout=True)
        self.up5 = Upsample(ngf*8*2, ngf*8, use_dropout=True)
        self.up4 = Upsample(ngf*8*2, ngf*8)
        self.up3 = Upsample(ngf*8*2, ngf*4)
        self.up2 = Upsample(ngf*4*2, ngf*2)
        self.up1 = Upsample(ngf*2*2, ngf)

        self.output_block = nn.Sequential(
            nn.ReLU(),
            nn.Conv2DTranspose(ngf*2, output_nc, kernel_size=4, stride=2, padding=1),
            nn.Tanh()
        )

    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(d1)
        d3 = self.down3(d2)
        d4 = self.down4(d3)
        d5 = self.down5(d4)
        d6 = self.down6(d5)
        d7 = self.down7(d6)

        c = self.center(d7)

        x = self.up7(c, d7)
        x = self.up6(x, d6)
        x = self.up5(x, d5)
        x = self.up4(x, d4)
        x = self.up3(x, d3)
        x = self.up2(x, d2)
        x = self.up1(x, d1)

        x = self.output_block(x)
        return x
class Downsample(nn.Layer):
    # LeakyReLU => conv => batch norm
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1):
        super(Downsample, self).__init__()

        self.layers = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2D(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim)
        )

    def forward(self, x):
        x = self.layers(x)
        return x


class Upsample(nn.Layer):
    # ReLU => deconv => batch norm => dropout
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1, use_dropout=False):
        super(Upsample, self).__init__()

        sequence = [
            nn.ReLU(),
            nn.Conv2DTranspose(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim)
        ]
        if use_dropout:
            sequence.append(nn.Dropout(p=0.5))

        self.layers = nn.Sequential(*sequence)

    def forward(self, x, skip):
        x = self.layers(x)
        x = paddle.concat([x, skip], axis=1)  # concatenate the skip connection
        return x
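A quick arithmetic check of the shapes these blocks produce inside UnetGenerator: down1 through down7 plus center are eight stride-2 convolutions, so a 256x256 input shrinks to a 1x1 bottleneck, and the eight upsampling steps (up7 through up1 plus the output block) restore 256x256. A sketch of the encoder path:

```python
# Trace the encoder's spatial sizes: eight stride-2 convs halve 256 eight times.
size = 256
sizes = []
for _ in range(8):          # down1..down7 + center
    size //= 2
    sizes.append(size)
print(sizes)  # [128, 64, 32, 16, 8, 4, 2, 1]
```

The 1x1 bottleneck is why the skip connections matter: without them, all spatial detail would have to squeeze through that single pixel.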
Define the discriminator
class NLayerDiscriminator(nn.Layer):
    def __init__(self, input_nc=6, ndf=64):
        super(NLayerDiscriminator, self).__init__()

        self.layers = nn.Sequential(
            nn.Conv2D(input_nc, ndf, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            ConvBlock(ndf, ndf*2),
            ConvBlock(ndf*2, ndf*4),
            ConvBlock(ndf*4, ndf*8, stride=1),
            nn.Conv2D(ndf*8, 1, kernel_size=4, stride=1, padding=1),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.layers(input)


class ConvBlock(nn.Layer):
    # conv => batch norm => LeakyReLU
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1):
        super(ConvBlock, self).__init__()

        self.layers = nn.Sequential(
            nn.Conv2D(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim),
            nn.LeakyReLU(0.2)
        )

    def forward(self, x):
        x = self.layers(x)
        return x
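The discriminator above follows the pix2pix PatchGAN design: rather than a single real/fake score, it outputs a grid in which each element scores one overlapping patch of the 256x256 input. A small sketch of the output-size arithmetic (kernel 4 and padding 1 throughout; strides 2, 2, 2, 1, 1 for the five convolutions):

```python
def conv_out(w, k=4, s=2, p=1):
    # Standard convolution output-size formula.
    return (w + 2 * p - k) // s + 1

w = 256
for s in (2, 2, 2, 1, 1):   # strides of the five conv layers above
    w = conv_out(w, s=s)
print(w)  # 30 -> the discriminator outputs a 30x30 patch grid
```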
Instantiate the generator and discriminator

generator = UnetGenerator()
discriminator = NLayerDiscriminator()

out = generator(paddle.ones([1, 3, 256, 256]))
print('Generator output shape:', out.shape)

out = discriminator(paddle.ones([1, 6, 256, 256]))
print('Discriminator output shape:', out.shape)
Define the training parameters

# Hyperparameters
LR = 1e-4
BATCH_SIZE = 8
EPOCHS = 100

# Optimizers
optimizerG = paddle.optimizer.Adam(
    learning_rate=LR,
    parameters=generator.parameters(),
    beta1=0.5,
    beta2=0.999)

optimizerD = paddle.optimizer.Adam(
    learning_rate=LR,
    parameters=discriminator.parameters(),
    beta1=0.5,
    beta2=0.999)

# Loss functions
bce_loss = nn.BCELoss()
l1_loss = nn.L1Loss()

# Dataloaders
data_loader_train = DataLoader(
    paired_dataset_train,
    batch_size=BATCH_SIZE,
    shuffle=True,
    drop_last=True
)
data_loader_test = DataLoader(
    paired_dataset_test,
    batch_size=BATCH_SIZE
)
Training results

The first column is the cartoon (input), the second column is the real-person photo (label), and the third column is the model's output.

At the start of training:

After 100 epochs:

As you can see, the model already achieves good results.
results_save_path = 'work/results'
os.makedirs(results_save_path, exist_ok=True)  # save each epoch's test result here

weights_save_path = 'work/weights'
os.makedirs(weights_save_path, exist_ok=True)  # save model weights here

for epoch in range(EPOCHS):
    for data in tqdm(data_loader_train):
        real_A, real_B = data

        optimizerD.clear_grad()
        # D(real)
        real_AB = paddle.concat((real_A, real_B), 1)
        d_real_predict = discriminator(real_AB)
        d_real_loss = bce_loss(d_real_predict, paddle.ones_like(d_real_predict))

        # D(fake)
        fake_B = generator(real_A).detach()
        fake_AB = paddle.concat((real_A, fake_B), 1)
        d_fake_predict = discriminator(fake_AB)
        d_fake_loss = bce_loss(d_fake_predict, paddle.zeros_like(d_fake_predict))

        # train D
        d_loss = (d_real_loss + d_fake_loss) / 2.
        d_loss.backward()
        optimizerD.step()

        optimizerG.clear_grad()
        # D(fake)
        fake_B = generator(real_A)
        fake_AB = paddle.concat((real_A, fake_B), 1)
        g_fake_predict = discriminator(fake_AB)
        g_bce_loss = bce_loss(g_fake_predict, paddle.ones_like(g_fake_predict))
        g_l1_loss = l1_loss(fake_B, real_B) * 100.
        g_loss = g_bce_loss + g_l1_loss

        # train G
        g_loss.backward()
        optimizerG.step()

    print(f'Epoch [{epoch+1}/{EPOCHS}] Loss D: {d_loss.numpy()}, Loss G: {g_loss.numpy()}')

    if (epoch+1) % 10 == 0:
        paddle.save(generator.state_dict(), os.path.join(weights_save_path, 'epoch'+str(epoch+1).zfill(3)+'.pdparams'))

    # test
    generator.eval()
    with paddle.no_grad():
        for data in data_loader_test:
            real_A, real_B = data
            break
        fake_B = generator(real_A)
    result = paddle.concat([real_A[:3], real_B[:3], fake_B[:3]], 3)
    result = result.detach().numpy().transpose(0, 2, 3, 1)
    result = np.vstack(result)
    result = (result * 127.5 + 127.5).astype(np.uint8)
    cv2.imwrite(os.path.join(results_save_path, 'epoch'+str(epoch+1).zfill(3)+'.png'), result)
    generator.train()
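As a hand-worked toy version of the generator objective computed in the loop above (numpy standing in for the paddle losses; the score and pixel values here are made up purely for illustration): the BCE term pushes the discriminator's scores on fakes toward 1, while the L1 term, weighted by 100 as in pix2pix, keeps the output close to the label.

```python
import numpy as np

# Toy discriminator scores on two fake patches, and toy pixel values.
d_pred = np.array([0.8, 0.6])
g_bce = -np.mean(np.log(d_pred))            # BCELoss with target = all ones
fake = np.array([0.10, -0.20])
real = np.array([0.00, 0.00])
g_l1 = np.mean(np.abs(fake - real)) * 100.  # L1 term weighted by 100
g_loss = g_bce + g_l1
```

With this weighting the L1 term dominates early in training, which is what drives the output to stay structurally close to the label while the adversarial term sharpens texture.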
Test
# Load weights into the generator
last_weights_path = os.path.join(weights_save_path, sorted(os.listdir(weights_save_path))[-1])
print('Loading weights:', last_weights_path)

model_state_dict = paddle.load(last_weights_path)
generator.load_dict(model_state_dict)
generator.eval()

# Read the data
test_names = os.listdir('data/cartoon_A2B/test')
# img_name = np.random.choice(test_names)
img_name = '01481.png'
img_A2B = cv2.imread('data/cartoon_A2B/test/' + img_name)
img_A = img_A2B[:, 256:]  # cartoon (input)
img_B = img_A2B[:, :256]  # real person (ground truth)
# img_A = cv2.imread('data/test4.png')
# img_A = img_A[:, 256:]

g_input = img_A.astype('float32') / 127.5 - 1             # normalize
g_input = g_input[np.newaxis, ...].transpose(0, 3, 1, 2)  # NHWC -> NCHW
g_input = paddle.to_tensor(g_input)                       # numpy -> tensor

g_output = generator(g_input)
g_output = g_output.detach().numpy()                      # tensor -> numpy
g_output = g_output.transpose(0, 2, 3, 1)[0]              # NCHW -> NHWC
g_output = g_output * 127.5 + 127.5                       # denormalize
g_output = g_output.astype(np.uint8)

img_show = np.hstack([img_A, g_output])[:, :, ::-1]
plt.figure(figsize=(8, 8))
plt.imshow(img_show)
plt.show()
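The transposes in the inference code are easy to get backwards; a quick numpy round-trip check of the NHWC/NCHW conversions used above:

```python
import numpy as np

# Single HWC image -> 4-D NCHW batch and back, as in the inference code.
img = np.random.rand(256, 256, 3).astype('float32')
nchw = img[np.newaxis, ...].transpose(0, 3, 1, 2)  # NHWC -> NCHW
assert nchw.shape == (1, 3, 256, 256)
nhwc = nchw.transpose(0, 2, 3, 1)[0]               # NCHW -> HWC
assert np.allclose(nhwc, img)
```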
That completes the cartoon-to-real-person project. Most of it builds on the reference project, with only a few changes.
Author: Fast implementation of AI ideas
| About Shenyan Technology |
Shenyan Technology, founded in January 2018, is a Zhongguancun high-tech enterprise and an AI service provider built on world-leading artificial intelligence technology. Based on its core technologies in computer vision, natural language processing, and data mining, the company has launched four platform products: the Shenyan intelligent data annotation platform, the Shenyan AI development platform, the Shenyan automated machine learning platform, and the Shenyan AI open platform, providing enterprises with one-stop AI platform services covering data processing, model building and training, privacy computing, and industry algorithms and solutions.