Reverse thinking: making cartoon photos real
2022-06-11 04:55:00 【Shenlan Shenyan AI】
Some time ago, interesting applications built on PaddleGAN mushroomed, and many of them were "turn X into anime" projects. A common question back then was why everyone was working on animation; it is probably due to the influence of ACG (anime) culture, or to the application and commercial value of animated imagery.
An idea suddenly struck me: why has nobody done the reverse and turned anime into real people? I searched the existing projects, and it seems no one had built this. At first I thought it would be hard to realize, but after discussing it with more experienced developers, the principle turned out to be simple: just swap the labels in the cartoon dataset. Take cartoon portraits as an example: the original task is A to B (A is the real person, B is the cartoon, and B is the label). The cartoon-to-real-person task of this project is therefore B to A (A is the real person, B is the cartoon, and A is the label).
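The label swap above can be sketched with plain numpy. This is a hypothetical stand-in for one stitched training image (the 256x512 photo-plus-cartoon layout matches the dataset used later); the marker values 1 and 2 are made up purely for illustration:

```python
import numpy as np

# Hypothetical stand-in for one stitched training image: a 256x512 strip
# with the real photo in the left half and the cartoon in the right half.
pair = np.zeros((256, 512, 3), dtype=np.uint8)
pair[:, :256] = 1   # mark the real-photo half
pair[:, 256:] = 2   # mark the cartoon half

# photo2cartoon (A to B): input = real photo, label = cartoon
input_a2b, label_a2b = pair[:, :256], pair[:, 256:]

# this project (B to A): the same data with the two slices swapped
input_b2a, label_b2a = pair[:, 256:], pair[:, :256]
```

No new data is needed: the same stitched pairs are reused, only the roles of the two halves change.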
Results

Original real-person photo:

Generated result:

Original real-person photo:

You can see the results are already quite realistic!
Import packages
import paddle
import paddle.nn as nn
from paddle.io import Dataset, DataLoader
import os
import cv2
import numpy as np
from tqdm import tqdm
import matplotlib.pyplot as plt
%matplotlib inline
Prepare the data

Data preparation:

The real-face data comes from seeprettyface.

Preprocess the data (see the photo2cartoon project for details).

Use the photo2cartoon project to generate the cartoon image paired with each real-face photo.
# Decompress data
!unzip -q data/data79149/cartoon_A2B.zip -d data/
Data visualization
(the dataset has already been split into train and test sets)
# Training data statistics
train_names = os.listdir('data/cartoon_A2B/train')
print(f'Training set size: {len(train_names)}')

# Test data statistics
test_names = os.listdir('data/cartoon_A2B/test')
print(f'Test set size: {len(test_names)}')

# Visualize three random training samples (BGR -> RGB for matplotlib)
imgs = []
for img_name in np.random.choice(train_names, 3, replace=False):
    imgs.append(cv2.imread('data/cartoon_A2B/train/' + img_name))

img_show = np.vstack(imgs)[:, :, ::-1]
plt.figure(figsize=(10, 10))
plt.imshow(img_show)
plt.show()
Note:
A stands for the real person and B for the cartoon. The reference code trains A to B; this project trains B to A.
Because each sample in the dataset is a real photo and its cartoon stitched side by side, the image width is used to separate input from label: the reference program takes columns [:256] as the real person (input) and [256:] as the cartoon (label).
To implement this project, the two halves simply have to be swapped.
class PairedData(Dataset):
    def __init__(self, phase):
        super(PairedData, self).__init__()
        self.img_path_list = self.load_A2B_data(phase)    # list of image paths
        self.num_samples = len(self.img_path_list)        # number of samples

    def __getitem__(self, idx):
        img_A2B = cv2.imread(self.img_path_list[idx])     # read the stitched pair
        img_A2B = img_A2B.astype('float32') / 127.5 - 1.  # normalize to [-1, 1]
        img_A2B = img_A2B.transpose(2, 0, 1)              # HWC -> CHW
        img_A = img_A2B[..., 256:]                        # cartoon (input)
        img_B = img_A2B[..., :256]                        # real person (label)
        return img_A, img_B

    def __len__(self):
        return self.num_samples

    @staticmethod
    def load_A2B_data(phase):
        assert phase in ['train', 'test'], "phase should be set within ['train', 'test']"
        # Each image in the dataset contains a photo and its corresponding cartoon.
        data_path = 'data/cartoon_A2B/' + phase
        return [os.path.join(data_path, x) for x in os.listdir(data_path)]

paired_dataset_train = PairedData('train')
paired_dataset_test = PairedData('test')
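As a quick numpy-only check of the normalization in `__getitem__` (a sketch that does not need the dataset files): `x / 127.5 - 1` maps [0, 255] into [-1, 1], matching the Tanh output range of the generator defined below, and `x * 127.5 + 127.5`, used later at test time, inverts it exactly:

```python
import numpy as np

# Round-trip check of the normalization used by the dataset and the
# denormalization used at inference time.
img = np.random.randint(0, 256, (256, 512, 3)).astype('float32')
norm = img / 127.5 - 1.                 # [0, 255] -> [-1, 1]
assert norm.min() >= -1. and norm.max() <= 1.
restored = norm * 127.5 + 127.5         # [-1, 1] -> [0, 255]
assert np.allclose(restored, img)
```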
Define the generator
class UnetGenerator(nn.Layer):
    def __init__(self, input_nc=3, output_nc=3, ngf=64):
        super(UnetGenerator, self).__init__()

        self.down1 = nn.Conv2D(input_nc, ngf, kernel_size=4, stride=2, padding=1)
        self.down2 = Downsample(ngf, ngf*2)
        self.down3 = Downsample(ngf*2, ngf*4)
        self.down4 = Downsample(ngf*4, ngf*8)
        self.down5 = Downsample(ngf*8, ngf*8)
        self.down6 = Downsample(ngf*8, ngf*8)
        self.down7 = Downsample(ngf*8, ngf*8)

        self.center = Downsample(ngf*8, ngf*8)

        self.up7 = Upsample(ngf*8, ngf*8, use_dropout=True)
        self.up6 = Upsample(ngf*8*2, ngf*8, use_dropout=True)
        self.up5 = Upsample(ngf*8*2, ngf*8, use_dropout=True)
        self.up4 = Upsample(ngf*8*2, ngf*8)
        self.up3 = Upsample(ngf*8*2, ngf*4)
        self.up2 = Upsample(ngf*4*2, ngf*2)
        self.up1 = Upsample(ngf*2*2, ngf)

        self.output_block = nn.Sequential(
            nn.ReLU(),
            nn.Conv2DTranspose(ngf*2, output_nc, kernel_size=4, stride=2, padding=1),
            nn.Tanh()
        )

    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(d1)
        d3 = self.down3(d2)
        d4 = self.down4(d3)
        d5 = self.down5(d4)
        d6 = self.down6(d5)
        d7 = self.down7(d6)

        c = self.center(d7)

        x = self.up7(c, d7)
        x = self.up6(x, d6)
        x = self.up5(x, d5)
        x = self.up4(x, d4)
        x = self.up3(x, d3)
        x = self.up2(x, d2)
        x = self.up1(x, d1)

        x = self.output_block(x)
        return x
class Downsample(nn.Layer):
    # LeakyReLU => conv => batch norm
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1):
        super(Downsample, self).__init__()

        self.layers = nn.Sequential(
            nn.LeakyReLU(0.2),
            nn.Conv2D(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim)
        )

    def forward(self, x):
        x = self.layers(x)
        return x


class Upsample(nn.Layer):
    # ReLU => deconv => batch norm => dropout
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1, use_dropout=False):
        super(Upsample, self).__init__()

        sequence = [
            nn.ReLU(),
            nn.Conv2DTranspose(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim)
        ]
        if use_dropout:
            sequence.append(nn.Dropout(p=0.5))

        self.layers = nn.Sequential(*sequence)

    def forward(self, x, skip):
        x = self.layers(x)
        x = paddle.concat([x, skip], axis=1)  # concatenate the skip connection
        return x
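A quick arithmetic check of the shapes these blocks produce inside UnetGenerator: down1 through down7 plus center are eight stride-2 convolutions, so a 256x256 input shrinks to a 1x1 bottleneck, and the eight upsampling steps (up7 through up1 plus the output block) restore 256x256. A sketch of the encoder path:

```python
# Trace the encoder's spatial sizes: eight stride-2 convs halve 256 eight times.
size = 256
sizes = []
for _ in range(8):          # down1..down7 + center
    size //= 2
    sizes.append(size)
print(sizes)  # [128, 64, 32, 16, 8, 4, 2, 1]
```

The 1x1 bottleneck is why the skip connections matter: without them, all spatial detail would have to squeeze through that single pixel.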
Define the discriminator
class NLayerDiscriminator(nn.Layer):
    def __init__(self, input_nc=6, ndf=64):
        super(NLayerDiscriminator, self).__init__()

        self.layers = nn.Sequential(
            nn.Conv2D(input_nc, ndf, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            ConvBlock(ndf, ndf*2),
            ConvBlock(ndf*2, ndf*4),
            ConvBlock(ndf*4, ndf*8, stride=1),
            nn.Conv2D(ndf*8, 1, kernel_size=4, stride=1, padding=1),
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.layers(input)


class ConvBlock(nn.Layer):
    # conv => batch norm => LeakyReLU
    def __init__(self, in_dim, out_dim, kernel_size=4, stride=2, padding=1):
        super(ConvBlock, self).__init__()

        self.layers = nn.Sequential(
            nn.Conv2D(in_dim, out_dim, kernel_size, stride, padding, bias_attr=False),
            nn.BatchNorm2D(out_dim),
            nn.LeakyReLU(0.2)
        )

    def forward(self, x):
        x = self.layers(x)
        return x
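The discriminator above follows the pix2pix PatchGAN design: rather than a single real/fake score, it outputs a grid in which each element scores one overlapping patch of the 256x256 input. A small sketch of the output-size arithmetic (kernel 4 and padding 1 throughout; strides 2, 2, 2, 1, 1 for the five convolutions):

```python
def conv_out(w, k=4, s=2, p=1):
    # Standard convolution output-size formula.
    return (w + 2 * p - k) // s + 1

w = 256
for s in (2, 2, 2, 1, 1):   # strides of the five conv layers above
    w = conv_out(w, s=s)
print(w)  # 30 -> the discriminator outputs a 30x30 patch grid
```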
Instantiate the generator and discriminator

generator = UnetGenerator()
discriminator = NLayerDiscriminator()

out = generator(paddle.ones([1, 3, 256, 256]))
print('Generator output shape:', out.shape)

out = discriminator(paddle.ones([1, 6, 256, 256]))
print('Discriminator output shape:', out.shape)
Define the training parameters

# Hyperparameters
LR = 1e-4
BATCH_SIZE = 8
EPOCHS = 100

# Optimizers
optimizerG = paddle.optimizer.Adam(
    learning_rate=LR,
    parameters=generator.parameters(),
    beta1=0.5,
    beta2=0.999)

optimizerD = paddle.optimizer.Adam(
    learning_rate=LR,
    parameters=discriminator.parameters(),
    beta1=0.5,
    beta2=0.999)

# Loss functions
bce_loss = nn.BCELoss()
l1_loss = nn.L1Loss()

# Dataloaders
data_loader_train = DataLoader(
    paired_dataset_train,
    batch_size=BATCH_SIZE,
    shuffle=True,
    drop_last=True
)
data_loader_test = DataLoader(
    paired_dataset_test,
    batch_size=BATCH_SIZE
)
Training results

The first column is the cartoon (input), the second column is the real-person photo (label), and the third column is the model's output.

At the start of training:

After 100 epochs:

As you can see, the model already achieves good results.
results_save_path = 'work/results'
os.makedirs(results_save_path, exist_ok=True)  # save each epoch's test result here

weights_save_path = 'work/weights'
os.makedirs(weights_save_path, exist_ok=True)  # save model weights here

for epoch in range(EPOCHS):
    for data in tqdm(data_loader_train):
        real_A, real_B = data

        optimizerD.clear_grad()
        # D(real)
        real_AB = paddle.concat((real_A, real_B), 1)
        d_real_predict = discriminator(real_AB)
        d_real_loss = bce_loss(d_real_predict, paddle.ones_like(d_real_predict))

        # D(fake)
        fake_B = generator(real_A).detach()
        fake_AB = paddle.concat((real_A, fake_B), 1)
        d_fake_predict = discriminator(fake_AB)
        d_fake_loss = bce_loss(d_fake_predict, paddle.zeros_like(d_fake_predict))

        # train D
        d_loss = (d_real_loss + d_fake_loss) / 2.
        d_loss.backward()
        optimizerD.step()

        optimizerG.clear_grad()
        # D(fake)
        fake_B = generator(real_A)
        fake_AB = paddle.concat((real_A, fake_B), 1)
        g_fake_predict = discriminator(fake_AB)
        g_bce_loss = bce_loss(g_fake_predict, paddle.ones_like(g_fake_predict))
        g_l1_loss = l1_loss(fake_B, real_B) * 100.
        g_loss = g_bce_loss + g_l1_loss

        # train G
        g_loss.backward()
        optimizerG.step()

    print(f'Epoch [{epoch+1}/{EPOCHS}] Loss D: {d_loss.numpy()}, Loss G: {g_loss.numpy()}')

    if (epoch+1) % 10 == 0:
        paddle.save(generator.state_dict(), os.path.join(weights_save_path, 'epoch'+str(epoch+1).zfill(3)+'.pdparams'))

    # test
    generator.eval()
    with paddle.no_grad():
        for data in data_loader_test:
            real_A, real_B = data
            break
        fake_B = generator(real_A)
    result = paddle.concat([real_A[:3], real_B[:3], fake_B[:3]], 3)
    result = result.detach().numpy().transpose(0, 2, 3, 1)
    result = np.vstack(result)
    result = (result * 127.5 + 127.5).astype(np.uint8)
    cv2.imwrite(os.path.join(results_save_path, 'epoch'+str(epoch+1).zfill(3)+'.png'), result)
    generator.train()
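As a hand-worked toy version of the generator objective computed in the loop above (numpy standing in for the paddle losses; the score and pixel values here are made up purely for illustration): the BCE term pushes the discriminator's scores on fakes toward 1, while the L1 term, weighted by 100 as in pix2pix, keeps the output close to the label.

```python
import numpy as np

# Toy discriminator scores on two fake patches, and toy pixel values.
d_pred = np.array([0.8, 0.6])
g_bce = -np.mean(np.log(d_pred))            # BCELoss with target = all ones
fake = np.array([0.10, -0.20])
real = np.array([0.00, 0.00])
g_l1 = np.mean(np.abs(fake - real)) * 100.  # L1 term weighted by 100
g_loss = g_bce + g_l1
```

With this weighting the L1 term dominates early in training, which is what drives the output to stay structurally close to the label while the adversarial term sharpens texture.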
Test
# Load weights into the generator
last_weights_path = os.path.join(weights_save_path, sorted(os.listdir(weights_save_path))[-1])
print('Loading weights:', last_weights_path)

model_state_dict = paddle.load(last_weights_path)
generator.load_dict(model_state_dict)
generator.eval()

# Read the data
test_names = os.listdir('data/cartoon_A2B/test')
# img_name = np.random.choice(test_names)
img_name = '01481.png'
img_A2B = cv2.imread('data/cartoon_A2B/test/' + img_name)
img_A = img_A2B[:, 256:]  # cartoon (input)
img_B = img_A2B[:, :256]  # real person (ground truth)
# img_A = cv2.imread('data/test4.png')
# img_A = img_A[:, 256:]

g_input = img_A.astype('float32') / 127.5 - 1             # normalize
g_input = g_input[np.newaxis, ...].transpose(0, 3, 1, 2)  # NHWC -> NCHW
g_input = paddle.to_tensor(g_input)                       # numpy -> tensor

g_output = generator(g_input)
g_output = g_output.detach().numpy()                      # tensor -> numpy
g_output = g_output.transpose(0, 2, 3, 1)[0]              # NCHW -> NHWC
g_output = g_output * 127.5 + 127.5                       # denormalize
g_output = g_output.astype(np.uint8)

img_show = np.hstack([img_A, g_output])[:, :, ::-1]
plt.figure(figsize=(8, 8))
plt.imshow(img_show)
plt.show()
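The transposes in the inference code are easy to get backwards; a quick numpy round-trip check of the NHWC/NCHW conversions used above:

```python
import numpy as np

# Single HWC image -> 4-D NCHW batch and back, as in the inference code.
img = np.random.rand(256, 256, 3).astype('float32')
nchw = img[np.newaxis, ...].transpose(0, 3, 1, 2)  # NHWC -> NCHW
assert nchw.shape == (1, 3, 256, 256)
nhwc = nchw.transpose(0, 2, 3, 1)[0]               # NCHW -> HWC
assert np.allclose(nhwc, img)
```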
That completes the cartoon-to-real-person project. Most of it builds on the reference project, with only a few changes.
Author: Fast implementation of AI ideas
| About Shenyan Technology |
Shenyan Technology, founded in January 2018, is a Zhongguancun high-tech enterprise and an AI service provider built on world-leading artificial intelligence technology. Based on its core technologies in computer vision, natural language processing, and data mining, the company has launched four platform products: the Shenyan intelligent data annotation platform, the Shenyan AI development platform, the Shenyan automated machine learning platform, and the Shenyan AI open platform, providing enterprises with one-stop AI platform services covering data processing, model building and training, privacy computing, and industry algorithms and solutions.