Tensorflow—Neural Style Transfer
2022-07-03 10:27:00 【JallinRichel】
Style transfer (Style_transfer) learning notes: TensorFlow
- What the tutorial produces
- Overview of style transfer & the implementation idea of this tutorial
- Preparation
- Code implementation
- Thank you for reading!!!
Note: this article consists of the author's study notes from working through the official TensorFlow tutorial, organized here for reference. You can read and study this article as a translated walkthrough of the official tutorial; the code is consistent with the official code, with only a few minor changes.
The link to the official TensorFlow tutorial is attached at the end of the article.
What the tutorial produces
This tutorial uses picture 1 (the content image) and picture 2 (the style image) to generate picture 3 (the result).
Picture 1: the content image

Picture 2: the style image

Picture 3: the result image (generated with epochs=10, steps_per_epoch=100)
Overview of style transfer & The implementation idea of this tutorial
Overview of style transfer
The covariance matrix of the feature maps an image produces after passing through a convolution layer characterizes the image's texture well, but it loses location information. In the style transfer task, however, we can ignore this drawback: we only need a way to represent the image's texture information and transfer it onto the image being stylized, which completes the style transfer task.
In this tutorial we use the Gram matrix instead of the covariance matrix; it describes the autocorrelation of the global features.
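Concretely, in the notation of the official tutorial (this is exactly what the gram_matrix function later in this article computes): for a feature map $F^{\ell}_{ijc}(x)$ of layer $\ell$, with spatial indices $i, j$ and channel indices $c, d$, the Gram matrix averages the outer product of the feature channels over all $IJ$ spatial locations:

$$G^{\ell}_{cd} = \frac{\sum_{ij} F^{\ell}_{ijc}(x)\, F^{\ell}_{ijd}(x)}{IJ}$$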
The implementation idea of this tutorial
In this tutorial, we select intermediate layers of the pretrained image classification network VGG19 to extract the texture features of the image.
The official tutorial also demonstrates direct, fast style transfer using TensorFlow Hub; we will not repeat that example in this article.
Preparation
Software preparation
- Install the latest Anaconda 3: click Download and choose the appropriate version. (The GPU edition of TensorFlow is recommended.)
- The author's CPU is an Intel Core i5-8300H @ 2.30 GHz.
- This tutorial uses Python 3.7.7.
- Install TensorFlow, Keras, Matplotlib, and NumPy.
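To confirm whether TensorFlow can actually see a GPU, here is an optional sanity check (not part of the original article):

import tensorflow as tf
# An empty list means TensorFlow will fall back to the CPU
print(tf.config.list_physical_devices('GPU'))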
Data preparation (download the content and style images)
content_path = tf.keras.utils.get_file('YellowLabradorLooking_new.jpg', 'https://storage.googleapis.com/download.tensorflow.org/example_images/YellowLabradorLooking_new.jpg')
style_path = tf.keras.utils.get_file('kandinsky5.jpg','https://storage.googleapis.com/download.tensorflow.org/example_images/Vassily_Kandinsky%2C_1913_-_Composition_7.jpg')
Code implementation
Import the required modules
import os
import tensorflow as tf
# Load compressed models from tensorflow_hub
os.environ['TFHUB_MODEL_LOAD_FORMAT'] = 'COMPRESSED'
import IPython.display as display
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (12, 12)
mpl.rcParams['axes.grid'] = False
import numpy as np
import PIL.Image
import time
import functools
def tensor_to_image(tensor):  # Convert a tensor to a PIL image
    tensor = tensor*255
    tensor = np.array(tensor, dtype=np.uint8)
    if np.ndim(tensor) > 3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]
    return PIL.Image.fromarray(tensor)
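As a quick illustration of this function's contract (a hypothetical example, not from the original tutorial), a [1, H, W, 3] float tensor with values in [0, 1] converts to an H x W PIL image:

# Hypothetical example: a random 64x64 image tensor becomes a PIL image
demo = tf.random.uniform([1, 64, 64, 3])
pil_img = tensor_to_image(demo)
print(pil_img.size)  # (64, 64)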
Display the two downloaded images
- Define a function that loads an image and limits its maximum dimension to 512 pixels.
def load_img(path_to_img):
    max_dim = 512
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)

    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    long_dim = max(shape)
    scale = max_dim / long_dim

    new_shape = tf.cast(shape * scale, tf.int32)
    img = tf.image.resize(img, new_shape)
    img = img[tf.newaxis, :]
    return img
- Define a function that displays an image
def imshow(image, title=None):
    if len(image.shape) > 3:
        image = tf.squeeze(image, axis=0)

    plt.imshow(image)
    if title:
        plt.title(title)
- Display images
content_image = load_img(content_path)
style_image = load_img(style_path)
plt.subplot(1, 2, 1)
imshow(content_image, 'Content Image')
plt.subplot(1, 2, 2)
imshow(style_image, 'Style Image')
The two images should now be displayed on your screen.
Define the content and style representations
We can use the model's intermediate layers to obtain the content and style of the image.
Starting from the network's input layer, the first few layer activations represent low-level features such as edges and textures. As you step deeper through the network, the final few layers represent higher-level features of the image, such as wheels or eyes.
In this tutorial, we use intermediate layers of the VGG19 network architecture to define the content and style of the image, and we try to match the corresponding style and content target representations at these intermediate layers.
- Load a VGG19 without the classification head and list its layer names
vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
# To also load the classification head, change include_top=False above to True
print()
for layer in vgg.layers:
    print(layer.name)
- Choose intermediate layers from the network to represent the content and style of the image
content_layers = ['block5_conv2']

style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']
num_content_layers = len(content_layers)
num_style_layers = len(style_layers)
Why can these intermediate outputs define the content and style representations of an image?
For a network to perform image classification, it must understand the image. This requires taking the raw image pixels as input and building an internal representation that converts them into features capturing a complex understanding of the image.
This is also one reason convolutional neural networks perform so well: they can capture the invariances and defining features of classes that are unaffected by background noise and other nuisances.
So when an image is fed into the model, the model acts as a complex feature extractor. By accessing the model's intermediate layers, we can describe the content and style of the input image.
Build a model
- The network is available in tf.keras.applications, so we can extract the intermediate layer values using the Keras functional API.
- A model is defined with model = Model(inputs, outputs).
The following function builds a VGG19 model that returns the chosen intermediate outputs:
def vgg_layers(layer_names):
    """Creates a vgg model that returns a list of intermediate output values."""
    # Load our model. Load pretrained VGG, trained on imagenet data
    vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
    vgg.trainable = False

    outputs = [vgg.get_layer(name).output for name in layer_names]

    model = tf.keras.Model([vgg.input], outputs)
    return model
style_extractor = vgg_layers(style_layers)
style_outputs = style_extractor(style_image*255)
# Look at the statistics of each layer's output
for name, output in zip(style_layers, style_outputs):
    print(name)
    print("  shape: ", output.numpy().shape)
    print("  min: ", output.numpy().min())
    print("  max: ", output.numpy().max())
    print("  mean: ", output.numpy().mean())
    print()
Computing style (calculating the Gram matrix)
The Gram matrix containing this texture information is computed by taking the outer product of the feature vector with itself at each location, then averaging that outer product over all locations.
This can be implemented concisely with the tf.linalg.einsum function:
def gram_matrix(input_tensor):
    result = tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)
    input_shape = tf.shape(input_tensor)
    num_locations = tf.cast(input_shape[1]*input_shape[2], tf.float32)
    return result/(num_locations)
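As a quick shape check (hypothetical, not in the original tutorial): for a [batch, height, width, channels] input, the result is a [batch, channels, channels] tensor, independent of the spatial size:

# Hypothetical check: a random feature map with 64 channels
demo_features = tf.random.normal([1, 32, 32, 64])
print(gram_matrix(demo_features).shape)  # (1, 64, 64)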
Extract the style and content of the image
Build a model that returns the style and content tensors
class StyleContentModel(tf.keras.models.Model):
    def __init__(self, style_layers, content_layers):
        super(StyleContentModel, self).__init__()
        self.vgg = vgg_layers(style_layers + content_layers)
        self.style_layers = style_layers
        self.content_layers = content_layers
        self.num_style_layers = len(style_layers)
        self.vgg.trainable = False

    def call(self, inputs):
        "Expects float input in [0,1]"
        inputs = inputs*255.0
        preprocessed_input = tf.keras.applications.vgg19.preprocess_input(inputs)
        outputs = self.vgg(preprocessed_input)
        style_outputs, content_outputs = (outputs[:self.num_style_layers],
                                          outputs[self.num_style_layers:])

        style_outputs = [gram_matrix(style_output)
                         for style_output in style_outputs]

        content_dict = {content_name: value
                        for content_name, value
                        in zip(self.content_layers, content_outputs)}

        style_dict = {style_name: value
                      for style_name, value
                      in zip(self.style_layers, style_outputs)}

        return {'content': content_dict, 'style': style_dict}
extractor = StyleContentModel(style_layers, content_layers)
results = extractor(tf.constant(content_image))
When we feed in an image, this model returns the Gram matrices of the style_layers and the content of the content_layers.
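To see what came back, you can print the shapes of the returned tensors (a minimal sketch along the lines of the official tutorial):

# Each style entry is a [1, channels, channels] Gram matrix;
# each content entry is a raw [1, height, width, channels] feature map
print('Styles:')
for name, output in results['style'].items():
    print(' ', name, output.numpy().shape)
print('Contents:')
for name, output in results['content'].items():
    print(' ', name, output.numpy().shape)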
Run gradient descent
With the style and content extractors in place, we can now implement the style transfer algorithm.
We do this by computing the mean squared error of the image's outputs relative to each target, then taking a weighted sum of these losses.
Set the style and content targets
style_targets = extractor(style_image)['style']
content_targets = extractor(content_image)['content']
Define a tf.Variable to hold the image being optimized. The code earlier in the article already converted the image's pixel values to float32. (The tf.Variable must have the same shape as the content image.)
Since this is a float image, we also define a function that keeps the pixel values between 0 and 1:
image = tf.Variable(content_image)
def clip_0_1(image):
    return tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=1.0)
Create an optimizer, and use a weighted combination of the two losses to obtain the total loss.
This tutorial uses Adam, although the original paper recommends LBFGS:
opt = tf.optimizers.Adam(learning_rate=0.02, beta_1=0.99, epsilon=1e-1)
style_weight = 1e-2
content_weight = 1e4

def style_content_loss(outputs):
    style_outputs = outputs['style']
    content_outputs = outputs['content']
    style_loss = tf.add_n([tf.reduce_mean((style_outputs[name]-style_targets[name])**2)
                           for name in style_outputs.keys()])
    style_loss *= style_weight / num_style_layers

    content_loss = tf.add_n([tf.reduce_mean((content_outputs[name]-content_targets[name])**2)
                             for name in content_outputs.keys()])
    content_loss *= content_weight / num_content_layers
    loss = style_loss + content_loss
    return loss
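As a sanity check (hypothetical, not part of the original tutorial): since image was initialized from content_image, the content term starts near zero and the initial loss is dominated by the style term:

# image starts as a copy of content_image, so content loss begins near 0
print(style_content_loss(extractor(image)))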
Use tf.GradientTape to update the image
@tf.function()
def train_step(image):
    with tf.GradientTape() as tape:
        outputs = extractor(image)
        loss = style_content_loss(outputs)

    grad = tape.gradient(loss, image)
    opt.apply_gradients([(grad, image)])
    image.assign(clip_0_1(image))
Now we can run a few steps to test:
train_step(image)
train_step(image)
train_step(image)
tensor_to_image(image)
Output results:

Then we run it for many more steps to get a better result
import time
start = time.time()

epochs = 10
steps_per_epoch = 100

step = 0
for n in range(epochs):
    for m in range(steps_per_epoch):
        step += 1
        train_step(image)
        print(".", end='', flush=True)
    display.clear_output(wait=True)
    display.display(tensor_to_image(image))
    print("Train step: {}".format(step))

end = time.time()
print("Total time: {:.1f}".format(end-start))
If you run this step with the CPU version of TensorFlow, producing results will be relatively slow; you can set epochs or steps_per_epoch to lower values.
Output results:

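If you want to keep the result, here is a minimal sketch for saving it to disk (the file name is an arbitrary choice, not from the original article):

# Save the stylized image; 'stylized-image.png' is an arbitrary file name
file_name = 'stylized-image.png'
tensor_to_image(image).save(file_name)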
- This tutorial ends here. The official tutorial contains further optimizations after this point; interested readers can follow the link below.
Thank you for reading!!!