Autoencoder (AE) programs

2022-07-26 14:40:00 【The way of code】

1. Program walkthrough

(1) Vanilla autoencoder

In its simplest form, the autoencoder has only three network layers, that is, it is a neural network with a single hidden layer. Its input and output are the same, and it learns to reconstruct the input using the Adam optimizer and the mean squared error loss function.

Here, the hidden layer dimension (64) is smaller than the input dimension (784), so the autoencoder is said to be undercomplete and its encoding is lossy. This constraint forces the neural network to learn a compressed representation of the data.

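from keras.layers import Input, Dense
from keras.models import Model

input_size = 784    # 28 x 28 image flattened to a vector
hidden_size = 64    # size of the compressed code
output_size = 784   # same as the input

x = Input(shape=(input_size,))

# Encoder
h = Dense(hidden_size, activation='relu')(x)

# Decoder
r = Dense(output_size, activation='sigmoid')(h)

# Keras 2 and later use the keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')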

Dense: the Keras fully connected layer, keras.layers.Dense(units, activation=None)

units: the output dimension of the layer

activation=None: the activation function. If none is given, the layer is linear, a(x) = x

Activation: the activation layer applies an activation function to the output of a layer
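To make the roles of units and activation concrete, here is a minimal NumPy sketch of what a fully connected layer computes, activation(x · W + b). The helper name dense_forward and the toy weights are illustrative assumptions and are not part of Keras.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: max(0, z) element-wise."""
    return np.maximum(z, 0.0)

def dense_forward(x, W, b, activation=None):
    """Hypothetical helper mimicking a dense layer: activation(x @ W + b).

    x: (batch, input_dim), W: (input_dim, units), b: (units,).
    With activation=None the layer stays linear, a(z) = z.
    """
    z = x @ W + b
    return z if activation is None else activation(z)

# Toy example: 3 input features mapped to units=2 outputs.
x = np.array([[1.0, -2.0, 0.5]])
W = np.array([[1.0, -1.0],
              [0.5,  0.5],
              [2.0,  0.0]])
b = np.array([0.0, 0.5])

linear_out = dense_forward(x, W, b)                 # [[ 1.0, -1.5]]
relu_out = dense_forward(x, W, b, activation=relu)  # [[ 1.0,  0.0]]
print(linear_out, relu_out)
```

The number of columns of W plays the role of units, and the ReLU simply zeroes the negative pre-activation.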

model.compile(): one of the methods of Model, which configures the model for training

optimizer: the optimizer, given as a predefined optimizer name or an optimizer object; see the Keras Optimizers documentation

loss: the loss function, given as a predefined loss function name or an objective function; see the Keras Losses documentation

adam: adaptive moment estimation, a refinement of the RMSProp optimizer. It adapts the learning rate of each parameter using estimates of the first and second moments of the gradient. Advantage: after bias correction the step size of each iteration stays within a bounded range, so the parameters change smoothly.

mse: mean_squared_error, the mean squared error
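The two definitions above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the Keras implementation: the function names mse and adam_step are made up for this example, and the hyperparameter defaults are the commonly quoted ones.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error averaged over all elements."""
    return float(np.mean((y_true - y_pred) ** 2))

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (illustrative helper, not a Keras function).

    m and v are the running first and second moment estimates,
    t is the 1-based step counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy problem: fit one weight w so that w * x reproduces target = 2 * x.
x = np.array([1.0, 2.0, 3.0])
target = 2.0 * x
w, m, v = 0.0, 0.0, 0.0
loss_before = mse(target, w * x)
for t in range(1, 501):
    grad = float(np.mean(2 * (w * x - target) * x))  # d(mse)/dw
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)
loss_after = mse(target, w * x)
print(loss_before, loss_after, w)
```

On the very first step the update has magnitude close to lr regardless of how large the gradient is, which is the bounded step size mentioned above.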

(2) Multilayer autoencoder

If one hidden layer is not enough, the autoencoder can be extended with more hidden layers.

Here, the implementation uses three hidden layers instead of one. Any hidden layer can serve as the feature representation, but to keep the network symmetric we use the middle layer.

from keras.layers import Input, Dense
from keras.models import Model

input_size = 784
hidden_size = 128
code_size = 64      # the middle layer is used as the code

x = Input(shape=(input_size,))

# Encoder
hidden_1 = Dense(hidden_size, activation='relu')(x)
h = Dense(code_size, activation='relu')(hidden_1)

# Decoder
hidden_2 = Dense(hidden_size, activation='relu')(h)
r = Dense(input_size, activation='sigmoid')(hidden_2)

# Keras 2 and later use the keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
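As a quick sanity check on the symmetric layout above, the following sketch counts the trainable parameters of each fully connected layer (weights plus biases) and the compression ratio at the code layer. The helper dense_param_count is a hypothetical name introduced only for this illustration.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for each consecutive pair of fully connected layers."""
    return [n_in * n_out + n_out
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

# Layer widths of the multilayer autoencoder: input, hidden, code, hidden, output.
sizes = [784, 128, 64, 128, 784]
per_layer = dense_param_count(sizes)
total = sum(per_layer)
compression = sizes[0] / min(sizes)   # input dimension divided by code dimension

print(per_layer)     # [100480, 8256, 8320, 101136]
print(total)         # 218192
print(compression)   # 12.25
```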

(3) Convolutional autoencoder

Besides fully connected layers, autoencoders can also be built from convolutional layers. The principle is the same, but the input is a 3D tensor (such as an image) rather than a flattened one-dimensional vector. The input image is downsampled to give a latent representation of smaller dimensions, which forces the autoencoder to learn a compressed version of the image.

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

x = Input(shape=(28, 28, 1))

# Encoder
conv1_1 = Conv2D(16, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool1)
pool2 = MaxPooling2D((2, 2), padding='same')(conv1_2)
conv1_3 = Conv2D(8, (3, 3), activation='relu', padding='same')(pool2)
h = MaxPooling2D((2, 2), padding='same')(conv1_3)

# Decoder
conv2_1 = Conv2D(8, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(8, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
# No padding here ('valid'): 16x16 shrinks to 14x14, so the final upsampling yields 28x28
conv2_3 = Conv2D(16, (3, 3), activation='relu')(up2)
up3 = UpSampling2D((2, 2))(conv2_3)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up3)

# Keras 2 and later use the keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')
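It can help to trace the spatial size through the network above, because the encoder reduces 28 to 4 (not 3.5) and the decoder only gets back to 28 thanks to the one convolution without padding. The sketch below does this arithmetic in plain Python; the helper names conv_out, pool_out and up_out are assumptions made for this illustration.

```python
import math

def conv_out(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with padding='same' (rounds up)."""
    return math.ceil(size / pool)

def up_out(size, factor=2):
    """Spatial size after upsampling by an integer factor."""
    return size * factor

s = 28
trace = [s]
# Encoder: three conv + pool stages, all with 'same' padding.
for _ in range(3):
    s = pool_out(conv_out(s))
    trace.append(s)
# Decoder: two conv + upsample stages with 'same' padding ...
for _ in range(2):
    s = up_out(conv_out(s))
    trace.append(s)
# ... then the convolution without padding ('valid') followed by upsampling.
s = up_out(conv_out(s, padding="valid"))
trace.append(s)

print(trace)  # [28, 14, 7, 4, 8, 16, 28]
```

With 'same' padding on that last 16-filter convolution the output would be 32 x 32 and would no longer match the 28 x 28 input.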

Conv2D: Conv2D(filters, kernel_size, strides=(1, 1), padding='valid')

filters: the number of convolution kernels (the number of output channels).

kernel_size: the height and width of the convolution kernel, given as a single integer or a list/tuple of two integers. A single integer means the same size in both spatial dimensions.

strides: the stride of the convolution, given as a single integer or a list/tuple of two integers. A single integer means the same stride in both spatial dimensions. Any stride other than 1 is incompatible with any dilation_rate other than 1.

padding: the zero-padding strategy, either "valid" or "same". "valid" means no padding, so only positions where the kernel fits entirely inside the input are computed and the output shrinks. "same" pads the borders with zeros, so that with a stride of 1 the output has the same height and width as the input.
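The difference between the two padding modes can be seen with a naive single-channel convolution in NumPy. This is only a sketch for stride 1 and odd kernel sizes; conv2d_single is a hypothetical helper, and Keras itself is not used here.

```python
import numpy as np

def conv2d_single(image, kernel, padding="valid"):
    """Naive stride-1 2D cross-correlation of one channel with one kernel."""
    kh, kw = kernel.shape
    if padding == "same":
        # Zero-pad so the output keeps the input's height and width (odd kernels).
        image = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0   # 3x3 averaging filter

valid = conv2d_single(image, kernel, padding="valid")
same = conv2d_single(image, kernel, padding="same")
print(valid.shape, same.shape)   # (3, 3) (5, 5)
```

The interior of the "same" result is identical to the "valid" result; only the border values, which involve the zero padding, are new.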

MaxPooling2D: max pooling for 2D inputs. MaxPooling2D(pool_size=(2, 2), strides=None, padding='valid')

pool_size: a tuple of two integers, the downsampling factors in the two directions (vertical, horizontal). (2, 2) halves the image in both dimensions.

strides: a tuple of two integers, or None, the stride values. If None, it defaults to pool_size.

padding: a string, "valid" or "same".

UpSampling2D: upsampling. UpSampling2D(size=(2, 2))

size: a tuple of integers, the upsampling factors for rows and columns respectively.
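The two operations can be mimicked on a small array with NumPy. The sketch assumes a single-channel input with even height and width and nearest-neighbour upsampling; max_pool_2x2 and upsample_2x2 are names chosen for this example only.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample_2x2(x):
    """Nearest-neighbour upsampling: repeat every row and every column twice."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [9, 1, 0, 2],
              [1, 1, 3, 0]], dtype=float)

pooled = max_pool_2x2(x)          # [[4, 8], [9, 3]]
restored = upsample_2x2(pooled)   # 4x4 again, but each 2x2 block is constant
print(pooled)
print(restored)
```

Upsampling restores the shape but not the detail discarded by pooling, which is why the decoder interleaves it with convolutions.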

(4) Regularized autoencoder

Besides making the hidden layer smaller than the input, other methods can also be used to constrain the reconstruction of an autoencoder, such as regularized autoencoders.

A regularized autoencoder does not need to limit model capacity by keeping the encoder and decoder shallow and the code small. Instead, it uses a loss function that encourages the model to have other properties besides copying its input to its output. These properties include sparsity of the representation, small derivatives of the representation, and robustness to noise or to missing inputs.

Even when the model capacity is large enough to learn a trivial identity function, a nonlinear and overcomplete regularized autoencoder can still learn something useful about the data distribution.

In practice, two kinds of regularized autoencoders are commonly used: the sparse autoencoder and the denoising autoencoder.

(5) Sparse autoencoder

Sparse autoencoders are generally used to learn features for another task such as classification. An autoencoder regularized to be sparse must respond to the unique statistical features of the training set rather than simply act as an identity function. Trained in this way, performing the copying task with a sparsity penalty can yield a model that has learned useful features.

Another way to constrain the reconstruction of an autoencoder is to constrain its loss function. For example, adding a regularization term to the loss function lets the autoencoder learn a sparse representation of the data.

Note that L1 regularization is added to the hidden layer, where it acts as a penalty term of the loss function during optimization. Compared with the vanilla autoencoder, the resulting data representation is sparser.

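from keras import regularizers
from keras.layers import Input, Dense
from keras.models import Model

input_size = 784
hidden_size = 64
output_size = 784

x = Input(shape=(input_size,))

# Encoder
# L1 regularization applied to the layer's output (its activations);
# note that 10e-5 equals 1e-4
h = Dense(hidden_size, activation='relu', activity_regularizer=regularizers.l1(10e-5))(x)

# Decoder
r = Dense(output_size, activation='sigmoid')(h)

# Keras 2 and later use the keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')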

activity_regularizer: a regularization term applied to the layer's output (its activations), given as a regularizer object

l1(l=0.01): L1 regularization. A regularization term is used to impose constraints on the training of the model. The L1 term is an L1-norm constraint, which makes the constrained matrix or vector sparser.
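The effect of the penalty can be illustrated numerically. The sketch below assumes the penalty is simply the coefficient times the sum of absolute activations added to the reconstruction MSE; how Keras scales or averages this term over a batch may differ between versions, and the function names are invented for this example.

```python
import numpy as np

def l1_activity_penalty(activations, l1=1e-4):
    """L1 penalty on hidden activations: l1 * sum(|h|)."""
    return l1 * float(np.sum(np.abs(activations)))

def sparse_loss(x, reconstruction, activations, l1=1e-4):
    """Reconstruction MSE plus the L1 activity penalty (illustrative only)."""
    mse = float(np.mean((x - reconstruction) ** 2))
    return mse + l1_activity_penalty(activations, l1)

x = np.array([[0.0, 1.0, 0.5, 0.25]])
reconstruction = np.array([[0.1, 0.9, 0.5, 0.25]])

# Two hidden codes with the same L2 norm (1.0) but different L1 norms.
dense_code = np.array([[0.5, 0.5, 0.5, 0.5]])    # every unit active, L1 = 2.0
sparse_code = np.array([[1.0, 0.0, 0.0, 0.0]])   # one unit active,  L1 = 1.0

loss_dense = sparse_loss(x, reconstruction, dense_code)
loss_sparse = sparse_loss(x, reconstruction, sparse_code)
print(loss_dense, loss_sparse)
```

For the same reconstruction quality, the code that concentrates its activity in fewer units gets the lower loss, which is what pushes the hidden representation towards sparsity.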

(6) Denoising autoencoder

Here, instead of adding a penalty term to the loss function, useful information is learned by changing the reconstruction error term of the loss function.

Noise is added to the training data, and the autoencoder learns to remove it in order to recover the original input that was not corrupted by noise. This forces the encoder to extract the most important features and to learn a more robust representation of the input data, which is why it generalizes better than an ordinary autoencoder.

This architecture can be trained with gradient descent.
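The change to the reconstruction error term can be sketched as follows: the model receives a corrupted input, but the error is measured against the clean one. This is a minimal NumPy illustration using random arrays as stand-ins for images; corrupt and denoising_mse are hypothetical helper names, and Gaussian noise with noise_factor=0.5 is just one possible corruption.

```python
import numpy as np

def corrupt(x, noise_factor=0.5, rng=None):
    """Add zero-mean Gaussian noise and clip back to the pixel range [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = x + noise_factor * rng.normal(loc=0.0, scale=1.0, size=x.shape)
    return np.clip(noisy, 0.0, 1.0)

def denoising_mse(clean, reconstruction):
    """Reconstruction error measured against the clean input, not the noisy one."""
    return float(np.mean((clean - reconstruction) ** 2))

rng = np.random.default_rng(0)
clean = rng.random((4, 28, 28, 1))   # stand-in for a batch of images in [0, 1]
noisy = corrupt(clean, noise_factor=0.5, rng=rng)

# A model that merely copies its noisy input is penalised,
# whereas a perfect denoiser would reach zero error.
copy_error = denoising_mse(clean, noisy)
perfect_error = denoising_mse(clean, clean)
print(copy_error, perfect_error)
```

Because copying the input no longer minimizes the loss, the identity function stops being a useful solution even when the network has enough capacity to represent it.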

from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from keras.models import Model

x = Input(shape=(28, 28, 1))

# Encoder: 28x28 -> 14x14 -> 7x7
conv1_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(x)
pool1 = MaxPooling2D((2, 2), padding='same')(conv1_1)
conv1_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(pool1)
h = MaxPooling2D((2, 2), padding='same')(conv1_2)

# Decoder: 7x7 -> 14x14 -> 28x28
conv2_1 = Conv2D(32, (3, 3), activation='relu', padding='same')(h)
up1 = UpSampling2D((2, 2))(conv2_1)
conv2_2 = Conv2D(32, (3, 3), activation='relu', padding='same')(up1)
up2 = UpSampling2D((2, 2))(conv2_2)
r = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(up2)

# Keras 2 and later use the keyword arguments inputs/outputs
autoencoder = Model(inputs=x, outputs=r)
autoencoder.compile(optimizer='adam', loss='mse')

2. Program examples:

(1) Single-layer autoencoder

from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
 
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)
 
# Single-layer autoencoder
encoding_dim = 32
input_img = Input(shape=(784,))
 
encoded = Dense(encoding_dim, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=encoded)
 
encoded_input = Input(shape=(encoding_dim,))
decoder_layer = autoencoder.layers[-1]
 
decoder = Model(inputs=encoded_input, outputs=decoder_layer(encoded_input))
 
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, 
                shuffle=True, validation_data=(x_test, x_test))
 
encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)
 
 # Output image 
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(2) Convolutional autoencoder

from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
print(x_train.shape)
print(x_test.shape)


# Convolutional autoencoder
input_img = Input(shape=(28, 28, 1))
 
x = Convolution2D(16, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
 
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(8, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
# No padding here ('valid'): 16x16 shrinks to 14x14, so the final upsampling yields 28x28
x = Convolution2D(16, (3, 3), activation='relu')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
# To monitor training, open a terminal and start TensorBoard with: tensorboard --logdir=autoencoder
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                shuffle=True, validation_data=(x_test, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder')])
 
decoded_imgs = autoencoder.predict(x_test)


# Output image 
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(3) Deep autoencoder

from keras.layers import Input, Dense
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))
print(x_train.shape)
print(x_test.shape)



# Deep autoencoder
input_img = Input(shape=(784,))
encoded = Dense(128, activation='relu')(input_img)
encoded = Dense(64, activation='relu')(encoded)
decoded_input = Dense(32, activation='relu')(encoded)  # 32-dimensional code

decoded = Dense(64, activation='relu')(decoded_input)
decoded = Dense(128, activation='relu')(decoded)
# Feed the output layer from the decoder, not from the 64-unit encoder layer
decoded = Dense(784, activation='sigmoid')(decoded)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
encoder = Model(inputs=input_img, outputs=decoded_input)
 
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, 
                shuffle=True, validation_data=(x_test, x_test))
 
encoded_imgs = encoder.predict(x_test)
decoded_imgs = autoencoder.predict(x_test)



# Output image 
n = 10  # how many digits we will display
plt.figure(figsize=(20, 4))
for i in range(n):
    ax = plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

(4) Denoising autoencoder

from keras.layers import Input, Convolution2D, MaxPooling2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import TensorBoard
 
(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = np.reshape(x_train, (len(x_train), 28, 28, 1))
x_test = np.reshape(x_test, (len(x_test), 28, 28, 1))
noise_factor = 0.5
x_train_noisy = x_train + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_train.shape) 
x_test_noisy = x_test + noise_factor * np.random.normal(loc=0.0, scale=1.0, size=x_test.shape) 
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
print(x_train.shape)
print(x_test.shape)
 
input_img = Input(shape=(28, 28, 1))
 
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2), padding='same')(x)
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(x)
encoded = MaxPooling2D((2, 2), padding='same')(x)
 
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
x = Convolution2D(32, (3, 3), activation='relu', padding='same')(x)
x = UpSampling2D((2, 2))(x)
decoded = Convolution2D(1, (3, 3), activation='sigmoid', padding='same')(x)
 
autoencoder = Model(inputs=input_img, outputs=decoded)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
 
# To monitor training, open a terminal and start TensorBoard with: tensorboard --logdir=autoencoder
autoencoder.fit(x_train_noisy, x_train, epochs=10, batch_size=256,
                shuffle=True, validation_data=(x_test_noisy, x_test),
                callbacks=[TensorBoard(log_dir='autoencoder', write_graph=False)])
 
decoded_imgs = autoencoder.predict(x_test_noisy)
 
n = 10
plt.figure(figsize=(30, 6))
for i in range(n):
    ax = plt.subplot(3, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
    
    ax = plt.subplot(3, n, i + 1 + n)
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
 
    ax = plt.subplot(3, n, i + 1 + 2*n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()