Summary of methods for creating deep learning models with Keras/TensorFlow 2.9
2022-06-11 07:54:00 【Autumn moon on Pinghu Lake in Hangzhou】
Preface
Keras offers three ways to create a deep learning model: 1. the keras.Sequential model; 2. the functional API; 3. the subclassing method, i.e. subclassing keras.Model.
Because several methods are available and each is flexible, it is easy to make mistakes in practice.
However, if the goal is a deep learning model that can express any structure, and whose complete internal structure remains visible, three simple principles can be followed when creating the model.
Three principles for creating a model
- Use the functional API to create the model, i.e. model = keras.Model(inputs=…).
- Do not use NumPy functions.
- For individual custom layers, use the subclassing method, i.e. subclass keras.layers.Layer.
These three principles are discussed in detail below.
1. Use the functional API to create models
A model created with the functional API shows its complete structure. A model created by subclassing keras.Model, in contrast, is a black box whose internal structure cannot be seen. As for keras.Sequential, its linear structure cannot express complex models.
So in most cases, the functional API is the right way to create a model.
The functional API takes the form model = keras.Model(inputs=…). An example follows.
from tensorflow import keras

inputs = keras.Input(shape=(608, 608, 3))
x = keras.layers.Conv2D(16, 3)(inputs)
x = keras.layers.BatchNormalization()(x)
outputs = keras.layers.LeakyReLU()(x)
functional_api_model = keras.Model(inputs=inputs, outputs=outputs,
                                   name='demo_functional_api_model')
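Because the model was built with the functional API, its full structure can be inspected immediately; a minimal sketch follows (plot_model additionally assumes pydot and graphviz are installed):

# Lists every layer: InputLayer, Conv2D, BatchNormalization, LeakyReLU.
functional_api_model.summary()
# Draws the structure diagram to a PNG file.
keras.utils.plot_model(functional_api_model, show_shapes=True,
                       to_file='demo_functional_api_model.png')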
2. Do not use NumPy functions
When creating a model with the functional API, do not use NumPy functions. If NumPy-like functionality is needed, use the TF equivalent instead, for example tf.reduce_max instead of np.amax, tf.concat instead of np.concatenate, and so on.
The reason NumPy functions cannot be used is the following:
By default, Keras runs the deep learning model as a static computation graph. While the graph is being created, the tensors involved are KerasTensors, a tensor type unique to TensorFlow, and NumPy does not know how to handle a KerasTensor.
A KerasTensor is also called a symbolic tensor; its 0th (batch) dimension is None.
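What a KerasTensor looks like can be seen by printing one; a small sketch (the exact repr varies across TF versions):

inputs = keras.Input(shape=(608, 608, 3))
# Prints something like:
# KerasTensor(type_spec=TensorSpec(shape=(None, 608, 608, 3), dtype=tf.float32, ...))
# Note that the 0th (batch) dimension is None.
print(inputs)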
If a NumPy function is used while building the model, the following error is reported: "You are passing KerasTensor ..., an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers ... Other APIs cannot be called directly on symbolic Kerasinputs/outputs ..."
Sometimes a different error is reported instead: "NotImplementedError: Cannot convert a symbolic tf.Tensor (Placeholder:0) to a numpy array ..."
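A minimal sketch of the substitution, reusing the (608, 608, 3) input from above; the commented-out NumPy line is the kind of call that triggers these errors:

import tensorflow as tf
from tensorflow import keras

inputs = keras.Input(shape=(608, 608, 3))
# Wrong: np.amax cannot handle the symbolic KerasTensor and raises the errors above.
# x = np.amax(inputs, axis=-1)
# Correct: the TF equivalent builds a graph node instead.
x = tf.reduce_max(inputs, axis=-1, keepdims=True)
reduce_max_model = keras.Model(inputs=inputs, outputs=x)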
3. Use the subclassing method (subclass keras.layers.Layer) to create custom layers
When creating a model, prefer the TF built-in layers, including keras.layers and tfa.layers. For example, use a keras.layers.Reshape layer instead of tf.reshape, and use tfa.layers.GELU instead of tf.nn.gelu. The benefits of the built-in layers are that each layer can conveniently be given a name, and the structure diagram drawn with keras.utils.plot_model looks better.
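For example, a short sketch of preferring the built-in Reshape layer over tf.reshape (the layer name 'merge_pixels' is just an illustration):

inputs = keras.Input(shape=(608, 608, 3))
# Preferred: the built-in layer can be named, and plot_model shows it clearly.
x = keras.layers.Reshape(target_shape=(608 * 608, 3), name='merge_pixels')(inputs)
# Also works, but appears as an anonymous TFOpLambda layer in the diagram:
# x = tf.reshape(inputs, (-1, 608 * 608, 3))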
But in some cases a custom layer is needed. The two most common cases are described below.
3.1 Encapsulate a complicated block in a custom layer
Creating a deep learning model is like playing with building blocks: various blocks are used, and Conv2D–BatchNormalization–Mish, for example, is a common block.
If the block is written as a function, its complete structure is visible in the model. But if you want to hide the internal structure of the block and turn it into a black box, a custom layer can be used. An example follows.
import tensorflow_addons as tfa  # Provides mish; installed via pip install tensorflow-addons.

class DemoSubclassingLayer(keras.layers.Layer):
    """A block packaged as a black box, for demonstrating keras.layers.Layer only.

    Attributes:
        conv_filters_1: An integer, the number of filters of the 1st convolution layer.
        conv_filters_2: An integer, the number of filters of the 2nd convolution layer.
        dropout_rate: A float, the dropout rate of the Dropout layer.
        conv_1: The 1st convolution layer of the block.
        conv_2: The 2nd convolution layer of the block.
        batch_norm: The BatchNormalization layer of the block.
    """

    def __init__(self, filters_1, filters_2, dropout_rate, **kwargs):
        super().__init__(**kwargs)
        self.conv_filters_1 = filters_1
        self.conv_filters_2 = filters_2
        self.dropout_rate = dropout_rate
        # Define every layer that contains trainable parameters as an attribute.
        self.conv_1 = keras.layers.Conv2D(filters=self.conv_filters_1, kernel_size=3)
        self.conv_2 = keras.layers.Conv2D(filters=self.conv_filters_2, kernel_size=3)
        self.batch_norm = keras.layers.BatchNormalization()

    def call(self, inputs, training=None):
        # The call() part defines the forward propagation of the block.
        x = self.conv_1(inputs)
        x = self.conv_2(x)
        # In current deep learning models, Dropout and BatchNormalization are the only
        # two kinds of layers that need the training argument. training is a boolean
        # used to distinguish training mode from inference mode; Dropout and
        # BatchNormalization behave differently in the two modes.
        x = self.batch_norm(x, training=training)
        # mish, ReLU, etc. have no trainable parameters, so they need not be set as
        # attributes in __init__.
        x = tfa.activations.mish(x)
        # The Dropout layer has no trainable parameters, so it need not be set as an
        # attribute in __init__ either.
        x = keras.layers.Dropout(rate=self.dropout_rate)(x, training=training)
        return x  # The computed result must be returned with return.

    def get_config(self):
        config = super().get_config()
        # In the get_config part, add the custom values, strings, etc. to the
        # dictionary config; TF built-in functions and layers need not be added.
        # Note that the keys of config should be the parameter names of __init__,
        # and the values of config are the attributes corresponding to those parameters.
        config.update({
            'filters_1': self.conv_filters_1,
            'filters_2': self.conv_filters_2,
            'dropout_rate': self.dropout_rate,
        })
        # The return config statement below is required; otherwise, when the model is
        # loaded later, the parameter values of DemoSubclassingLayer cannot be recovered.
        return config
A custom layer mainly uses three methods: __init__(), call() and get_config(). The other two methods, build() and from_config(), are generally not needed.
In the __init__() part, everything that contains trainable parameters must be defined as an attribute.
Trainable parameters have two sources. The first source is layers with parameters, including BatchNormalization, Conv2D, etc. The second source is trainable variables, tf.Variable(trainable=True).
The purpose is to register these trainable parameters, so that during training TensorFlow computes their gradients and keeps updating them through back-propagation. Pay special attention to two points:
1.1 If a layer class is used several times in the model, it must be defined as that many separate attributes.
As in the example above, convolution is used twice, so it is defined as two attributes, self.conv_1 and self.conv_2.
1.2 Layers without parameters need not be set as attributes, such as mish, ReLU, etc. in the example above.
In the call() part, implement the forward propagation of the network, i.e. the stacking of the various layers.
In the get_config() part, add the custom values, strings, etc. to the dictionary config; TF built-in functions and layers need not be added.
get_config is mainly used to obtain the parameter values of the current custom layer so that they are saved along with the model, i.e. serialization. If the model has previously been saved to disk, it can be loaded as follows.
saved_model_path = 'saved_model.h5'  # Model save path.
# Put all custom layers, custom loss functions, custom metrics, etc. into
# custom_objects.
custom_objects = {'DemoSubclassingLayer': DemoSubclassingLayer}
# When loading the model, there is no need to provide the parameter values of
# DemoSubclassingLayer: they were automatically saved to disk by get_config(),
# and Keras automatically feeds them back to DemoSubclassingLayer.
saved_model = keras.models.load_model(saved_model_path, custom_objects=custom_objects)
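For reference, the saving side is the usual one-liner; a sketch, where model stands for any keras.Model that contains the custom layer:

model.save(saved_model_path)  # Calls get_config() of each custom layer.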
When creating a custom layer, if a new parameter is created in the call() part (for example a Conv2D or another layer with parameters, or a tf.Variable(trainable=True, …) variable) instead of being set as an attribute in the __init__ part, training the model reports the error: tf.function only supports singleton tf.Variables created on the first call. Make sure the tf.Variable is only created once or created outside tf.function.
Sometimes the message is instead: tf.function-decorated function tried to create variables on non-first call.
Both errors mean the same thing: do not create trainable parameters in the call() part. During training, call() is invoked again and again, and each invocation would create the parameters anew, so back-propagation cannot be applied to the parameters created in call().
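A minimal sketch of the mistake that triggers these errors (for illustration only; do not write layers this way):

class BrokenLayer(keras.layers.Layer):
    def call(self, inputs):
        # Wrong: Conv2D owns trainable variables, but it is created inside call()
        # rather than in __init__, so new variables appear on every traced call,
        # which raises the tf.function errors above during training.
        return keras.layers.Conv2D(filters=8, kernel_size=3)(inputs)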
If the custom layer DemoSubclassingLayer is used to create a model, and plot_model is used to draw its structure:
inputs = keras.Input(shape=(608, 608, 3))
outputs = DemoSubclassingLayer(filters_1=8, filters_2=32, dropout_rate=0.3, name='black_box_layer')(inputs)
demo_model_subclassing_layer = keras.Model(inputs=inputs, outputs=outputs, name='demo_model_subclassing_layer')
keras.utils.plot_model(demo_model_subclassing_layer, show_shapes=True, to_file='demo_model_subclassing_layer.png')
The resulting structure diagram shows that the custom layer DemoSubclassingLayer is a complete black box: although it contains multiple layers, none of them are visible.
3.2 Encapsulate eager tensors in a custom layer
When creating a model, if an eager tensor is used (for example one produced by tf.range()), it should be encapsulated inside a keras.layers.Layer. Otherwise the model structure cannot be generated, keras.utils.plot_model cannot draw the structure diagram, and modeling reports the error: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute '_keras_history'
For instance, the Vision Transformer model needs position encoding, and tf.range() is usually used to generate it. Here tf.range() produces an eager tensor, so the tf.range() call needs to be wrapped into the call part of a keras.layers.Layer. Sample code follows:
# The article shows only the call part of the custom layer PositionEncoding;
# __init__ is reconstructed here (an assumption) so that the sketch runs.
class PositionEncoding(keras.layers.Layer):
    def __init__(self, patches_quantity, output_dim, **kwargs):
        super().__init__(**kwargs)
        self.patches_quantity = patches_quantity
        # Embedding has trainable parameters, so it is an attribute (see section 3.1).
        self.position_embeddings = keras.layers.Embedding(
            input_dim=patches_quantity, output_dim=output_dim)

    def call(self, inputs):
        # positions has shape (41209,).
        positions = tf.range(self.patches_quantity)
        # position_encoding has shape (1, 41209): tf.newaxis must be used to get a
        # 2D tensor, so that the subsequent Embedding layer produces a 3D tensor.
        position_encoding = positions[tf.newaxis, :]
        # embedded_positions has shape (1, 41209, output_dim).
        embedded_positions = self.position_embeddings(position_encoding)
        return embedded_positions
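Wrapped this way, the eager tensor never leaves the layer, so plot_model works again. A usage sketch, assuming patch embeddings of shape (None, 41209, 64) and the reconstructed __init__ above:

inputs = keras.Input(shape=(41209, 64))  # Symbolic patch embeddings.
# The tf.range() eager tensor stays inside the layer: no '_keras_history' error.
x = inputs + PositionEncoding(patches_quantity=41209, output_dim=64)(inputs)
vit_stub_model = keras.Model(inputs=inputs, outputs=x)
keras.utils.plot_model(vit_stub_model, show_shapes=True)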
———— End of article ————