Hyperparameter tuning of neural networks using Keras Tuner
2022-06-24 01:53:00 [Panchuang AI]
Panchuang AI Share
Author | AYUSH3987
Compiled by | Flin
Source | analyticsvidhya
Introduction
A neural network has many hyperparameters, and tuning them manually is very difficult. Keras Tuner makes tuning the hyperparameters of a neural network very easy, much like the grid search or random search you may know from classical machine learning.
In this article you will learn how to use Keras Tuner to tune the hyperparameters of a neural network. We will start with a very simple network, then tune its hyperparameters and compare the results. Along the way you will learn everything you need to know about Keras Tuner.
What is a hyperparameter?
Developing a deep learning model is an iterative process: you start with an initial architecture and reconfigure it until you get a model that can be trained effectively within your time and compute budget.
The settings you adjust in that loop are called hyperparameters: you have an idea, write the code, check the performance, and repeat until you reach good performance.
In other words, the tunable settings of a neural network are its hyperparameters, and the process of finding a good set of them is called hyperparameter tuning.
Hyperparameter tuning is a very important part of model building. Skipping it can cause major problems in the model, such as wasted training time, useless parameters, and so on.
Hyperparameters generally come in two types (see the sketch after this list):
- Model-based hyperparameters: the number of hidden layers, the number of neurons per layer, and so on.
- Algorithm-based hyperparameters: settings that affect the speed and quality of training, such as the learning rate in gradient descent.
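As a minimal illustration (the layer sizes and learning rate below are arbitrary example values, not recommendations), both kinds appear directly in ordinary Keras code:

from tensorflow import keras

# Model-based hyperparameters: the number of layers and the units per layer
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu'),   # 128 units is a model-based hyperparameter
    keras.layers.Dense(10, activation='softmax'),
])

# Algorithm-based hyperparameter: the optimizer's learning rate
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])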
For more complex models the number of hyperparameters grows dramatically, and adjusting them manually becomes very challenging.
The benefit of Keras Tuner is that it takes over this most challenging task: with just a few lines of code, tuning hyperparameters becomes very easy.
Keras Tuner
Keras Tuner is a library for tuning the hyperparameters of neural networks; it helps you select the best hyperparameters for a neural network implemented in TensorFlow.
To install Keras Tuner, just run the following command:
pip install keras-tuner
But wait! Why do we need Keras Tuner at all?
The answer is that hyperparameters play an important role in developing a good model. They can make a big difference: they help you prevent overfitting, strike a good trade-off between bias and variance, and so on.
Tuning our hyperparameters with Keras Tuner
First we will develop a baseline model, and then we will rebuild the model using Keras Tuner. I will use TensorFlow for the implementation.
Step 1 (download and prepare the dataset)
from tensorflow import keras # importing keras
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data() # loading the data using keras datasets api
x_train = x_train.astype('float32') / 255.0 # normalize the training images
x_test = x_test.astype('float32') / 255.0 # normalize the testing images
Step 2 (develop the baseline model)
Now we will build our baseline neural network on the MNIST handwritten-digit dataset, so let's construct a deep neural network.
model1 = keras.Sequential()
model1.add(keras.layers.Flatten(input_shape=(28, 28)))  # flatten the 28 x 28 images
model1.add(keras.layers.Dense(units=512, activation='relu', name='dense_1'))  # 512 neurons with relu activation
model1.add(keras.layers.Dropout(0.2))  # dropout layer with a rate of 0.2
model1.add(keras.layers.Dense(10, activation='softmax'))  # output layer with 10 classes
Step 3 (compile and train the model)
Now that we have built our baseline model, it is time to compile and train it. We will use the Adam optimizer with a learning rate of 0.001, train for 10 epochs, and use a validation split of 0.2.
model1.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
loss=keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
model1.fit(x_train, y_train, epochs=10, validation_split=0.2)
Step 4 (evaluate the model)
Now that training is done, we will evaluate our model on the test set to see how it performs.
model1_eval = model1.evaluate(x_test, y_test, return_dict=True)
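Because we passed return_dict=True, the result is a plain Python dict; a quick way to inspect it (the 'loss' and 'accuracy' keys follow from the loss and metrics configured above):

print(f"baseline loss: {model1_eval['loss']:.4f}")
print(f"baseline accuracy: {model1_eval['accuracy']:.4f}")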
Tuning the model with Keras Tuner
Step 1 (import libraries)
import tensorflow as tf
import kerastuner as kt  # newer releases of the package are imported as keras_tuner
Step 2 (build the hypermodel with Keras Tuner)
Now you will set up a hypermodel (the model you configure for hypertuning is called a hypermodel). We define it with a model-builder function; as you can see below, the function returns a compiled model with the hyperparameters to tune declared inline.
In the classification model below we will tune two hyperparameters: the number of neurons in the first dense layer and the learning rate of the Adam optimizer.
def model_builder(hp):
'''
Args:
hp - Keras tuner object
'''
# Initialize the Sequential API and start stacking the layers
model = keras.Sequential()
model.add(keras.layers.Flatten(input_shape=(28, 28)))
# Tune the number of units in the first Dense layer
# Choose an optimal value between 32-512
hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
model.add(keras.layers.Dense(units=hp_units, activation='relu', name='dense_1'))
# Add next layers
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(10, activation='softmax'))
# Tune the learning rate for the optimizer
# Choose an optimal value from 0.01, 0.001, or 0.0001
hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
loss=keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
return model
A few notes on the code above (see the sketch after this list for more search-space methods):
- The Int() method defines the search space for the dense units. It lets you set a minimum value, a maximum value, and a step size for incrementing between them.
- The learning rate uses the Choice() method, which lets you define a discrete set of values to include in the search space during hypertuning.
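Int() and Choice() are not the only search-space methods KerasTuner provides. A minimal sketch of two more (the hyperparameter names 'dropout' and 'use_bias' are just example names, not part of this tutorial's model):

def extra_search_space(hp):
    # Float(): a continuous range, here sampled in steps of 0.1
    dropout_rate = hp.Float('dropout', min_value=0.1, max_value=0.5, step=0.1)
    # Boolean(): a simple on/off choice
    use_bias = hp.Boolean('use_bias')
    return dropout_rate, use_bias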
Step 3 (instantiate the tuner and perform hypertuning)
You will use the Hyperband tuner, an algorithm developed for hyperparameter optimization. It uses adaptive resource allocation and early stopping to converge quickly on a high-performing model.
You can read more about the intuition behind it in the paper (https://arxiv.org/pdf/1603.06560.pdf).
The basic idea is simple to state, though a full treatment deserves a blog post of its own; if it is not clear, feel free to skip ahead.
Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs) and rounding it to the nearest integer.
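For the settings used below (max_epochs=10, factor=3), a quick sanity check of that formula:

import math

max_epochs, factor = 10, 3
# 1 + log_3(10) = 1 + 2.096... ≈ 3.096 -> 3 after rounding to the nearest integer
num_models_in_bracket = round(1 + math.log(max_epochs, factor))
print(num_models_in_bracket)  # 3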
# Instantiate the tuner
tuner = kt.Hyperband(model_builder, # the hypermodel
objective='val_accuracy', # objective to optimize
max_epochs=10,
factor=3, # factor which you have seen above
directory='dir', # directory to save logs
project_name='khyperband')
# hypertuning settings
tuner.search_space_summary()
Output:
# Search space summary
# Default search space size: 2
# units (Int)
# {'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
# learning_rate (Choice)
# {'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
Step 4 (search for the best hyperparameters)
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)

# Perform hypertuning
tuner.search(x_train, y_train, epochs=10, validation_split=0.2, callbacks=[stop_early])

best_hps = tuner.get_best_hyperparameters()[0]
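You can inspect the winning values directly; the names 'units' and 'learning_rate' below are the ones registered inside model_builder:

print(f"best number of units: {best_hps.get('units')}")
print(f"best learning rate: {best_hps.get('learning_rate')}")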
Step 5 (rebuild and train the model with the best hyperparameters)
# Build the model with the optimal hyperparameters
h_model = tuner.hypermodel.build(best_hps)
h_model.summary()
h_model.fit(x_train, y_train, epochs=10, validation_split=0.2)
Now you can evaluate this model:
h_eval_dict = h_model.evaluate(x_test, y_test, return_dict=True)
Comparison: with and without hyperparameter tuning
Baseline model performance:
BASELINE MODEL:
number of units in 1st Dense layer: 512
learning rate for the optimizer: 0.0010000000474974513
loss: 0.08013473451137543
accuracy: 0.9794999957084656
Hypertuned model performance:
HYPERTUNED MODEL:
number of units in 1st Dense layer: 224
learning rate for the optimizer: 0.0010000000474974513
loss: 0.07163219898939133
accuracy: 0.979200005531311
- If the baseline model takes longer to train than the hypertuned model, that is because the hypertuned model has fewer neurons (224 vs. 512), which makes it faster.
- The hypertuned model is also more robust: comparing the loss of the baseline model with the loss of the hypertuned model (0.0801 vs. 0.0716), we can say the tuned model generalizes better.
Endnote
Thank you for reading this article. I hope you found it helpful and that you will apply Keras Tuner to your own neural networks to build better models.