Hyperparameter tuning of neural networks using Keras Tuner
2022-06-24 01:53:00 [Panchuang AI]
Panchuang AI Share
Author | AYUSH3987
Compiled by | Flin
Source | analyticsvidhya
Introduction
A neural network has many hyperparameters, and tuning them manually is very difficult. Keras Tuner makes tuning the hyperparameters of a neural network very easy, much like the grid search or random search you may know from classical machine learning.
In this article you will learn how to use Keras Tuner to tune the hyperparameters of a neural network. We will start with a very simple network, then tune its hyperparameters and compare the results. Along the way you will learn everything you need to know about Keras Tuner.
What is a hyperparameter?
Developing a deep learning model is an iterative process: you start with an initial architecture and reconfigure it until you get a model that can be trained effectively within your time and compute budget.
The settings you adjust in that loop are called hyperparameters: you have an idea, write the code, check the performance, and repeat until you reach good performance.
In other words, the tunable settings of a neural network are its hyperparameters, and the process of finding a good set of them is called hyperparameter tuning.
Hyperparameter tuning is a very important part of model building. Skipping it can cause major problems in the model, such as wasted training time, useless parameters, and so on.
Hyperparameters generally come in two types (see the sketch after this list):
- Model-based hyperparameters: the number of hidden layers, the number of neurons per layer, and so on.
- Algorithm-based hyperparameters: settings that affect the speed and quality of training, such as the learning rate in gradient descent.
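As a minimal illustration (the layer sizes and learning rate below are arbitrary example values, not recommendations), both kinds appear directly in ordinary Keras code:

from tensorflow import keras

# Model-based hyperparameters: the number of layers and the units per layer
model = keras.Sequential([
    keras.layers.Dense(128, activation='relu'),   # 128 units is a model-based hyperparameter
    keras.layers.Dense(10, activation='softmax'),
])

# Algorithm-based hyperparameter: the optimizer's learning rate
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])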
For more complex models the number of hyperparameters grows dramatically, and adjusting them manually becomes very challenging.
The benefit of Keras Tuner is that it takes over this most challenging task: with just a few lines of code, tuning hyperparameters becomes very easy.
Keras Tuner
Keras Tuner is a library for tuning the hyperparameters of neural networks; it helps you select the best hyperparameters for a neural network implemented in TensorFlow.
To install Keras Tuner, just run the following command:
pip install keras-tuner
But wait! Why do we need Keras Tuner at all?
The answer is that hyperparameters play an important role in developing a good model. They can make a big difference: they help you prevent overfitting, strike a good trade-off between bias and variance, and so on.
Tuning our hyperparameters with Keras Tuner
First we will develop a baseline model, and then we will rebuild the model using Keras Tuner. I will use TensorFlow for the implementation.
Step 1 (download and prepare the dataset)
from tensorflow import keras # importing keras
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data() # loading the data using keras datasets api
x_train = x_train.astype('float32') / 255.0 # normalize the training images
x_test = x_test.astype('float32') / 255.0 # normalize the testing images
Step 2 (develop the baseline model)
Now we will build our baseline neural network on the MNIST handwritten-digit dataset, so let's construct a deep neural network.
model1 = keras.Sequential()
model1.add(keras.layers.Flatten(input_shape=(28, 28)))  # flatten the 28 x 28 images
model1.add(keras.layers.Dense(units=512, activation='relu', name='dense_1'))  # 512 neurons with relu activation
model1.add(keras.layers.Dropout(0.2))  # dropout layer with a rate of 0.2
model1.add(keras.layers.Dense(10, activation='softmax'))  # output layer with 10 classes
Step 3 (compile and train the model)
Now that we have built our baseline model, it is time to compile and train it. We will use the Adam optimizer with a learning rate of 0.001, train for 10 epochs, and use a validation split of 0.2.
model1.compile(optimizer=keras.optimizers.Adam(learning_rate=0.001),
loss=keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
model1.fit(x_train, y_train, epochs=10, validation_split=0.2)
Step 4 (evaluate the model)
Now that training is done, we will evaluate our model on the test set to see how it performs.
model1_eval = model1.evaluate(x_test, y_test, return_dict=True)
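Because we passed return_dict=True, the result is a plain Python dict; a quick way to inspect it (the 'loss' and 'accuracy' keys follow from the loss and metrics configured above):

print(f"baseline loss: {model1_eval['loss']:.4f}")
print(f"baseline accuracy: {model1_eval['accuracy']:.4f}")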
Tuning the model with Keras Tuner
Step 1 (import libraries)
import tensorflow as tf
import kerastuner as kt  # newer releases of the package are imported as keras_tuner
Step 2 (build the hypermodel with Keras Tuner)
Now you will set up a hypermodel (the model you configure for hypertuning is called a hypermodel). We define it with a model-builder function; as you can see below, the function returns a compiled model with the hyperparameters to tune declared inline.
In the classification model below we will tune two hyperparameters: the number of neurons in the first dense layer and the learning rate of the Adam optimizer.
def model_builder(hp):
'''
Args:
hp - Keras tuner object
'''
# Initialize the Sequential API and start stacking the layers
model = keras.Sequential()
model.add(keras.layers.Flatten(input_shape=(28, 28)))
# Tune the number of units in the first Dense layer
# Choose an optimal value between 32-512
hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
model.add(keras.layers.Dense(units=hp_units, activation='relu', name='dense_1'))
# Add next layers
model.add(keras.layers.Dropout(0.2))
model.add(keras.layers.Dense(10, activation='softmax'))
# Tune the learning rate for the optimizer
# Choose an optimal value from 0.01, 0.001, or 0.0001
hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
loss=keras.losses.SparseCategoricalCrossentropy(),
metrics=['accuracy'])
return model
A few notes on the code above (see the sketch after this list for more search-space methods):
- The Int() method defines the search space for the dense units. It lets you set a minimum value, a maximum value, and a step size for incrementing between them.
- The learning rate uses the Choice() method, which lets you define a discrete set of values to include in the search space during hypertuning.
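Int() and Choice() are not the only search-space methods KerasTuner provides. A minimal sketch of two more (the hyperparameter names 'dropout' and 'use_bias' are just example names, not part of this tutorial's model):

def extra_search_space(hp):
    # Float(): a continuous range, here sampled in steps of 0.1
    dropout_rate = hp.Float('dropout', min_value=0.1, max_value=0.5, step=0.1)
    # Boolean(): a simple on/off choice
    use_bias = hp.Boolean('use_bias')
    return dropout_rate, use_bias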
Step 3 (instantiate the tuner and perform hypertuning)
You will use the Hyperband tuner, an algorithm developed for hyperparameter optimization. It uses adaptive resource allocation and early stopping to converge quickly on a high-performing model.
You can read more about the intuition behind it in the paper (https://arxiv.org/pdf/1603.06560.pdf).
The basic idea is simple to state, though a full treatment deserves a blog post of its own; if it is not clear, feel free to skip ahead.
Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs) and rounding it to the nearest integer.
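For the settings used below (max_epochs=10, factor=3), a quick sanity check of that formula:

import math

max_epochs, factor = 10, 3
# 1 + log_3(10) = 1 + 2.096... ≈ 3.096 -> 3 after rounding to the nearest integer
num_models_in_bracket = round(1 + math.log(max_epochs, factor))
print(num_models_in_bracket)  # 3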
# Instantiate the tuner
tuner = kt.Hyperband(model_builder, # the hypermodel
objective='val_accuracy', # objective to optimize
max_epochs=10,
factor=3, # factor which you have seen above
directory='dir', # directory to save logs
project_name='khyperband')
# hypertuning settings
tuner.search_space_summary()
Output:
# Search space summary
# Default search space size: 2
# units (Int)
# {'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': None}
# learning_rate (Choice)
# {'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}
Step 4 (search for the best hyperparameters)
stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)

# Perform hypertuning
tuner.search(x_train, y_train, epochs=10, validation_split=0.2, callbacks=[stop_early])

best_hps = tuner.get_best_hyperparameters()[0]
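You can inspect the winning values directly; the names 'units' and 'learning_rate' below are the ones registered inside model_builder:

print(f"best number of units: {best_hps.get('units')}")
print(f"best learning rate: {best_hps.get('learning_rate')}")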
Step 5 (rebuild and train the model with the best hyperparameters)
# Build the model with the optimal hyperparameters
h_model = tuner.hypermodel.build(best_hps)
h_model.summary()
h_model.fit(x_train, y_train, epochs=10, validation_split=0.2)
Now you can evaluate this model:
h_eval_dict = h_model.evaluate(x_test, y_test, return_dict=True)
Comparison: with and without hyperparameter tuning
Baseline model performance:
BASELINE MODEL:
number of units in 1st Dense layer: 512
learning rate for the optimizer: 0.0010000000474974513
loss: 0.08013473451137543
accuracy: 0.9794999957084656
Hypertuned model performance:
HYPERTUNED MODEL:
number of units in 1st Dense layer: 224
learning rate for the optimizer: 0.0010000000474974513
loss: 0.07163219898939133
accuracy: 0.979200005531311
- If the baseline model takes longer to train than the hypertuned model, that is because the hypertuned model has fewer neurons (224 vs. 512), which makes it faster.
- The hypertuned model is also more robust: comparing the loss of the baseline model with the loss of the hypertuned model (0.0801 vs. 0.0716), we can say the tuned model generalizes better.
Endnote
Thank you for reading this article. I hope you found it helpful and that you will apply Keras Tuner to your own neural networks to build better models.