Hyperopt: A Hyperparameter Optimization Technique
2020-11-06 23:29:00 | Artificial intelligence meets pioneer
Author: Guest Blog | Compiled by: VK | Source: Analytics Vidhya
Introduction
In a machine learning project, you need to follow a series of steps until you reach your goal. One of the steps you have to perform is optimizing the model you have chosen. This task always comes after the model selection process (choosing the best model, the one that performs better than the others).
What is hyperparameter optimization?
Before we define hyperparameter optimization, you need to know what hyperparameters are. In short, hyperparameters are the parameter values used to control the learning process, and they have a significant impact on the performance of a machine learning model.
Examples of hyperparameters in the random forest algorithm are the number of estimators (n_estimators), the maximum depth (max_depth), and the splitting criterion (criterion). These parameters are tunable and directly affect the quality of the trained model.
Hyperparameter optimization means finding the combination of hyperparameter values that achieves the best performance on the data within a reasonable amount of time. It plays an important role in the prediction accuracy of a machine learning algorithm. Hyperparameter optimization is therefore considered one of the hardest parts of building a machine learning model.
Most machine learning algorithms come with default hyperparameter values. Default values do not always perform well on different types of machine learning projects, which is why you need to optimize them to find the combination that gives the best performance.
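To see what those defaults look like, you can print a model's default hyperparameters before deciding what to tune. A minimal sketch using scikit-learn's random forest (the same estimator used later in this article):

from sklearn.ensemble import RandomForestClassifier

# Inspect the default hyperparameter values of a model
print(RandomForestClassifier().get_params())
# e.g. {'criterion': 'gini', 'max_depth': None, 'n_estimators': 100, ...}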
Good hyperparameters can make an algorithm shine.
There are some common strategies for optimizing hyperparameters:
(a) Grid search
This is a widely used traditional method that finds the best hyperparameter values for a given model through an exhaustive search. Grid search works by trying every possible combination of parameter values for the model, which means the full search takes a lot of time to run and can lead to a very high computational cost.
Note: you can learn how to implement grid search here: https://github.com/Davisy/Hyperparameter-Optimization-Techniques/blob/master/GridSearchCV%20.ipynb
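As a rough illustration of the idea (not the linked notebook's code), here is a minimal grid search sketch with scikit-learn's GridSearchCV; the parameter grid and the X, y variables are placeholder assumptions:

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Every combination in param_grid is evaluated (3 x 3 = 9 candidates here)
param_grid = {
    "n_estimators": [100, 300, 500],
    "max_depth": [5, 10, 15],
}
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)  # X, y: your feature matrix and target vector
print(grid.best_params_)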
(b) Random search
This method works differently: random combinations of hyperparameter values are used to find the best solution for the model being built. The drawback of random search is that it can sometimes miss important points (values) in the search space.
Note: you can learn more about random search here: https://github.com/Davisy/Hyperparameter-Optimization-Techniques/blob/master/RandomizedSearchCV.ipynb
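For comparison, a minimal random search sketch with scikit-learn's RandomizedSearchCV; again, the distributions and the X, y variables are placeholder assumptions:

from scipy.stats import randint
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Only n_iter random combinations are sampled instead of trying them all
param_distributions = {
    "n_estimators": randint(100, 600),
    "max_depth": randint(1, 15),
}
search = RandomizedSearchCV(RandomForestClassifier(), param_distributions,
                            n_iter=20, cv=5, scoring="accuracy", random_state=42)
search.fit(X, y)  # X, y: your feature matrix and target vector
print(search.best_params_)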
Hyperparameter optimization techniques
In this series of articles, I will introduce you to different advanced hyperparameter optimization techniques/methods that can help you obtain the best parameters for a given model. We will look at the following techniques:
- Hyperopt
- Scikit Optimize
- Optuna
In this article, I will focus on the implementation of Hyperopt.
What is Hyperopt?
Hyperopt is a powerful Python library for hyperparameter optimization developed by James Bergstra. Hyperopt uses a form of Bayesian optimization for parameter tuning that allows you to get the best parameters for a given model. It can optimize a model with hundreds of parameters on a large scale.
Features of Hyperopt
Hyperopt has four important features you need to know in order to run your first optimization.
(a) Search space
Hyperopt has different functions for specifying ranges of input parameters; these are stochastic search spaces. The most common search options are:
- hp.choice(label, options): this can be used for categorical parameters. It returns one of the options, which should be a list or tuple. Example: hp.choice("criterion", ["gini", "entropy"])
- hp.randint(label, upper): can be used for integer parameters. It returns a random integer in the range (0, upper). Example: hp.randint("max_features", 50)
- hp.uniform(label, low, high): returns a value uniformly distributed between low and high. Example: hp.uniform("max_leaf_nodes", 1, 10)
Other options you can use include:
- hp.normal(label, mu, sigma): returns a real value that is normally distributed with mean mu and standard deviation sigma
- hp.qnormal(label, mu, sigma, q): returns a value like round(normal(mu, sigma) / q) * q
- hp.lognormal(label, mu, sigma): returns a value drawn according to exp(normal(mu, sigma))
- hp.qlognormal(label, mu, sigma, q): returns a value like round(exp(normal(mu, sigma)) / q) * q
You can learn more about search space options here: https://github.com/hyperopt/hyperopt/wiki/FMin#21-parameter-expressions
Note: every optimizable stochastic expression has a label (for example, n_estimators) as its first argument. These labels are used to return parameter choices to the caller during the optimization process.
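If you want to see what these expressions produce, you can draw samples from a search space with hyperopt's stochastic sampler. A small sketch (the space below is purely illustrative):

from hyperopt import hp
from hyperopt.pyll.stochastic import sample

space = {
    "criterion": hp.choice("criterion", ["gini", "entropy"]),
    "max_features": hp.randint("max_features", 50),
    "max_leaf_nodes": hp.uniform("max_leaf_nodes", 1, 10),
}

# Each call draws one random configuration from the search space
print(sample(space))
# e.g. {'criterion': 'entropy', 'max_features': 7, 'max_leaf_nodes': 3.8}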
(b) Objective function
This is the function to minimize. It receives hyperparameter values from the search space as input and returns a loss. During optimization we train the model with the chosen hyperparameter values, predict the target feature, then evaluate the prediction error and return it to the optimizer. The optimizer decides which values to check next and iterates again. You will learn how to create an objective function in the practical example.
(c) fmin
fmin is the optimization function that iterates over different sets of algorithms and their hyperparameters and then minimizes the objective function. fmin takes five inputs:
- The objective function to minimize
- The defined search space
- The search algorithm to use, such as random search, TPE (Tree-Parzen Estimators), and adaptive TPE. Note: hyperopt.rand.suggest and hyperopt.tpe.suggest provide the logic for a sequential search of the hyperparameter space.
- The maximum number of evaluations
- The trials object (optional)
Example:

from hyperopt import fmin, tpe, hp, Trials

trials = Trials()

best = fmin(
    fn=lambda x: x ** 2,             # objective function to minimize
    space=hp.uniform('x', -10, 10),  # search space for x
    algo=tpe.suggest,                # search algorithm
    max_evals=50,                    # maximum number of evaluations
    trials=trials)

print(best)  # e.g. {'x': a value close to 0}, since x ** 2 is minimized at x = 0
(d) Trials object
The Trials object is used to store all of the hyperparameters, losses, and other information, which means you can access them after running the optimization. In addition, trials can help you save and load important information and then continue the optimization process. (You will learn more about this in the practical example.)
from hyperopt import Trials
trials = Trials()
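One common pattern, sketched below under the assumption that an objective function and search space are already defined (here they refer to the hyperparameter_tuning function and space dictionary built in the practical example later in this article), is to pickle the Trials object and later pass it back to fmin with a larger max_evals so the optimization resumes where it stopped:

import pickle
from hyperopt import fmin, tpe, Trials

trials = Trials()
best = fmin(fn=hyperparameter_tuning, space=space, algo=tpe.suggest,
            max_evals=50, trials=trials)

# Save the optimization state
with open("trials.pkl", "wb") as f:
    pickle.dump(trials, f)

# Later: reload it and continue from evaluation 51
with open("trials.pkl", "rb") as f:
    trials = pickle.load(f)

best = fmin(fn=hyperparameter_tuning, space=space, algo=tpe.suggest,
            max_evals=100, trials=trials)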
Now that you understand the important features of Hyperopt, here is how to use it:
- Initialize the space over which to search.
- Define the objective function.
- Select the search algorithm to use.
- Run the hyperopt function.
- Analyze the evaluation outputs stored in the trials object.
Hyperopt in practice
Now that you know the important features of Hyperopt, in this practical example we will use the mobile price dataset. The task is to create a model that predicts how expensive a mobile device is: 0 (low cost), 1 (medium cost), 2 (high cost), or 3 (very high cost).
Install Hyperopt
You can install hyperopt from PyPI:
pip install hyperopt
Then import the important packages
# Import packages
import numpy as np
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

from hyperopt import tpe, hp, fmin, STATUS_OK, Trials
from hyperopt.pyll.base import scope

import warnings
warnings.filterwarnings("ignore")
Dataset
Let's load the dataset from the data directory. You can find more information about this dataset here: https://www.kaggle.com/iabhishekofficial/mobile-price-classification?select=train.csv
# Load data
data = pd.read_csv("data/mobile_price_data.csv")
Check the first five rows of the dataset.
# Display the first five rows
data.head()
As you can see, our dataset contains a number of numerical features.
Let's look at the shape of the dataset.
# Show shape
data.shape
(2000, 21)
The dataset has 2000 rows and 21 columns. Now let's look at the list of features in this dataset.
# Display list
list(data.columns)
['battery_power', 'blue', 'clock_speed', 'dual_sim', 'fc', 'four_g', 'int_memory', 'm_dep', 'mobile_wt', 'n_cores', 'pc', 'px_height', 'px_width', 'ram', 'sc_h', 'sc_w', 'talk_time', 'three_g', 'touch_screen', 'wifi', 'price_range']
You can find the meaning of each column name here :https://www.kaggle.com/iabhishekofficial/mobile-price-classification
Split the dataset into target and independent features
This is a classification problem, so we separate the target feature and the independent features from the dataset. Our target feature is price_range.
# Split the data into features and targets
X = data.drop("price_range", axis=1).values
y = data.price_range.values
Preprocess the dataset
Then standardize the independent features using the StandardScaler method from scikit-learn.
# Standardize the feature variables
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
Define the parameter space for optimization
We will use three hyperparameters of the random forest algorithm: n_estimators, max_depth, and criterion.
space = {
    "n_estimators": hp.choice("n_estimators", [100, 200, 300, 400, 500, 600]),
    "max_depth": hp.quniform("max_depth", 1, 15, 1),
    "criterion": hp.choice("criterion", ["gini", "entropy"]),
}
We have set different value options for the hyperparameters selected above. Next, we define the objective function.
Define the minimization function (objective function)
Our minimization function is called hyperparameter_tuning, and the classification algorithm whose hyperparameters we optimize is random forest. I use cross-validation to avoid overfitting, and the function returns a loss value and its status.
# Define the objective function
def hyperparameter_tuning(params):
    # hp.quniform returns floats, so cast max_depth to int for scikit-learn
    params["max_depth"] = int(params["max_depth"])
    clf = RandomForestClassifier(**params, n_jobs=-1)
    acc = cross_val_score(clf, X_scaled, y, scoring="accuracy").mean()
    return {"loss": -acc, "status": STATUS_OK}
Note: remember that hyperopt minimizes the function, which is why I add a negative sign to acc.
Fine-tune the model
Finally, instantiate a Trials object, fine-tune the model, and then print the best loss with its hyperparameter values.
# Initialize the Trials object
trials = Trials()

best = fmin(
    fn=hyperparameter_tuning,
    space=space,
    algo=tpe.suggest,
    max_evals=100,
    trials=trials
)

print("Best: {}".format(best))
100%|█████████████████████████████████████████████████████████| 100/100 [10:30<00:00, 6.30s/trial, best loss: -0.8915] Best: {'criterion': 1, 'max_depth': 11.0, 'n_estimators': 2}
After hyperparameter optimization the best loss is -0.8915, obtained with a random forest classifier using n_estimators=300, max_depth=11, and criterion="entropy"; the model's accuracy is 89.15%. Note that for hp.choice parameters, best reports the index of the chosen option rather than the value itself, so 'criterion': 1 corresponds to "entropy" and 'n_estimators': 2 corresponds to 300.
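Because best only stores the index for hp.choice parameters, you can use hyperopt's space_eval helper to map it back to the actual values:

from hyperopt import space_eval

# Map the returned indices back to the real hyperparameter values
print(space_eval(space, best))
# e.g. {'criterion': 'entropy', 'max_depth': 11.0, 'n_estimators': 300}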
Analyze the results with the trials object
The trials object helps us inspect all of the return values that were calculated during the experiment.
(a) trials.results
This shows the list of dictionaries returned by the objective function during the search.
trials.results
[{'loss': -0.8790000000000001, 'status': 'ok'}, {'loss': -0.877, 'status': 'ok'}, {'loss': -0.768, 'status': 'ok'}, {'loss': -0.8205, 'status': 'ok'}, {'loss': -0.8720000000000001, 'status': 'ok'}, {'loss': -0.883, 'status': 'ok'}, {'loss': -0.8554999999999999, 'status': 'ok'}, {'loss': -0.8789999999999999, 'status': 'ok'}, {'loss': -0.595, 'status': 'ok'}, ...]
(b) trials.losses()
This shows the list of losses:
trials.losses()
[-0.8790000000000001, -0.877, -0.768, -0.8205, -0.8720000000000001, -0.883, -0.8554999999999999, -0.8789999999999999, -0.595, -0.8765000000000001, -0.877, ...]
(c) trials.statuses()
This shows the list of status strings:
trials.statuses()
['ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', ...]
Note: this trials object can be saved, passed to built-in plotting routines, or analyzed with your own custom code.
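For example, here is a minimal sketch of visualizing the optimization history with matplotlib (assuming matplotlib is installed):

import matplotlib.pyplot as plt

# Loss per trial; lower is better (here loss = negative accuracy)
plt.plot(trials.losses())
plt.xlabel("Trial")
plt.ylabel("Loss (negative accuracy)")
plt.show()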
Conclusion
Congratulations, you have finished this article!
You can download the dataset and notebook used in this article here: https://github.com/Davisy/Hyperparameter-Optimization-technologies
Link to the original article: https://www.analyticsvidhya.com/blog/2020/09/alternative-hyperparameter-optimization-technique-you-need-to-know-hyperopt/