A Hyperparameter Optimization Technique: Hyperopt
Author: Guest Blog | Compiled by: VK | Source: Analytics Vidhya
Introduction
In a machine learning project, you follow a series of steps until you reach your goal. One of the steps you have to perform is optimizing the model you have selected. This task always comes after the model selection process (choosing the best model that performs better than the others).
What is hyperparameter optimization?
Before we define hyperparameter optimization, you need to know what hyperparameters are. In short, hyperparameters are the parameter values that control the learning process, and they have a significant impact on the performance of a machine learning model.
Examples of hyperparameters in the random forest algorithm are the number of estimators (n_estimators), the maximum depth (max_depth), and the splitting criterion (criterion). These parameters are tunable and directly affect the quality of the trained model.
Hyperparameter optimization means finding the right combination of hyperparameter values that achieves maximum performance on the data in a reasonable amount of time. It plays an important role in the prediction accuracy of a machine learning algorithm. Therefore, hyperparameter optimization is considered one of the most difficult parts of building a machine learning model.
Most machine learning algorithms come with default hyperparameter values. The defaults do not always work well on different types of machine learning projects, which is why you need to optimize them to find the combination that gives the best performance.
Good hyperparameters can make an algorithm shine.
Some common strategies for optimizing hyperparameters are:
(a) Grid search
This is a widely used traditional method that determines the optimal values for a given model through exhaustive hyperparameter tuning. Grid search works by trying every possible combination of the parameters in the model, which means it takes a lot of time to perform the entire search and can lead to very high computational costs.
Note: you can learn how to implement grid search here: https://github.com/Davisy/Hyperparameter-Optimization-Techniques/blob/master/GridSearchCV%20.ipynb
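As a quick, hedged sketch (not taken from the linked notebook), a grid search over a random forest with scikit-learn's GridSearchCV might look like this; the data and parameter grid below are placeholders.
# Minimal grid search sketch with scikit-learn (placeholder data and grid)
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

param_grid = {
    "n_estimators": [100, 200, 300],
    "max_depth": [5, 10, 15],
    "criterion": ["gini", "entropy"],
}

# Every combination in param_grid is evaluated with 5-fold cross-validation
grid = GridSearchCV(RandomForestClassifier(), param_grid, cv=5, scoring="accuracy")
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)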
(b) Random search
This method works differently: random combinations of hyperparameter values are used to find the best solution for the model being built. The disadvantage of random search is that it sometimes misses important points (values) in the search space.
Note: you can learn more about random search here: https://github.com/Davisy/Hyperparameter-Optimization-Techniques/blob/master/RandomizedSearchCV.ipynb
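For comparison, here is a similarly hedged sketch of random search with scikit-learn's RandomizedSearchCV on the same kind of placeholder data; only n_iter randomly sampled combinations are evaluated rather than every possible one.
# Minimal random search sketch with scikit-learn (placeholder data and distributions)
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

param_distributions = {
    "n_estimators": randint(100, 600),
    "max_depth": randint(1, 15),
    "criterion": ["gini", "entropy"],
}

# Only n_iter randomly sampled combinations are evaluated
search = RandomizedSearchCV(RandomForestClassifier(), param_distributions,
                            n_iter=20, cv=5, scoring="accuracy", random_state=42)
search.fit(X, y)
print(search.best_params_, search.best_score_)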
Hyperparameter optimization techniques
In this series of articles, I will introduce you to different advanced hyperparameter optimization techniques/methods that can help you obtain the best parameters for a given model. We will look at the following techniques:
- Hyperopt
- Scikit Optimize
- Optuna
In this article, I will focus on the implementation of Hyperopt.
What is Hyperopt?
Hyperopt is a powerful Python library for hyperparameter optimization developed by James Bergstra. Hyperopt uses a form of Bayesian optimization for parameter tuning that allows you to obtain the best parameters for a given model. It can optimize models with hundreds of hyperparameters on a large scale.
Features of Hyperopt
Hyperopt has four important features that you need to know in order to run your first optimization.
(a) Search space
Hyperopt provides different functions to specify ranges for input parameters; these are stochastic search spaces. The most common options for a search space are:
- hp.choice(label, options) - this can be used for categorical parameters. It returns one of the options, which should be a list or a tuple. Example: hp.choice("criterion", ["gini", "entropy"])
- hp.randint(label, upper) - this can be used for integer parameters. It returns a random integer in the range (0, upper). Example: hp.randint("max_features", 50)
- hp.uniform(label, low, high) - this returns a value uniformly between low and high. Example: hp.uniform("max_leaf_nodes", 1, 10)
Other options you can use include:
- hp.normal(label, mu, sigma) - this returns a real value drawn from a normal distribution with mean mu and standard deviation sigma
- hp.qnormal(label, mu, sigma, q) - this returns a value like round(normal(mu, sigma) / q) * q
- hp.lognormal(label, mu, sigma) - this returns a value drawn from exp(normal(mu, sigma))
- hp.qlognormal(label, mu, sigma, q) - this returns a value like round(exp(normal(mu, sigma)) / q) * q
You can learn more about the search space options here: https://github.com/hyperopt/hyperopt/wiki/FMin#21-parameter-expressions
Note: every stochastic expression to be optimized takes a label (for example, n_estimators) as its first argument. These labels are used to return the chosen parameter values to the caller during the optimization process.
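To make these expressions concrete, here is a minimal sketch (with hypothetical labels and ranges, not tied to the example later in this article) of how several of them can be combined into a single search-space dictionary:
# A small search space combining several expression types (hypothetical labels)
from hyperopt import hp

space = {
    "criterion": hp.choice("criterion", ["gini", "entropy"]),  # categorical choice
    "max_features": hp.randint("max_features", 50),            # integer in [0, 50)
    "subsample": hp.uniform("subsample", 0.5, 1.0),            # real value between low and high
    "learning_rate": hp.lognormal("learning_rate", 0, 1),      # exp(normal(mu, sigma))
}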
(b) Objective function
This is the function to minimize. It receives hyperparameter values from the search space as input and returns the loss. During optimization, we train the model with the chosen hyperparameter values, predict the target feature, and then evaluate the prediction error and return it to the optimizer. The optimizer decides which values to check next and iterates again. You will learn how to create an objective function in the practical example below.
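As a preview, a generic skeleton of such an objective function might look like the following (the model and the toy data here are placeholders; the practical example later in this article follows the same pattern):
# Skeleton of a Hyperopt objective function: receive sampled hyperparameters, return a loss
from hyperopt import STATUS_OK
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

def objective(params):
    model = RandomForestClassifier(**params)
    # Hyperopt minimizes the returned loss, so use the negative accuracy
    accuracy = cross_val_score(model, X, y, scoring="accuracy").mean()
    return {"loss": -accuracy, "status": STATUS_OK}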
(c) fmin
fmin is the optimization function that iterates over different sets of algorithms and their hyperparameters and then minimizes the objective function. fmin takes five inputs:
- The objective function to minimize
- The defined search space
- The search algorithm to use: random search, TPE (Tree of Parzen Estimators), or adaptive TPE. (Note: hyperopt.rand.suggest and hyperopt.tpe.suggest provide the logic for a sequential search of the hyperparameter space.)
- The maximum number of evaluations
- The trials object (optional)
Example :
from hyperopt import fmin, tpe, hp, Trials
trials = Trials()
best = fmin(fn=lambda x: x ** 2,
            space=hp.uniform('x', -10, 10),
            algo=tpe.suggest,
            max_evals=50,
            trials=trials)
print(best)
(d) Trials object
The Trials object is used to store all of the hyperparameters, losses, and other information, which means you can access them after running the optimization. In addition, trials can help you save and load important information and then resume the optimization process. (You will learn more about this in the practical example.)
from hyperopt import Trials
trials = Trials()
Having covered the important features of Hyperopt, the general way to use it is as follows:
- Initialize the space to search over.
- Define the objective function.
- Select the search algorithm to use.
- Run the hyperopt function.
- Analyze the evaluation results stored in the trials object.
Hyperopt in practice
Now that you know the important features of Hyperopt, we will use them in a practical example with the Mobile Price dataset. The task is to create a model that predicts the price range of a mobile device: 0 (low cost), 1 (medium cost), 2 (high cost), or 3 (very high cost).
Install Hyperopt
You can install hyperopt from PyPI:
pip install hyperopt
Then import the important packages
# Import package
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn import metrics
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from hyperopt import tpe, hp, fmin, STATUS_OK, Trials
from hyperopt.pyll.base import scope
import warnings
warnings.filterwarnings("ignore")
Dataset
Let's load the dataset from the data directory. For more information about this dataset, see: https://www.kaggle.com/iabhishekofficial/mobile-price-classification?select=train.csv
# Load data
data = pd.read_csv("data/mobile_price_data.csv")
Check the first five rows of the dataset.
# Reading data
data.head()
As you can see, our dataset contains a variety of numerical features.
Let's look at the shape of the dataset.
# Show shape
data.shape
(2000, 21)
The dataset has 2000 rows and 21 columns. Now let's look at the list of features in this dataset.
# Display list
list(data.columns)
['battery_power', 'blue', 'clock_speed', 'dual_sim', 'fc', 'four_g', 'int_memory', 'm_dep', 'mobile_wt', 'n_cores', 'pc', 'px_height', 'px_width', 'ram', 'sc_h', 'sc_w', 'talk_time', 'three_g', 'touch_screen', 'wifi', 'price_range']
You can find the meaning of each column name here :https://www.kaggle.com/iabhishekofficial/mobile-price-classification
Splitting the dataset into target and independent features
This is a classification problem, so we separate the target feature and the independent features from the dataset. Our target is the price range.
# Split the data into features and targets
X = data.drop("price_range", axis=1).values
y = data.price_range.values
Preprocessing the dataset
Then standardize the independent features with the StandardScaler method from scikit-learn.
# Standardize the feature variables
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
Define the parameter space for optimization
We will tune three hyperparameters of the random forest algorithm: n_estimators, max_depth, and criterion.
space = {
    "n_estimators": hp.choice("n_estimators", [100, 200, 300, 400, 500, 600]),
    "max_depth": hp.quniform("max_depth", 1, 15, 1),
    "criterion": hp.choice("criterion", ["gini", "entropy"]),
}
We have set different candidate values for the hyperparameters selected above. (Note: hp.quniform returns floating-point values; if your version of scikit-learn requires an integer max_depth, you can wrap the expression with scope.int, which is why scope was imported earlier.) Next, define the objective function.
Define the minimization function (objective function)
Our minimization function is called hyperparameter_tuning, and the classification algorithm whose hyperparameters we optimize is the random forest. I use cross-validation to avoid overfitting, and the function returns the loss value and its status.
# Define the objective function
def hyperparameter_tuning(params):
    clf = RandomForestClassifier(**params, n_jobs=-1)
    acc = cross_val_score(clf, X_scaled, y, scoring="accuracy").mean()
    return {"loss": -acc, "status": STATUS_OK}
Note: remember that hyperopt minimizes the function, which is why I add a negative sign to acc.
Fine-tune the model
Finally, first instantiate the Trials object, fine-tune the model, and then print the best loss along with its hyperparameter values.
# Initialize the Trials object
trials = Trials()
best = fmin(
    fn=hyperparameter_tuning,
    space=space,
    algo=tpe.suggest,
    max_evals=100,
    trials=trials
)
print("Best: {}".format(best))
100%|█████████████████████████████████████████████████████████| 100/100 [10:30<00:00, 6.30s/trial, best loss: -0.8915] Best: {'criterion': 1, 'max_depth': 11.0, 'n_estimators': 2}
After hyperparameter optimization, the best loss is -0.8915: with the random forest classifier using n_estimators=300, max_depth=11, and criterion="entropy", the model achieves an accuracy of 89.15%.
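Note that fmin reports hp.choice parameters by index (here criterion: 1 and n_estimators: 2), not by value. One way to map them back to the actual values, using hyperopt's space_eval on the space defined above, is sketched below:
# Map the index-based result from fmin back to the actual hyperparameter values
from hyperopt import space_eval

print(space_eval(space, best))
# Expected to print something like: {'criterion': 'entropy', 'max_depth': 11.0, 'n_estimators': 300}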
Analyzing the results with the trials object
The trials object helps us inspect all of the return values that were computed during the experiment.
(a) trials.results
This shows the list of dictionaries returned by the objective function during the search.
trials.results
[{'loss': -0.8790000000000001, 'status': 'ok'}, {'loss': -0.877, 'status': 'ok'}, {'loss': -0.768, 'status': 'ok'}, {'loss': -0.8205, 'status': 'ok'}, {'loss': -0.8720000000000001, 'status': 'ok'}, {'loss': -0.883, 'status': 'ok'}, {'loss': -0.8554999999999999, 'status': 'ok'}, {'loss': -0.8789999999999999, 'status': 'ok'}, {'loss': -0.595, 'status': 'ok'}, ...]
(b) trials.losses()
This shows a list of the losses.
trials.losses()
[-0.8790000000000001, -0.877, -0.768, -0.8205, -0.8720000000000001, -0.883, -0.8554999999999999, -0.8789999999999999, -0.595, -0.8765000000000001, -0.877, ………]
(c) trials.statuses()
This shows a list of the status strings.
trials.statuses()
['ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', 'ok', ...]
Note: this trials object can be saved, passed to the built-in plotting routines, or analyzed with your own custom code.
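For example, one common way to persist the trials object between runs is plain pickling; this is a hedged sketch rather than part of the original article, the file name below is arbitrary, and max_evals on the resumed run counts the total number of trials:
import pickle

# Save the trials object so a later run can analyze it or resume the search
with open("hyperopt_trials.pkl", "wb") as f:
    pickle.dump(trials, f)

# Load it back and continue optimizing with a larger evaluation budget
with open("hyperopt_trials.pkl", "rb") as f:
    trials = pickle.load(f)

best = fmin(fn=hyperparameter_tuning, space=space, algo=tpe.suggest,
            max_evals=150, trials=trials)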
Conclusion
Congratulations, you have reached the end of this article!
You can download the dataset and notebooks used in this article here: https://github.com/Davisy/Hyperparameter-Optimization-technologies
Link to the original article: https://www.analyticsvidhya.com/blog/2020/09/alternative-hyperparameter-optimization-technique-you-need-to-know-hyperopt/