Hyperparameter Optimization - Excerpt
2022-07-31 06:14:00 【Young_win】
Introduction to Hyperparameter Optimization
When building a deep learning model, you must make many decisions: how many layers should be stacked? How many units or filters should each layer contain? Should the activation be relu or some other function? Should BatchNormalization be used after a certain layer? What dropout rate should you use? These architecture-level parameters are called hyperparameters. Model parameters, by contrast, can be trained by backpropagation.
The way to tune hyperparameters is generally to "formulate a principled, systematic, automatic way to explore the space of possible decisions": search the architecture space and empirically find the best-performing architectures. The hyperparameter optimization process works as follows:
1.) Select a set of hyperparameters (automatically);
2.) Build the corresponding model;
3.) Fit the model on the training data and measure its final performance on the validation data;
4.) Select the next set of hyperparameters to try (automatically);
5.) Repeat this process;
6.) Finally, measure the model's performance on the test data.
The key to this process is the algorithm that, given the history of validation performance across many sets of hyperparameters, chooses the next set of hyperparameters to evaluate. Many algorithms are possible: Bayesian optimization, genetic algorithms, simple random search, and so on. The simplest case, random search, is sketched below.
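A minimal sketch of this loop using simple random search. The helpers `sample_hyperparameters` and `build_model` and the pre-split arrays `x_train`, `y_train`, `x_val`, `y_val` are hypothetical, not from the original text:

```python
import random

from tensorflow import keras


def sample_hyperparameters():
    # Randomly pick one point in a small, discrete search space (illustrative).
    return {
        "num_layers": random.choice([1, 2, 3]),
        "units": random.choice([32, 64, 128]),
        "dropout": random.choice([0.0, 0.25, 0.5]),
    }


def build_model(params):
    # Build a Keras model from one set of hyperparameters (hypothetical example).
    model = keras.Sequential()
    for _ in range(params["num_layers"]):
        model.add(keras.layers.Dense(params["units"], activation="relu"))
        model.add(keras.layers.Dropout(params["dropout"]))
    model.add(keras.layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="rmsprop",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


best_params, best_val_acc = None, 0.0
for _ in range(20):
    params = sample_hyperparameters()          # 1.) select hyperparameters
    model = build_model(params)                # 2.) build the model
    history = model.fit(x_train, y_train,      # 3.) fit, measure on validation data
                        epochs=10, batch_size=128,
                        validation_data=(x_val, y_val), verbose=0)
    val_acc = max(history.history["val_accuracy"])
    if val_acc > best_val_acc:                 # keep the best configuration so far
        best_params, best_val_acc = params, val_acc
# 6.) Finally, retrain with best_params and evaluate once on the test data.
```

Note that plain random search ignores the history of validation performance; smarter algorithms like Bayesian optimization use it to decide what to try next.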
Hyperparameter optimization vs. model parameter optimization
Training the model weights is relatively simple: compute the loss function on a mini-batch of data, then use the backpropagation algorithm to move the weights in the correct direction.
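For contrast, a single weight-update step looks like this in TensorFlow (a sketch; `model`, `loss_fn`, `optimizer`, and the batch tensors are assumed to exist):

```python
import tensorflow as tf


@tf.function
def train_step(model, loss_fn, optimizer, x_batch, y_batch):
    # One weight update: forward pass, loss on the mini-batch,
    # backpropagation, then a gradient step on the trainable weights.
    with tf.GradientTape() as tape:
        predictions = model(x_batch, training=True)
        loss = loss_fn(y_batch, predictions)
    gradients = tape.gradient(loss, model.trainable_weights)
    optimizer.apply_gradients(zip(gradients, model.trainable_weights))
    return loss
```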
Hyperparameter optimization is harder: (1.) Computing the feedback signal - whether this set of hyperparameters leads to a high-performance model on the task - can be extremely expensive, since it requires creating a new model and training it from scratch; (2.) The hyperparameter space is typically made up of many discrete decisions and is therefore neither continuous nor differentiable. Hence, in general, you cannot do gradient descent in hyperparameter space. Instead, you must rely on gradient-free optimization methods, which are far less efficient than gradient descent.
In general, random search - randomly choosing hyperparameters to evaluate, over and over - is the best solution.
a.) The Python library Hyperopt is a hyperparameter optimization tool that internally uses Tree-structured Parzen Estimators (TPE) to predict which sets of hyperparameters are likely to yield good results.
b.) The Hyperas library integrates Hyperopt with Keras models.
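A small Hyperopt sketch, assuming the hypothetical `build_model` and data splits from the earlier example; `fmin`, `tpe`, and `hp` are Hyperopt's actual public API:

```python
from hyperopt import STATUS_OK, Trials, fmin, hp, tpe

# Search space: discrete and continuous hyperparameters mixed together.
space = {
    "num_layers": hp.choice("num_layers", [1, 2, 3]),
    "units": hp.choice("units", [32, 64, 128]),
    "dropout": hp.uniform("dropout", 0.0, 0.5),
}


def objective(params):
    # Build and train a model for this set of hyperparameters, then report
    # the validation loss; Hyperopt minimizes this value via TPE.
    model = build_model(params)
    history = model.fit(x_train, y_train, epochs=10, batch_size=128,
                        validation_data=(x_val, y_val), verbose=0)
    return {"loss": min(history.history["val_loss"]), "status": STATUS_OK}


trials = Trials()
best = fmin(objective, space, algo=tpe.suggest,  # TPE guides the search
            max_evals=50, trials=trials)
print(best)  # the best hyperparameters found
```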
An important issue to keep in mind when doing hyperparameter optimization at scale is validation-set overfitting. Because you compute a signal from the validation data and then update the hyperparameters based on that signal, you are effectively training the hyperparameters on the validation data, and they will quickly overfit it.
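One common safeguard (a sketch, not from the original text; `x`, `y`, and `best_params` are assumed from context): hold out a test split that the search never sees, and touch it exactly once at the end.

```python
from sklearn.model_selection import train_test_split

# Hold out a test set before the search begins; the search loop may only
# ever see the train and validation splits.
x_trainval, x_test, y_trainval, y_test = train_test_split(
    x, y, test_size=0.2, random_state=42)
x_train, x_val, y_train, y_val = train_test_split(
    x_trainval, y_trainval, test_size=0.25, random_state=42)

# ... run the hyperparameter search using only the train/validation splits ...

# Evaluate the final chosen model exactly once on the untouched test set.
final_model = build_model(best_params)
final_model.fit(x_train, y_train, epochs=10, batch_size=128, verbose=0)
test_loss, test_acc = final_model.evaluate(x_test, y_test, verbose=0)
```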