Deep Learning Theory - Overfitting, Underfitting, Regularization, Optimizers
2022-08-04 06:19:00 【Learning Adventures】
Data augmentation: 1. Do not overdo it; excessive augmentation only increases training time without improving generalization. 2. Avoid adding extraneous data that is unrelated to the task.
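As a rough illustration of "light" augmentation, the sketch below applies a random horizontal flip and a small shift to a 2-D image with NumPy. The function name `augment` and the parameter `max_shift` are made up for this example, and it assumes that flipping and shifting do not change the label, which is not true for every task (digits and text, for instance).

```python
import numpy as np

def augment(image, rng, max_shift=2):
    """Return a lightly augmented copy of a 2-D image of shape (H, W)."""
    out = image.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]  # random horizontal flip
    shift = int(rng.integers(-max_shift, max_shift + 1))
    # Small horizontal shift; np.roll wraps pixels around the border.
    out = np.roll(out, shift, axis=1)
    return out

rng = np.random.default_rng(0)
image = np.arange(16, dtype=np.float32).reshape(4, 4)
augmented = augment(image, rng)
```

Keeping `max_shift` small reflects the first point above: the transformed sample should still look like a plausible training sample.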
L2 regularization: pushes the model to respond to the characteristics shared by the training samples rather than to the quirks of individual samples. It penalizes large weights, so the model prefers solutions with small parameters, which reduces the risk of overfitting.
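To make the "small parameters" effect concrete, here is a minimal sketch that fits a linear model by gradient descent on the loss $\frac{1}{2n}\lVert Xw - y\rVert^2 + \frac{\lambda}{2}\lVert w\rVert^2$, once without and once with the L2 term. The helper `fit_linear`, the synthetic data, and the value `l2=1.0` are assumptions chosen only for illustration.

```python
import numpy as np

def fit_linear(X, y, l2=0.0, lr=0.1, steps=500):
    """Gradient descent on half the mean squared error plus (l2 / 2) * ||w||^2."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        # Data-fit gradient plus the L2 penalty gradient (l2 * w).
        grad = X.T @ (X @ w - y) / n + l2 * w
        w -= lr * grad
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 10))                # few samples, many features
y = X[:, 0] + 0.1 * rng.normal(size=20)      # only feature 0 matters

w_plain = fit_linear(X, y)
w_l2 = fit_linear(X, y, l2=1.0)
print(np.linalg.norm(w_plain), np.linalg.norm(w_l2))
```

The penalized fit ends with a smaller weight norm, which is the mechanism the note describes: weights that only help fit noise are shrunk toward zero.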
Several common optimizers
For sparse data, prefer an optimizer with an adaptive learning rate (for example Adagrad, RMSprop, or Adam). The learning rate then needs little manual tuning, and the default hyperparameters usually work well.
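One way to see why adaptive methods suit sparse data is to look at Adagrad, used here only as a representative adaptive optimizer. The sketch assumes a toy setup in which one parameter gets a gradient at every step and the other only at every tenth step; `adagrad_step` is an illustrative helper, not a library function.

```python
import numpy as np

def adagrad_step(w, grad, cache, lr=0.1, eps=1e-8):
    """One Adagrad update; cache accumulates squared gradients per parameter."""
    cache += grad ** 2
    w -= lr * grad / (np.sqrt(cache) + eps)
    return w, cache

w = np.zeros(2)
cache = np.zeros(2)
for step in range(100):
    # Parameter 0 sees a gradient every step; parameter 1 only every 10th step.
    grad = np.array([1.0, 1.0 if step % 10 == 0 else 0.0])
    w, cache = adagrad_step(w, grad, cache)

# Per-parameter step size that the next nonzero gradient would be scaled by.
effective_lr = 0.1 / (np.sqrt(cache) + 1e-8)
print(effective_lr)
```

The rarely updated parameter keeps a larger effective learning rate, so infrequent features still make progress without any manual per-parameter tuning.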
Stochastic gradient descent (SGD) usually takes longer to train and can get stuck near saddle points, but with good initialization and a learning rate schedule its results are more reliable.
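A learning rate schedule can be as simple as step decay. The sketch below runs SGD on a toy quadratic with noisy gradients and halves the learning rate every 100 steps; the function name `sgd_step_decay`, the noise level, and all hyperparameter values are assumptions picked for the example.

```python
import numpy as np

def sgd_step_decay(w0, rng, lr0=0.1, decay=0.5, decay_every=100, steps=300):
    """SGD on f(w) = 0.5 * ||w||^2 with noisy gradients and step-decay schedule."""
    w = np.asarray(w0, dtype=float).copy()
    for t in range(steps):
        lr = lr0 * decay ** (t // decay_every)   # 0.1, then 0.05, then 0.025
        noisy_grad = w + 0.1 * rng.normal(size=w.shape)  # true gradient is w
        w -= lr * noisy_grad
    return w

rng = np.random.default_rng(0)
w_final = sgd_step_decay([5.0, -3.0], rng)
print(w_final)
```

Large early steps move quickly toward the minimum, while the smaller late steps reduce the jitter caused by gradient noise.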
Overall, Adam is usually the best default choice.
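For reference, the Adam update rule can be sketched in a few lines of NumPy. The values `beta1=0.9`, `beta2=0.999`, and `eps=1e-8` are the commonly cited defaults; the learning rate `0.01`, the step count, and the badly scaled quadratic are assumptions chosen to keep the demo short, and `adam_minimize` is an illustrative helper rather than any library's API.

```python
import numpy as np

def adam_minimize(grad_fn, w0, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=2000):
    """Minimize a function with the Adam update rule, given its gradient."""
    w = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w)  # running mean of gradients (first moment)
    v = np.zeros_like(w)  # running mean of squared gradients (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the zero init
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

# Toy objective f(w) = 0.5 * sum(scales * w**2) with very different curvatures.
scales = np.array([100.0, 1.0])
w_adam = adam_minimize(lambda w: scales * w, [1.0, 1.0])
print(w_adam)
```

Because each coordinate is divided by its own gradient magnitude estimate, both coordinates move at a similar pace despite the 100x difference in curvature, which is part of why Adam tends to work with little tuning.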