The training set Loss converges, but the test set Loss oscillates violently?
2022-08-05 11:38:00 【GIS and Climate】
Problem scenario
Today, while debugging a model, I found that the Loss had already converged on the training set, but on the validation set the Loss oscillated quite severely, as shown in the figure below:

Cause analysis
After going through various blog posts online, I found that an oscillating validation Loss can have the following causes:
- Data problems, for example the training set and the validation set differ too much, or there is too little data;
- The batch size is too small, so what the model learns is not "general" enough;
- The Loss function is not suitable;
- The learning rate is too large, so the model gets stuck in a local optimum;
- There is a problem with the model's network structure;
- ......
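To make the batch size item more concrete, here is a minimal sketch of why a smaller batch gives a noisier update signal. It assumes synthetic per-sample values standing in for per-sample gradients (they are not taken from the model in this post), and the helper name `batch_mean_spread` is made up for illustration. The spread of a mini-batch mean shrinks roughly with the square root of the batch size, so going from 32 to 48 should reduce it by around 20%.

```python
import random
import statistics


def batch_mean_spread(per_sample_values, batch_size, trials=2000, seed=0):
    """Standard deviation of the mini-batch mean over many random batches."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(per_sample_values, batch_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


# Synthetic stand-in for per-sample gradient values (an assumption,
# not data from the article's experiment).
data_rng = random.Random(42)
per_sample = [data_rng.gauss(0.0, 1.0) for _ in range(10_000)]

spread_32 = batch_mean_spread(per_sample, 32)
spread_48 = batch_mean_spread(per_sample, 48)
print(f"bs=32: {spread_32:.4f}  bs=48: {spread_48:.4f}")
```

Noisier updates make the weights bounce around more from step to step, which is one plausible way the validation Loss ends up oscillating even though the training Loss looks converged.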
With the possible causes in hand, I checked them one by one.
- Data: I checked how my dataset is split into train and valid, and the two distributions are roughly the same; with 10,000+ images, the amount of data should also be acceptable;
- Loss function: I switched to other Loss functions and the result was still the same;
- Learning rate: it uses a dynamic adjustment strategy, so it should not be the problem (in later tests, even after changing the initial learning rate, the end result was still similar);
- Model: it is a fairly classic super-resolution model, so it should not be a big problem;
- Batch size: I increased bs from 32 to 48 and found that the oscillation became smaller, as shown in the figure below:


So in the end the cause is most likely that the batch size was too small. Increasing it a bit more would probably work even better, but unfortunately the GPU does not have enough memory.
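One common workaround when memory limits the batch size is gradient accumulation. It was not tried in this post, so treat the following as a sketch under assumptions: a toy one-parameter model y ≈ w * x with a mean squared error Loss, and hypothetical helper names `mse_grad` and `accumulated_grad`. It shows that averaging the gradients of three micro-batches of 32 gives the same gradient as one batch of 96, while only one micro-batch has to be held at a time.

```python
import math
import random


def mse_grad(w, xs, ys):
    """Gradient of the mean squared error of y ≈ w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w, xs, ys, micro_batch_size):
    """Average the gradients of equally sized micro-batches."""
    assert len(xs) % micro_batch_size == 0, "sketch assumes equal micro-batches"
    num_steps = len(xs) // micro_batch_size
    total = 0.0
    for i in range(num_steps):
        lo, hi = i * micro_batch_size, (i + 1) * micro_batch_size
        # Dividing by num_steps makes the running sum equal the large-batch mean.
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / num_steps
    return total


# Synthetic toy data (an assumption, unrelated to the article's dataset).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(96)]
ys = [3.0 * x + rng.gauss(0.0, 0.1) for x in xs]

full = mse_grad(0.5, xs, ys)               # one batch of 96
accum = accumulated_grad(0.5, xs, ys, 32)  # three micro-batches of 32
print(math.isclose(full, accum, rel_tol=1e-9))
```

The equivalence is exact only when the Loss is a plain mean over samples; layers that compute statistics per batch, such as batch normalization, still see the small micro-batch.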
Summary
If the Loss has converged on the training set but oscillates severely on the validation set, go through the possible causes one by one and test each of them. Before launching each run, think the change through on paper first; otherwise you may simply waste computing power.
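When comparing such runs, it can help to put a number on "how much the curve oscillates" instead of judging plots by eye. The following is a minimal sketch, assuming one validation Loss value per epoch; the function name `oscillation` and the two curves are made up for illustration and are not the measurements from this post.

```python
import statistics


def oscillation(losses, window=10):
    """Mean absolute epoch-to-epoch change over the last `window` epochs."""
    tail = losses[-window:]
    return statistics.fmean([abs(b - a) for a, b in zip(tail, tail[1:])])


# Made-up validation curves for two hypothetical runs.
run_a = [0.30, 0.42, 0.28, 0.45, 0.27, 0.44]
run_b = [0.30, 0.34, 0.29, 0.33, 0.28, 0.32]

osc_a = oscillation(run_a, window=6)
osc_b = oscillation(run_b, window=6)
print(f"run_a: {osc_a:.3f}  run_b: {osc_b:.3f}")
```

A lower value means a smoother tail of the curve, so the same number can be tracked across changes to batch size, learning rate, or Loss function.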

References
【1】https://blog.csdn.net/qq_40689236/article/details/106794155
【2】https://zhuanlan.zhihu.com/p/483488388