The training set Loss converges, but the test set Loss oscillates violently?
2022-08-05 11:38:00 【GIS and Climate】
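Problem scenario
While debugging a model today, I found that the Loss on the training set had already converged, but the Loss on the validation set oscillated quite strongly, as shown in the figure below:
[Figure: Loss curves on the training and validation sets]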
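To compare runs by more than eyeballing the curves, the oscillation can be reduced to a single number. Below is a minimal sketch, assuming the per-epoch Loss values are available as plain Python lists. The helper `oscillation_score` and the sample values are hypothetical and are not taken from the original experiment.

```python
from statistics import pstdev

def oscillation_score(losses, window=10):
    """Standard deviation of epoch-to-epoch changes over the last `window` epochs."""
    tail = losses[-window:]
    diffs = [b - a for a, b in zip(tail, tail[1:])]
    return pstdev(diffs) if diffs else 0.0

# Made-up values for illustration only.
train_loss = [1.0, 0.6, 0.4, 0.30, 0.25, 0.22, 0.21, 0.20, 0.20, 0.20]
valid_loss = [1.1, 0.7, 0.9, 0.5, 0.8, 0.45, 0.75, 0.5, 0.8, 0.45]

print("train:", round(oscillation_score(train_loss, window=6), 4))
print("valid:", round(oscillation_score(valid_loss, window=6), 4))
```

A score near zero means the curve has flattened out; a clearly larger score on the validation set than on the training set matches the situation described here.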

Cause analysis
After reading various blogs online, I found that the validation Loss may oscillate for the following reasons:
- Data problems, for example the training set and the validation set differ too much, or the dataset is too small;
- The batch size is too small, so the patterns the model learns are not "general" enough;
- The loss function is unsuitable;
- The learning rate is too large, so the model gets stuck in a local optimum;
- There is a problem with the model's network structure;
- ......
With the possible causes known, they can be checked one by one.
- Data: I checked how my dataset is split into train and valid, and the two distributions are basically the same. With 10,000+ images, the data volume should also be adequate.
- Loss function: I switched to other loss functions and the result was still the same.
- Learning rate: I use a dynamic adjustment strategy, so this should not be the problem (in later tests, even changing the initial learning rate gave a similar final result).
- Model: it is a fairly classic super-resolution model, so this should not be a big issue.
- Batch size: I increased bs from 32 to 48 and the oscillation became smaller, as shown in the figure below:
[Figure: Loss curves after increasing the batch size to 48]
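One way to see why a larger batch could help: a mini-batch gradient is an average over samples, so for independent samples its standard deviation shrinks roughly as 1/sqrt(batch size). The sketch below checks this on a toy 1-D linear regression with synthetic data. It is an illustration under those assumptions, not the super-resolution model from this post, and `batch_gradient_std` is a hypothetical helper.

```python
import random
from statistics import pstdev

def batch_gradient_std(batch_size, n_batches=2000, w=0.0, seed=0):
    """Std of the mini-batch MSE gradient for y = 2x + noise, at a fixed weight w."""
    rng = random.Random(seed)
    grads = []
    for _ in range(n_batches):
        g = 0.0
        for _ in range(batch_size):
            x = rng.gauss(0.0, 1.0)
            y = 2.0 * x + rng.gauss(0.0, 0.5)
            g += 2.0 * (w * x - y) * x  # d/dw of (w*x - y)^2
        grads.append(g / batch_size)
    return pstdev(grads)

std_32 = batch_gradient_std(32)
std_48 = batch_gradient_std(48)
print(f"bs=32: {std_32:.3f}  bs=48: {std_48:.3f}  ratio: {std_48 / std_32:.3f}")
```

In theory the ratio is sqrt(32/48), about 0.82, so going from 32 to 48 only trims the gradient noise by roughly 18%. That is consistent with the oscillation getting smaller rather than disappearing.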


So my final conclusion is that the batch size was too small. Increasing it a bit more would probably help further, but unfortunately the GPU does not have enough memory.
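When GPU memory caps the batch size, a commonly used workaround is gradient accumulation: process several small micro-batches, sum their suitably weighted gradients, and update the weights once. This was not tried in the post. The sketch below uses a toy linear model in plain Python (the helpers `mse_grad` and `accumulated_grad` are hypothetical) to show that accumulating over 3 micro-batches of 16 reproduces the gradient of one batch of 48 for a mean-reduced loss. The equivalence does not hold exactly for layers that depend on batch statistics, such as BatchNorm.

```python
import random

def mse_grad(w, batch):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, batch, micro_size):
    """Full-batch gradient built up from micro-batches of `micro_size` samples."""
    total = 0.0
    for i in range(0, len(batch), micro_size):
        chunk = batch[i:i + micro_size]
        # Weight each micro-batch by its share of the full batch.
        total += mse_grad(w, chunk) * len(chunk) / len(batch)
    return total

# Synthetic data: y = 2x + noise.
rng = random.Random(0)
data = []
for _ in range(48):
    x = rng.gauss(0.0, 1.0)
    data.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))

full = mse_grad(0.5, data)               # one batch of 48
accum = accumulated_grad(0.5, data, 16)  # 3 micro-batches of 16
print("difference:", abs(full - accum))
```

In a real training loop the same idea costs more wall-clock time per update, but peak memory stays at the micro-batch level.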
Summary
If the Loss has converged on the training set but oscillates heavily on the validation set, go through the possible causes one by one and test each of them. Before each attempt, think through the theory first instead of just running the model; otherwise you may waste computing power.

References
【1】https://blog.csdn.net/qq_40689236/article/details/106794155
【2】https://zhuanlan.zhihu.com/p/483488388