AI briefing: how to use loss surfaces for model ensembling
2022-07-28 00:07:00 【InfoQ】
1. Background

- The loss surface of a DNN has many local optima, and these optima are isolated from one another on the surface.
- This paper finds that the isolated local optima are joined by a connected region in weight space, and that the training loss stays very close throughout this region (see the middle and right panels of the following figure).

2. Reading
- What are the benefits of finding such a connected region?

3. Method and implementation details
3.1 Method
- Use a cyclical learning rate schedule, with the learning rate (lr) varying within [a1, a2].
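A cyclical schedule of this kind can be sketched as follows. This is only an illustration under assumptions: the text gives the range [a1, a2] but not the shape of the cycle, so the sketch assumes a piecewise-linear (triangular) shape, with a1 as the lower bound and a2 as the upper bound. The function name `cyclical_lr` and the parameter `cycle_length` are hypothetical.

```python
def cyclical_lr(iteration, cycle_length, a1, a2):
    """Learning rate for a 1-based iteration index.

    Assumed shape (not given in the text): within each cycle the rate
    rises linearly from a1 to a2 at mid-cycle, then falls back to a1
    at the end of the cycle.
    """
    # Position within the current cycle, in the interval (0, 1].
    t = ((iteration - 1) % cycle_length + 1) / cycle_length
    if t <= 0.5:
        # First half of the cycle: interpolate from a1 towards a2.
        return (1 - 2 * t) * a1 + 2 * t * a2
    # Second half of the cycle: interpolate from a2 back to a1.
    return (2 - 2 * t) * a2 + (2 * t - 1) * a1
```

With this shape, the rate equals a1 at the end of every cycle, which is where the implementation steps collect a model.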

3.2 Implementation details
- Train an initial model with weights w using standard SGD.
- Starting from w as the initial weights, continue training with the cyclical learning rate.
- Save the model at the end of each cycle, then fuse (ensemble) the predictions of the saved models.
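The three steps above can be sketched on a toy problem. Everything here is an assumption made for illustration: the model is a single scalar weight fitted to y = 2x, a full-batch gradient stands in for SGD, the cycle is triangular with a1 as the lower bound, and the predictions are fused by simple averaging (the text does not say how they are fused). All function names are hypothetical.

```python
def cyclical_lr(step, cycle_length, a1, a2):
    """Assumed triangular cycle: a1 -> a2 at mid-cycle -> a1 at cycle end."""
    t = ((step - 1) % cycle_length + 1) / cycle_length
    if t <= 0.5:
        return (1 - 2 * t) * a1 + 2 * t * a2
    return (2 - 2 * t) * a2 + (2 * t - 1) * a1


def grad(w, data):
    """Gradient of the mean squared error for the toy model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)


def train_initial(w, data, lr, steps):
    """Step 1: ordinary training with a constant learning rate."""
    for _ in range(steps):
        w = w - lr * grad(w, data)
    return w


def collect_snapshots(w, data, num_cycles, cycle_length, a1, a2):
    """Steps 2 and 3a: continue from w with a cyclical learning rate and
    keep a copy of the weights at the end of every cycle."""
    snapshots = []
    for step in range(1, num_cycles * cycle_length + 1):
        w = w - cyclical_lr(step, cycle_length, a1, a2) * grad(w, data)
        if step % cycle_length == 0:
            snapshots.append(w)
    return snapshots


def ensemble_predict(snapshots, x):
    """Step 3b: fuse predictions by averaging (an assumed fusion rule)."""
    return sum(w * x for w in snapshots) / len(snapshots)


# Toy data drawn from y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w0 = train_initial(0.0, data, lr=0.05, steps=20)
snapshots = collect_snapshots(w0, data, num_cycles=3, cycle_length=4,
                              a1=0.005, a2=0.05)
prediction = ensemble_predict(snapshots, 4.0)
```

In a real DNN the snapshots would be full sets of weights and the fused quantity would typically be class probabilities. The toy problem is convex, so all snapshots land on the same solution; it only demonstrates the control flow.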
