About snake equation (2)
2022-07-08 01:27:00 【yangxue_ mifen】
Reference source: "What is the gradient descent method?", Zhihu (zhihu.com), by Ma Tongxue (马同学)
Listing the norm (modulus) of the gradient vector at each iteration shows that it keeps decreasing, which is why this method is called the gradient descent method.
This is easy to understand: when the norm of the gradient finally tends to 0, the iterate sits at (or near) a point where the gradient vanishes.
So the gradient descent method has found the minimum (or a point near it).
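A minimal sketch of this observation, assuming the test function f(x, y) = x^2 + y^2, the starting point (3, 4), and a step size of 0.1 (none of these are taken from the original text). It runs plain gradient descent and records the norm of the gradient at every iteration.

```python
import math

def grad(x, y):
    """Gradient of the assumed test function f(x, y) = x**2 + y**2."""
    return 2.0 * x, 2.0 * y

def gradient_norms(x, y, step=0.1, iterations=10):
    """Run gradient descent and return the gradient norm seen at each iteration."""
    norms = []
    for _ in range(iterations):
        gx, gy = grad(x, y)
        norms.append(math.hypot(gx, gy))
        # Move against the gradient, scaled by the step size.
        x, y = x - step * gx, y - step * gy
    return norms

norms = gradient_norms(3.0, 4.0)
for k, n in enumerate(norms):
    print(f"iteration {k}: |grad f| = {n:.4f}")
```

On this function each update multiplies the gradient norm by 0.8, so the printed values shrink toward 0. The steady decrease is a property of this convex example with a small enough step, not a general guarantee.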
Setting the step size:
As mentioned above, the step size controls how far each iteration moves. Let's look at how different step sizes affect the final result.
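In symbols, one common way to write a single gradient descent update, assuming η denotes the step size and x_k the current point (notation chosen here for illustration):

```latex
x_{k+1} = x_k - \eta \, \nabla f(x_k),
\qquad
\lVert x_{k+1} - x_k \rVert = \eta \, \lVert \nabla f(x_k) \rVert
```

The distance moved in one iteration is the step size times the gradient norm, which is why the step size controls how far each move goes.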
The step size is too small
If the step size is set too small, the iterate is still far from the bottom of the valley after 20 iterations; in fact, it has not reached the bottom even after 100 iterations.
Appropriate step size
If the step size is set to a more appropriate value, the minimum is nearly found after 10 iterations.
The step size is too large
If the step size is set too large, the iterate oscillates back and forth (the plot appears to show only two points, but the iterate is in fact jumping back and forth between them).
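The three cases above can be reproduced with a small sketch. It assumes the one-dimensional function f(x) = x^2, the starting point 10, and the step sizes 0.01, 0.2, and 1.0; all of these are illustrative choices, not the values used in the original text.

```python
def descend(x0, step, iterations):
    """Gradient descent on the assumed function f(x) = x**2; returns all iterates."""
    xs = [x0]
    for _ in range(iterations):
        gradient = 2.0 * xs[-1]          # f'(x) = 2x
        xs.append(xs[-1] - step * gradient)
    return xs

x0 = 10.0
too_small = descend(x0, step=0.01, iterations=100)
suitable = descend(x0, step=0.2, iterations=10)
too_large = descend(x0, step=1.0, iterations=6)

print("step 0.01, after 20 iterations:", round(too_small[20], 3))
print("step 0.01, after 100 iterations:", round(too_small[100], 3))
print("step 0.2, after 10 iterations:", round(suitable[10], 3))
print("step 1.0, iterates:", too_large)
```

With step 0.01 the iterate is still at roughly 6.7 after 20 iterations and roughly 1.3 after 100. With step 0.2 it is within about 0.06 of the minimum after 10 iterations. With step 1.0 each update maps x to -x, so the iterate bounces between 10 and -10 forever.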
To sum up, with different step sizes the value of the function being optimized evolves differently as the number of iterations increases.
Finding the right step size is a matter of craft. In a real project you can plot the function value against the iteration count and adjust the step size by hand according to the shape of the curve:
If the curve is almost linear (the blue line), the step size is probably too small and needs to be raised.
If the curve goes up (the red line), the step size is clearly too large and needs to be lowered.
If the curve drops very quickly at first and then barely changes (the brown line), the step size is probably on the large side and needs to be lowered.
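As a rough sketch of how such curves could be produced and checked in code, the example below records the function value at every iteration for a few step sizes. It assumes the toy function f(x) = x^2; the helper `rough_hint` and its tolerance are hypothetical and cover only the "almost linear" and "going up" cases. The "fast drop, then flat" case is left to visual inspection.

```python
def loss_history(step, iterations=30, x0=10.0):
    """Values of the assumed function f(x) = x**2 along a gradient descent run."""
    x = x0
    history = [x * x]
    for _ in range(iterations):
        x -= step * 2.0 * x              # f'(x) = 2x
        history.append(x * x)
    return history

def rough_hint(history, tolerance=0.05):
    """Hypothetical rule of thumb covering two of the cases described above."""
    total_drop = history[0] - history[-1]
    if total_drop <= 0:
        return "not decreasing: lower the step size"
    # On a straight line, the first half of the run accounts for half the drop.
    half_drop = history[0] - history[len(history) // 2]
    if abs(half_drop / total_drop - 0.5) < tolerance:
        return "almost linear: raise the step size"
    return "no hint from this rule: inspect the curve by eye"

for step in (0.001, 0.2, 1.1):
    history = loss_history(step)
    print(f"step {step}: f after 30 iterations = {history[-1]:.4g} -> {rough_hint(history)}")
```

The lists returned by `loss_history` can be passed to any plotting library to draw the curves and judge them by eye, which is the manual procedure the text describes.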