
[PPO attitude control] Simulink simulation of UAV attitude control trained with the Proximal Policy Optimization (PPO) reinforcement learning algorithm

2022-06-09 06:53:00 FPGA and MATLAB

1. Software version

MATLAB R2019b

2. Theoretical background of the algorithm

      The PPO algorithm was proposed by OpenAI. It is a policy gradient (Policy Gradient) method. Traditional policy gradient algorithms are highly sensitive to the step size, and an optimal step size is difficult to choose: if the new policy drifts too far from the old policy during training, the final learning result suffers. To address this, PPO introduces a new objective function that allows the policy to be updated in small mini-batches over multiple training steps, which resolves the step-size selection problem of traditional policy gradient methods, while its implementation complexity remains much lower than that of TRPO. PPO is commonly implemented in two ways: the first runs the simulation on the CPU, while the second uses GPU-accelerated simulation and runs more than three times faster than the first. Compared with conventional supervised-learning neural networks, the difficulty with reinforcement learning networks lies in computing the gradient and the loss function; PPO, however, strikes a good balance among algorithmic complexity, accuracy, and ease of implementation.
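The "new objective function" mentioned above is PPO's clipped surrogate objective, which limits how far the updated policy can move away from the old policy in a single step:

L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \mathrm{clip}\left(r_t(\theta),\,1-\epsilon,\,1+\epsilon\right)\hat{A}_t\right)\right],
\qquad r_t(\theta) = \frac{\pi_\theta(a_t\mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t\mid s_t)}

where \hat{A}_t is the advantage estimate at time step t and \epsilon is the clip range, a small constant (e.g. 0.2).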

As a result, the PPO algorithm is relatively simple to implement: its formulation is similar to TRPO's, but the restriction on the policy update is imposed through a simple parameter (the clip range applied to the policy ratio) rather than a hard constraint.
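As a minimal illustration of that clipping operation, the following plain MATLAB snippet evaluates the clipped surrogate objective for one mini-batch. The toy data and variable names are illustrative assumptions and are not taken from the original Simulink model.

% Minimal sketch of PPO's clipped surrogate objective in plain MATLAB.
% Toy data and variable names below are illustrative assumptions.
epsilon    = 0.2;                 % clip range
advantages = randn(64, 1);        % advantage estimates A_t (toy data)
logPiNew   = 0.1 * randn(64, 1);  % log pi_theta(a_t | s_t), new policy (toy data)
logPiOld   = zeros(64, 1);        % log pi_theta_old(a_t | s_t), old policy (toy data)

% Probability ratio r_t(theta), computed in log space for numerical stability
ratio = exp(logPiNew - logPiOld);

% Unclipped and clipped surrogate terms; PPO maximizes the mean of their
% element-wise minimum, which discourages overly large policy updates
unclipped = ratio .* advantages;
clipped   = min(max(ratio, 1 - epsilon), 1 + epsilon) .* advantages;
Lclip     = mean(min(unclipped, clipped));
fprintf('Clipped surrogate objective for this mini-batch: %.4f\n', Lclip);

In an actual training loop, this quantity (with its sign flipped) would serve as the loss minimized by the actor network's optimizer.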


Copyright notice
This article was created by [FPGA and MATLAB]. When reposting, please include a link to the original article; thank you.
https://yzsam.com/2022/160/202206090640047567.html