University of Electronic Science and Technology | Clustering Experience Replay for Effective Exploitation in Reinforcement Learning

2022-07-03 20:42:00 【Zhiyuan community】

【Title】Clustering Experience Replay for the Effective Exploitation in Reinforcement Learning

【Authors】Min Li, Tianyi Huang, William Zhu

【Publication date】2022.6.27

【Paper link】https://www.sciencedirect.com/science/article/pii/S0031320322003569

【Reason for recommendation】Reinforcement learning trains agents to make decisions using the transition experience generated by different decisions. To exploit this experience, most reinforcement learning methods replay the explored transitions through uniform sampling. Sampled this way, however, the last explored transitions are easily overlooked. Another way to exploit the experience is to assign each transition a priority based on its estimation error during training, then replay transitions according to their priorities. But this approach only updates the priorities of the transitions replayed at the current training time step, so transitions with lower priority are ignored. This paper proposes clustering experience replay, called CER, to effectively exploit the experience hidden in all the transitions explored so far in training. CER clusters and replays transitions through a divide-and-conquer framework based on time division. First, it divides the whole training process into several stages. Second, at the end of each stage, it uses k-means to cluster the transitions explored in that stage. Finally, it constructs a conditional probability density function to ensure that all kinds of transitions can be sufficiently replayed in the current training.
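The three steps above can be illustrated with a rough sketch. It is not the authors' implementation: the summary does not give the form of the conditional probability density function, so the sketch substitutes a simple stand-in (pick a cluster uniformly, then a transition uniformly inside it). The names `ClusteredReplayBuffer`, `featurize`, `stage_length`, and `k`, and the choice of features fed to k-means, are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def featurize(transition):
    """Flatten (state, action, reward, next_state) into one feature vector.
    Hypothetical choice: the source does not say which features are clustered."""
    state, action, reward, next_state = transition
    return tuple(state) + (float(action), float(reward)) + tuple(next_state)


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = rng or random.Random(0)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


class ClusteredReplayBuffer:
    """Stage-wise clustered replay buffer (illustrative sketch only)."""

    def __init__(self, k, stage_length, seed=0):
        self.k = k                        # clusters per stage (assumed fixed)
        self.stage_length = stage_length  # transitions per stage (assumed fixed)
        self.rng = random.Random(seed)
        self.current_stage = []           # transitions of the ongoing stage
        self.clusters = []                # non-empty clusters of finished stages

    def add(self, transition):
        self.current_stage.append(transition)
        if len(self.current_stage) == self.stage_length:
            self._close_stage()

    def _close_stage(self):
        # End of a stage: cluster the transitions explored during it.
        feats = [featurize(t) for t in self.current_stage]
        labels = kmeans(feats, min(self.k, len(feats)), rng=self.rng)
        groups = defaultdict(list)
        for transition, label in zip(self.current_stage, labels):
            groups[label].append(transition)
        self.clusters.extend(groups.values())
        self.current_stage = []

    def sample(self, batch_size):
        # Stand-in for the paper's density: uniform over clusters, then
        # uniform within the chosen cluster, so small clusters still get replayed.
        pools = self.clusters + ([self.current_stage] if self.current_stage else [])
        if not pools:
            raise ValueError("cannot sample from an empty buffer")
        return [self.rng.choice(self.rng.choice(pools)) for _ in range(batch_size)]
```

Sampling per cluster rather than per transition is what keeps rare kinds of transitions from being drowned out by common ones. A practical version would likely normalize the features before clustering and bound the buffer's memory, both of which this sketch leaves out.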

Copyright notice
This article was created by [Zhiyuan community]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/184/202207032031169246.html
