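The Hong Kong Polytechnic University | Data-efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
2022-07-03 16:25:00 [BAAI Community (智源社区)]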
[Title] Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
[Authors] C. Chen, Y.P. Huang, W.H.K. Lam, T.L. Pan, S.C. Hsu, A. Sumalee, R.X. Zhong
[Published] 2022.6.28
[Paper link] https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22001929/pdf
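[Why we recommend it] Existing data-driven and feedback traffic control strategies do not account for the heterogeneity of real-time data measurements. Conventional reinforcement learning (RL) methods for traffic control also tend to converge slowly because they are not data efficient. In addition, conventional optimal perimeter control schemes require exact knowledge of the system dynamics, which leaves them vulnerable to endogenous uncertainties. This paper proposes an integral reinforcement learning (IRL) based approach that learns the macroscopic traffic dynamics in order to achieve adaptive optimal perimeter control. Main contributions: (a) A continuous-time control with discrete gain updates is developed to accommodate discrete-time sensor data. (b) To reduce sample complexity and use the available data more efficiently, the experience replay (ER) technique is introduced into the IRL algorithm. (c) The proposed method relaxes the requirement for model calibration in a "model-free" manner, achieving robustness against modeling uncertainty and improving real-time performance through a data-driven RL algorithm. (d) The convergence of the IRL-based algorithm and the stability of the controlled traffic dynamics are proven theoretically. The optimal control law is parameterized and then approximated by neural networks (NN), which reduces the computational complexity.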
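To make contributions (a) and (b) more concrete, here is a minimal Python sketch of integral RL with a replay buffer. It is not the paper's algorithm or traffic model. It assumes a toy scalar linear system dx/dt = a*x + b*u with quadratic cost q*x^2 + r*u^2, a quadratic value function V(x) = p*x^2 in place of the neural network approximator, and hypothetical names (`ReplayBuffer`, `rollout_interval`, `evaluate_policy`, `policy_iteration`). The learner never uses `a` directly: it only fits the integral Bellman relation p*(x(t)^2 - x(t+T)^2) = integrated cost over [t, t+T] from stored samples, and it updates the feedback gain at discrete instants while the control acts continuously in between.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (phi, cost) samples; the oldest samples are dropped first."""

    def __init__(self, capacity):
        self.samples = deque(maxlen=capacity)

    def add(self, phi, cost):
        self.samples.append((phi, cost))

    def clear(self):
        self.samples.clear()

    def sample(self, batch_size, rng):
        pool = list(self.samples)
        return rng.sample(pool, min(batch_size, len(pool)))


def rollout_interval(x0, a, b, k, q, r, T, steps=200):
    """Simulate dx/dt = a*x + b*u with u = -k*x over one interval of length T
    (forward Euler). Returns the final state and the integrated running cost."""
    dt = T / steps
    x, cost = x0, 0.0
    for _ in range(steps):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt
        x += (a * x + b * u) * dt
    return x, cost


def evaluate_policy(buffer, batch_size, rng):
    """Least-squares fit of p in V(x) = p*x^2 from a replayed batch of
    integral Bellman samples: p * phi = cost, with phi = x(t)^2 - x(t+T)^2."""
    batch = buffer.sample(batch_size, rng)
    num = sum(phi * cost for phi, cost in batch)
    den = sum(phi * phi for phi, _ in batch)
    return num / den


def policy_iteration(iterations=6, seed=0):
    """Alternate policy evaluation (from replayed data) and gain updates."""
    a, b = -1.0, 1.0      # toy dynamics, used only by the simulator
    q, r = 1.0, 1.0       # state and control cost weights (assumed values)
    T = 0.1               # measurement interval between gain updates
    k = 1.0               # initial stabilizing feedback gain
    rng = random.Random(seed)
    buffer = ReplayBuffer(capacity=50)
    p = 0.0
    for _ in range(iterations):
        # Samples collected under an old gain do not describe the new policy,
        # so this sketch refills the buffer after every gain update.
        buffer.clear()
        for _ in range(20):
            x0 = rng.uniform(0.5, 2.0)
            xT, cost = rollout_interval(x0, a, b, k, q, r, T)
            buffer.add(x0 * x0 - xT * xT, cost)
        p = evaluate_policy(buffer, batch_size=10, rng=rng)
        k = b * p / r     # policy improvement: u = -(b*p/r) * x
    return p, k


if __name__ == "__main__":
    p, k = policy_iteration()
    # For these toy parameters the Riccati solution is sqrt(2) - 1.
    print(p, k, math.sqrt(2.0) - 1.0)
```

For these assumed parameters the learned `p` should land close to the Riccati solution sqrt(2) - 1, up to a small bias from the Euler integration. Refilling the buffer after each gain update is a simplification made here; how the paper reuses past data in its ER scheme is not described in this summary.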