Hong Kong Polytechnic University | data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
2022-07-03 16:27:00 【Zhiyuan community】
【Title】 Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
【Authors】 C. Chen, Y.P. Huang, W.H.K. Lam, T.L. Pan, S.C. Hsu, A. Sumalee, R.X. Zhong
【Date of publication】 2022.6.28
【Paper link】 https://www.sciencedirect.com/sdfe/reader/pii/S0968090X22001929/pdf
【Reason for recommendation】 Existing data-driven and feedback traffic control strategies do not consider the heterogeneity of real-time data measurements. In addition, conventional reinforcement learning (RL) for traffic control usually converges slowly because it is not data efficient. Moreover, conventional optimal perimeter control schemes require exact knowledge of the system dynamics, so they are vulnerable to endogenous uncertainties. This paper proposes an integral reinforcement learning (IRL) approach that learns the macroscopic traffic dynamics in order to achieve adaptive optimal perimeter control. The main contributions are: (a) a continuous-time control with discrete gain updates is developed to accommodate discrete-time sensor data; (b) to reduce sample complexity and use the available data more efficiently, experience replay (ER) is introduced into the IRL algorithm; (c) the proposed method is model-free, which relaxes the requirement for model calibration, and the data-driven RL algorithm provides robustness to modeling uncertainty and improves real-time performance; (d) the convergence of the IRL-based algorithm and the stability of the controlled traffic dynamics are proved theoretically. The optimal control law is parameterized and then approximated by a neural network (NN), which reduces the computational complexity.
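The experience replay idea in contribution (b) can be illustrated with a minimal Python sketch. This is a generic replay buffer, not the paper's algorithm: the class name `ReplayBuffer`, the transition layout `(state, action, cost, next_state)`, and uniform sampling are all assumptions made for illustration. It only shows how past measurements can be stored and reused across several learning updates instead of being discarded after one use.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random.

    Hypothetical helper for illustration; the paper's own data structure
    and sampling rule may differ.
    """

    def __init__(self, capacity, seed=None):
        # deque with maxlen drops the oldest transition once full
        self._buffer = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, cost, next_state):
        """Store one observed transition."""
        self._buffer.append((state, action, cost, next_state))

    def sample(self, batch_size):
        """Return up to batch_size stored transitions, without replacement."""
        k = min(batch_size, len(self._buffer))
        return self._rng.sample(list(self._buffer), k)

    def __len__(self):
        return len(self._buffer)
```

In a learning loop, each new sensor measurement would be passed to `add`, and every update of the controller parameters would draw a batch with `sample`, so that each measurement can contribute to more than one update.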