Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey (Paper Reading Notes)
2022-07-03 02:39:00 【strawberry47】
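This is a survey of intelligent transportation systems that focuses on applying reinforcement learning (RL) to traffic signal control (TSC).
1. Overview
- Categories:
AI-based transportation applications:
① management applications,
② public transportation,
③ autonomous vehicles
This part also introduces many basic RL concepts, such as target networks and experience replay. These are standard topics in reinforcement learning; see my other notes for details.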
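As a reminder of what the two concepts named above look like in code, here is a minimal sketch. It is not taken from the survey: the class `ReplayBuffer`, the function `sync_target`, and the use of a plain dict for network parameters are all illustrative assumptions.

```python
import copy
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between transitions.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(online_params, target_params):
    """Hard update: overwrite the target parameters with a copy of the online ones.

    Parameters are modelled as a plain dict here (an assumption for illustration).
    """
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))
```

In a DQN-style loop, the agent would push one transition per step, train on sampled mini-batches, and call `sync_target` only every few hundred steps so that the bootstrapped targets change slowly.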
Traffic signal control:
State: queue length, vehicle positions, vehicle speeds
Goal: minimize congestion at the intersection
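2. Formulating Traffic Signal Control as a Deep RL Problem
2.1 State:
- RGB images, combined with DQN; a snapshot of the intersection (speed and position)
- Image-like representation / discrete traffic state encoding (DTSE). Advantage: it carries rich information, including speed, position, signal phase, and acceleration
- Feature-based value vector, e.g. queue length, cumulative waiting time, average waiting time in a lane, phase duration, number of vehicles in a lane
- Representations that take more complete road information into account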
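To make the DTSE idea concrete, the sketch below discretizes each incoming lane into cells and fills a presence matrix and a normalized speed matrix. The function name `encode_dtse`, the `(lane, position, speed)` vehicle tuples, and the convention that position is measured along the lane in metres are assumptions for illustration; papers differ in cell size and in which channels they include.

```python
def encode_dtse(vehicles, num_lanes, lane_length, cell_length, max_speed):
    """Build presence and speed grids of shape (num_lanes, num_cells).

    vehicles: iterable of (lane_index, position_m, speed_mps) tuples.
    """
    num_cells = int(lane_length // cell_length)
    presence = [[0] * num_cells for _ in range(num_lanes)]
    speed = [[0.0] * num_cells for _ in range(num_lanes)]
    for lane, position, velocity in vehicles:
        # Clamp to the last cell so a vehicle at the lane end stays in range.
        cell = min(int(position // cell_length), num_cells - 1)
        presence[lane][cell] = 1
        # Normalize speed to [0, 1] so both channels share a scale.
        speed[lane][cell] = min(velocity / max_speed, 1.0)
    return presence, speed
```

The two grids can be stacked as channels and fed to a CNN, which is how the image-like representation pairs naturally with DQN.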

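2.2 Action:
The typical setting is a four-way intersection, where both the direction and the duration have to be considered.
Four green phases: North-South Green (NSG), through traffic north-south; East-West Green (EWG), through traffic east-west; North-South Advance Left Green (NSLG), left turns north-south; East-West Advance Left Green (EWLG), left turns east-west.
- Phase selection: pick one of the four green phases
- Binary action: keep the current phase or switch
- Update the duration of each phase
Q: Is only the green light considered?
A: Some papers simplify this to two green phases, north-south green and east-west green, and ignore left turns.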
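The first two action definitions above can be sketched as follows. The `Phase` enum mirrors the four phases named in the notes; the cycle order in `CYCLE` and both function names are assumptions, and yellow or all-red transition intervals are left out for brevity.

```python
from enum import Enum


class Phase(Enum):
    NSG = 0   # North-South Green
    EWG = 1   # East-West Green
    NSLG = 2  # North-South Advance Left Green
    EWLG = 3  # East-West Advance Left Green


# Assumed fixed order used by the binary "switch" action.
CYCLE = [Phase.NSG, Phase.NSLG, Phase.EWG, Phase.EWLG]


def apply_phase_selection(action):
    """Phase-selection action space: the action index directly names the phase."""
    return Phase(action)


def apply_binary_action(current, action):
    """Binary action space: 0 keeps the current phase, 1 moves to the next one."""
    if action == 0:
        return current
    idx = CYCLE.index(current)
    return CYCLE[(idx + 1) % len(CYCLE)]
```

The binary formulation keeps the action space tiny at the cost of a fixed phase order, while phase selection lets the agent skip phases that have no demand.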
2.3 Reward:
- Waiting time
- Cumulative delay
- Queue length
- Absolute value of the traffic data (flow data)
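The notes only list the metrics, so the exact formulas below are assumptions: a common pattern is to negate a congestion metric, or to reward the decrease of a cumulative metric between two decision steps. Both function names are hypothetical.

```python
def queue_length_reward(queue_lengths):
    """Negative total queue length over all incoming lanes (shorter is better)."""
    return -float(sum(queue_lengths))


def waiting_time_reward(prev_total_wait, curr_total_wait):
    """Change in total waiting time since the previous decision step.

    Positive when the intersection's total waiting time went down.
    """
    return float(prev_total_wait - curr_total_wait)
```

For example, `queue_length_reward([3, 0, 2, 1])` gives `-6.0`, and `waiting_time_reward(40.0, 25.0)` gives `15.0`.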
2.4 Neural Network Structure:
- MLP
- CNN: combined with DQN
- RNN: for sequential data
- AutoEncoder
2.5 Simulation environments:
- Early: Java-based Green Light District (GLD)
- Popular: Simulation of Urban MObility (SUMO)
- Mature: VISSIM, AIMSUN (good interoperability with MATLAB)
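3. Applications of Deep RL to Traffic Signal Control

3.1 Standard RL Applications:
3.1.1 Single Agent Applications:
RL-based single intersection
The work is divided into single-intersection and multi-intersection settings.
Reference [57] uses queue length as the state and total delay as the reward. It is the first binary action model, and it is compared against fixed-time signal control.
Reference [60] is the first to consider a real intersection scenario; it proposes three state definitions ... and four reward functions.
(This part is essentially related work.)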
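For orientation, a setup like the one described for [57] (queue length as state, delay-based reward, binary action) fits a tabular method. The sketch below is a generic Q-learning update under that framing, not the algorithm from [57]; the function name, the tuple-of-queue-lengths state, and the values of `alpha` and `gamma` are assumptions.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.95, num_actions=2):
    """One tabular Q-learning step; q maps (state, action) to a value."""
    # Greedy value of the next state over the binary action set.
    best_next = max(q[(next_state, a)] for a in range(num_actions))
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage: state is a tuple of discretized queue lengths, reward is negative delay.
q_table = defaultdict(float)
q_learning_update(q_table, state=(3, 1), action=1, reward=-4.0, next_state=(2, 2))
```

Deep RL replaces the table with a neural network once the state (for example a DTSE grid) becomes too large to enumerate.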
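3.1.2 Multi-Agent Applications
Coordinated control of multiple intersections
- Four standard TSC algorithms (probably the common baselines): fixed-time control, random control, longest queue first, most vehicles first
- Classic algorithms (proposed by Wiering): TC-1, TC-2, TC-3
- The state consists of the traffic light configuration, vehicle positions, and vehicle destinations, covering both local and global features (not feasible in practice, because this vehicle information is unknown)
- The goal is to reduce waiting time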
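The four standard baselines listed above are simple enough to sketch directly. Function names and input shapes are assumptions, as is the reading that "longest queue" counts halted vehicles per phase while "most vehicles" counts all vehicles on the approaches served by a phase.

```python
import random


def fixed_time(step, num_phases, phase_duration):
    """Cycle through the phases, holding each for phase_duration steps."""
    return (step // phase_duration) % num_phases


def random_control(num_phases):
    """Pick any phase uniformly at random."""
    return random.randrange(num_phases)


def longest_queue_first(queue_per_phase):
    """Serve the phase whose lanes hold the longest queue (ties: lowest index)."""
    return max(range(len(queue_per_phase)), key=lambda p: queue_per_phase[p])


def most_cars_first(cars_per_phase):
    """Serve the phase whose lanes hold the most vehicles (ties: lowest index)."""
    return max(range(len(cars_per_phase)), key=lambda p: cars_per_phase[p])
```

Each function returns a phase index, so any of them can stand in for a learned policy when comparing average waiting time in a simulator.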
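Follow-up work improving on Wiering's approach:
① Add congestion information from other intersections
② Increase the state size (by adding congestion information)
③ Add a congestion factor (instead of increasing the state space)
④ Add congestion and accident information
⑤ Consider coordination information
⑥ Multi-objective: vehicle stops, average waiting time, and maximum queue length are targeted as objectives for low, medium, and high traffic volume; the reward function is designed differently for each scenario
Khamis's work:
① Bayesian transition probabilities -> reward function
② More specific objectives
③ Seven objectives, plus a cooperative exploration function
Related work:
① Hierarchical reinforcement learning
② R-Markov Average Reward
③ Consider coordination information between regions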
3.2 Deep RL Applications:
3.2.1 Single Agent Applications:
3.2.2 Multi-Agent Deep RL:
4. Deep RL for Other ITS Applications