Deep Reinforcement Learning for Intelligent Transportation Systems: A Survey (Paper Reading Notes)
2022-07-03 02:39:00 【strawberry47】
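This is a survey from the intelligent transportation field, focused on using reinforcement learning for traffic signal control (RL + TSC). TSC stands for Traffic Signal Control.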
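1. Overview
- Categories:
AI-based transportation applications:
① management applications,
② public transportation,
③ autonomous vehicles
This part of the survey also introduces many basic RL concepts, such as target networks and experience replay. These are standard topics in reinforcement learning; see my other notes for details.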
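As a quick reminder of the two concepts named above, here is a minimal sketch of an experience replay buffer and a hard target-network update. The class and function names, and the use of plain dicts as "network parameters", are my own assumptions for illustration and are not taken from the survey.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are evicted first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the temporal correlation between transitions.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def sync_target(online_params, target_params):
    """Hard update: overwrite the target parameters with the online ones."""
    target_params.clear()
    target_params.update(online_params)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=0, reward=-1.0, next_state=step + 1, done=False)
batch = buffer.sample(2)
```

In a DQN-style agent the target parameters would only be synced every fixed number of training steps, so the bootstrap target stays stable between syncs.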
Traffic signal control:
state: queue length, vehicle positions, vehicle speeds
Goal: minimize congestion at the intersection
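2. Formulating Traffic Signal Control as a Deep RL Problem
2.1 state:
- RGB image, combined with DQN; a snapshot of the intersection (speeds and positions)
- Image-like representation / discrete traffic state encoding (DTSE); advantage: carries rich information such as speed, position, signal phase, and acceleration
- Feature-based value vector; for example: queue length, cumulative waiting time, average waiting time on a lane, phase duration, number of vehicles on a lane
- Taking more complete road information into account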
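To make the DTSE idea concrete, here is a minimal sketch that discretizes each incoming lane into cells and stores occupancy and speed in two channels. The function name, the 5 m cell size, the 100 m lane length, and the two-channel layout are my own assumptions; the papers covered by the survey use various layouts.

```python
import numpy as np


def encode_dtse(vehicles, lane_length_m=100.0, cell_length_m=5.0, num_lanes=4):
    """Encode vehicles into a DTSE-style array of shape (2, num_lanes, num_cells).

    vehicles: iterable of (lane_index, distance_to_stop_line_m, speed_mps).
    Channel 0 holds cell occupancy (0 or 1); channel 1 holds the vehicle speed.
    """
    num_cells = int(lane_length_m // cell_length_m)
    observed_length_m = num_cells * cell_length_m
    grid = np.zeros((2, num_lanes, num_cells), dtype=np.float32)
    for lane, distance, speed in vehicles:
        if not (0 <= lane < num_lanes) or not (0.0 <= distance < observed_length_m):
            continue  # vehicle is outside the observed area
        cell = int(distance // cell_length_m)
        grid[0, lane, cell] = 1.0
        grid[1, lane, cell] = speed
    return grid


# Hypothetical snapshot: two slow vehicles near the stop line on lane 0,
# one fast vehicle further back on lane 2.
vehicles = [(0, 2.0, 0.0), (0, 8.5, 1.2), (2, 47.0, 13.9)]
state = encode_dtse(vehicles)
```

The resulting array can be fed to a CNN in the same way as an image, which is why this representation pairs naturally with DQN-style agents.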
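2.2 action:
The setting is usually a four-way intersection, so both the direction and the duration of each phase have to be considered.
Four green phases: North-South Green (NSG), through traffic north-south; East-West Green (EWG), through traffic east-west; North-South Advance Left Green (NSLG), left turns allowed north-south; East-West Advance Left Green (EWLG), left turns allowed east-west.
- Phase selection (choose one of the four green phases)
- Binary action: keep the current phase or switch direction
- Update the duration of each phase
Q: Do we only care about green phases?
A: Some papers simplify this to two green phases, north-south green and east-west green, and ignore left turns.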
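The first two action designs can be sketched as follows. The four phase names come from the notes; the function names and the fixed cycle order used by the binary action are my own assumptions.

```python
from enum import IntEnum


class Phase(IntEnum):
    NSG = 0   # North-South Green
    EWG = 1   # East-West Green
    NSLG = 2  # North-South Advance Left Green
    EWLG = 3  # East-West Advance Left Green


def apply_phase_selection(action: int) -> Phase:
    """Phase-selection action: the agent directly names the next green phase."""
    return Phase(action)


def apply_binary_action(current: Phase, action: int) -> Phase:
    """Binary action: 0 keeps the current phase, 1 moves to the next phase.

    The cycle order (NSG -> EWG -> NSLG -> EWLG -> NSG) is an assumption.
    """
    if action == 0:
        return current
    return Phase((current + 1) % len(Phase))


next_phase = apply_binary_action(Phase.EWLG, 1)  # wraps around to Phase.NSG
```

Phase selection gives the agent more freedom, while the binary design keeps the action space at two entries and guarantees that the phases are visited in a predictable order.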
2.3 reward:
- Waiting time
- Cumulative delay
- Queue length
- Absolute value of the traffic data
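Two of these reward signals can be sketched in a few lines. The sign conventions and the "difference between consecutive steps" form are assumptions on my part; they are one common way to turn a cost such as waiting time into a reward to maximize.

```python
def queue_length_reward(queue_lengths):
    """Negative total queue length over all incoming lanes."""
    return -float(sum(queue_lengths))


def waiting_time_delta_reward(prev_total_wait_s, curr_total_wait_s):
    """Reduction in cumulative waiting time since the previous decision step.

    Positive when the total waiting time went down, negative when it grew.
    """
    return float(prev_total_wait_s - curr_total_wait_s)


r_queue = queue_length_reward([3, 0, 5, 2])           # -10.0
r_wait = waiting_time_delta_reward(120.0, 95.0)       # 25.0
```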
2.4 Neural Network Structure:
- MLP
- CNN: combined with DQN
- RNN: for sequential data
- AutoEncoder
2.5 Simulation environments:
- Early: Java-based Green Light District (GLD)
- Popular: Simulation of Urban Mobility (SUMO)
- Mature: VISSIM, AIMSUN (interfaces well with MATLAB)
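3. Deep RL Applications in Traffic Signal Control
3.1 Standard RL Applications:
3.1.1 Single Agent Applications:
RL-based single intersection
The work is divided into single-intersection and multi-intersection traffic.
Reference [57] uses queue length as the state and total delay as the reward. It is the first binary action model, and it is compared against a fixed-time signal scenario.
Reference [60] is the first to use a real intersection scenario; it proposes three state definitions ... and four reward functions.
(This part is essentially related work.)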
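To illustrate the kind of setup described for reference [57] (queue-length state, delay-based reward, binary action), here is a minimal tabular sketch. Q-learning is swapped in as a generic example: these notes do not say which algorithm [57] actually uses, and all names, hyperparameters, and numbers below are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (0, 1)  # 0 = keep the current phase, 1 = switch phase


def make_q_table():
    """Q-table mapping a state (tuple of queue lengths) to one value per action."""
    return defaultdict(lambda: [0.0 for _ in ACTIONS])


def choose_action(q, state, epsilon, rng=random):
    """Epsilon-greedy selection over the binary action set."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = q[state]
    return max(ACTIONS, key=lambda a: values[a])


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One Q-learning step; returns the updated Q(state, action)."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


q = make_q_table()
state = (4, 1)        # hypothetical queue lengths: north-south, east-west
next_state = (3, 2)
reward = -12.0        # negative total delay (in seconds) over the last step
new_value = q_update(q, state, action=0, reward=reward, next_state=next_state)
greedy = choose_action(q, state, epsilon=0.0)
```

After this single update, keeping the phase in state `(4, 1)` has a negative value, so the greedy policy prefers switching until more experience is collected.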
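3.1.2 Multi-Agent Applications
Coordinated control of multiple intersections
- Four standard TSC algorithms (presumably the common baselines): fixed-time control, random control, longest queue first, most cars first
- Classic algorithms proposed by Wiering: TC-1, TC-2, TC-3
- The state consists of the traffic light configuration, vehicle positions, and vehicle destinations, covering both local and global features (not feasible in practice, because this vehicle information is unknown)
- The goal is to reduce waiting time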
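The four baseline controllers listed above are simple enough to sketch directly. The function names, the 30-step phase duration, and the "one count per phase" input format are my own assumptions.

```python
import random


def fixed_time_control(step, num_phases, phase_duration=30):
    """Cycle through the phases, holding each for a fixed number of steps."""
    return (step // phase_duration) % num_phases


def random_control(num_phases, rng=random):
    """Pick the next phase uniformly at random."""
    return rng.randrange(num_phases)


def longest_queue_first(queue_per_phase):
    """Serve the phase whose lanes hold the most stopped (queued) vehicles."""
    return max(range(len(queue_per_phase)), key=lambda p: queue_per_phase[p])


def most_cars_first(cars_per_phase):
    """Serve the phase whose lanes hold the most vehicles, moving or stopped."""
    return max(range(len(cars_per_phase)), key=lambda p: cars_per_phase[p])


phase_a = fixed_time_control(step=65, num_phases=4)   # third phase (index 2)
phase_b = longest_queue_first([2, 7, 1, 4])            # index 1
phase_c = most_cars_first([9, 3, 5, 9])                # ties go to the lowest index
```

The last two controllers share the same argmax logic and differ only in what is counted, which is why they make convenient greedy baselines for a learned policy.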
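Later improvements on Wiering's work:
① Add congestion information from other intersections
② Increase the state size (by adding congestion information)
③ Add a congestion factor (instead of increasing the state space)
④ Add congestion and accident information
⑤ Consider coordination information
⑥ Multi-objective: vehicle stops, average waiting time, and maximum queue length are targeted as objectives for low, medium, and high traffic volume; the reward function is designed differently for each scenario
Khamis's work:
① Bayesian transition probabilities -> reward function
② more specific objectives
③ seven objectives, plus a cooperative exploration function
related work:
① Hierarchical reinforcement learning
② R-Markov Average Reward
③ Consider coordination information across regions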
3.2 Deep RL Applications:
3.2.1 Single Agent Applications:
3.2.2 Multi-Agent Deep RL:
4. DEEP RL FOR OTHER ITS APPLICATIONS