TY - GEN
T1 - Learning hierarchical behavior and motion planning for autonomous driving
AU - Wang, Jingke
AU - Wang, Yue
AU - Zhang, Dongkun
AU - Yang, Yezhou
AU - Xiong, Rong
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/10/24
Y1 - 2020/10/24
N2 - Learning-based driving solution, a new branch for autonomous driving, is expected to simplify the modeling of driving by learning the underlying mechanisms from data. To improve the tactical decision-making for learning-based driving solution, we introduce hierarchical behavior and motion planning (HBMP) to explicitly model the behavior in learning-based solution. Due to the coupled action space of behavior and motion, it is challenging to solve HBMP problem using reinforcement learning (RL) for long-horizon driving tasks. We transform HBMP problem by integrating a classical sampling-based motion planner, of which the optimal cost is regarded as the rewards for high-level behavior learning. As a result, this formulation reduces action space and diversifies the rewards without losing the optimality of HBMP. In addition, we propose a sharable representation for input sensory data across simulation platforms and real-world environment, so that models trained in a fast event-based simulator, SUMO, can be used to initialize and accelerate the RL training in a dynamics based simulator, CARLA. Experimental results demonstrate the effectiveness of the method. Besides, the model is successfully transferred to the real-world, validating the generalization capability.
AB - Learning-based driving solution, a new branch for autonomous driving, is expected to simplify the modeling of driving by learning the underlying mechanisms from data. To improve the tactical decision-making for learning-based driving solution, we introduce hierarchical behavior and motion planning (HBMP) to explicitly model the behavior in learning-based solution. Due to the coupled action space of behavior and motion, it is challenging to solve HBMP problem using reinforcement learning (RL) for long-horizon driving tasks. We transform HBMP problem by integrating a classical sampling-based motion planner, of which the optimal cost is regarded as the rewards for high-level behavior learning. As a result, this formulation reduces action space and diversifies the rewards without losing the optimality of HBMP. In addition, we propose a sharable representation for input sensory data across simulation platforms and real-world environment, so that models trained in a fast event-based simulator, SUMO, can be used to initialize and accelerate the RL training in a dynamics based simulator, CARLA. Experimental results demonstrate the effectiveness of the method. Besides, the model is successfully transferred to the real-world, validating the generalization capability.
UR - http://www.scopus.com/inward/record.url?scp=85102399270&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102399270&partnerID=8YFLogxK
U2 - 10.1109/IROS45743.2020.9341647
DO - 10.1109/IROS45743.2020.9341647
M3 - Conference contribution
AN - SCOPUS:85102399270
T3 - IEEE International Conference on Intelligent Robots and Systems
SP - 2235
EP - 2242
BT - 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2020
Y2 - 24 October 2020 through 24 January 2021
ER -