Toward Learning Human-Like, Safe and Comfortable Car-Following Policies With a Novel Deep Reinforcement Learning Approach

Yavas M. U., Kumbasar T., Üre N. K.

IEEE Access, vol.11, pp.16843-16854, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 11
  • Publication Date: 2023
  • Doi Number: 10.1109/ACCESS.2023.3245831
  • Journal Name: IEEE Access
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, Directory of Open Access Journals
  • Page Numbers: pp.16843-16854
  • Keywords: Vehicles, Cruise control, Behavioral sciences, Advanced driver assistance systems, Tuning, Safety, Heuristic algorithms, Reinforcement learning, Deep learning, Adaptive cruise control, Naturalistic driving
  • Istanbul Technical University Affiliated: Yes


In this paper, we present an advanced adaptive cruise control (ACC) concept powered by Deep Reinforcement Learning (DRL) that generates safe, human-like, and comfortable car-following policies. Unlike the current trend in developing DRL-based ACC systems, we propose defining the action space of the DRL agent with discrete actions rather than continuous ones, since human drivers do not set an absolute throttle/brake pedal level to be actuated, but rather adjust the current pedal levels incrementally. Through this human-like throttle-brake manipulation representation, we also define explicit actions for holding (keeping the last action) and coasting (no action), which are usually omitted as actions in ACC systems. Moreover, based on an investigation of a real-world driving dataset, we design a novel reward function that is easy to interpret and personalize. The proposed reward pushes the agent to learn stable and safe actions, while also encouraging the holding and coasting actions, just as a human driver would. The proposed discrete-action DRL agent is trained with action masking, and the reward terms are derived entirely from a real-world dataset collected from a human driver. We present exhaustive comparative results to show the advantages of the proposed DRL approach in both simulation and scenarios extracted from real-world driving. We clearly show that the proposed policy imitates human driving significantly better and implicitly handles complex driving situations, such as cut-ins and cut-outs, in comparison with a DRL agent trained with a widely used reward function proposed for ACC, a model predictive control structure, and traditional car-following approaches.
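The discrete throttle-brake manipulation described in the abstract can be illustrated with a small sketch. The specific action set, increment sizes, pedal range, and masking rule below are illustrative assumptions for exposition, not the paper's exact formulation: actions are pedal-level *changes* (plus explicit hold and coast actions), and action masking removes increments that would push the pedal outside its physical range.

```python
import numpy as np

# Illustrative discrete action set over pedal-level *changes* (assumed
# increments; the paper's exact discretization may differ).
ACTIONS = {
    0: ("coast", None),      # no pedal input at all
    1: ("hold", 0.0),        # keep the current pedal level
    2: ("inc_small", +0.1),  # press slightly more
    3: ("inc_large", +0.3),
    4: ("dec_small", -0.1),  # release slightly
    5: ("dec_large", -0.3),
}

PEDAL_MIN, PEDAL_MAX = -1.0, 1.0  # -1 = full brake, +1 = full throttle


def action_mask(pedal_level):
    """Mask out actions whose increment would push the pedal level
    outside its physical range (a simple form of action masking)."""
    mask = np.ones(len(ACTIONS), dtype=bool)
    for idx, (_, delta) in ACTIONS.items():
        if delta is None:
            continue  # coasting is always feasible
        if not (PEDAL_MIN <= pedal_level + delta <= PEDAL_MAX):
            mask[idx] = False
    return mask


def step_pedal(pedal_level, action_idx):
    """Apply a discrete action, returning the new pedal level."""
    name, delta = ACTIONS[action_idx]
    if name == "coast":
        return 0.0  # release throttle and brake entirely
    return float(np.clip(pedal_level + delta, PEDAL_MIN, PEDAL_MAX))


# Near full throttle, the large increment is masked but small ones remain.
mask = action_mask(0.9)
print(mask[3], mask[2])  # large +0.3 masked, small +0.1 allowed
```

In a DRL training loop, such a mask would be applied to the policy's logits before sampling, so the agent never selects physically infeasible pedal changes.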