A deep reinforcement learning approach for the meal delivery problem

Jahanshahi, Hadi; Bozanta, Aysun; Cevik, Mucahit; Kavuk, Eray; Tosun Kühn, Ayşe; Sonuc, Sibel; Kosucu, Bilgin; Basar, Ayse

doi:10.1016/j.knosys.2022.108489

KNOWLEDGE-BASED SYSTEMS, vol. 243, 2022 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 243
Publication Date: 2022
DOI: 10.1016/j.knosys.2022.108489
Journal Name: KNOWLEDGE-BASED SYSTEMS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Computer & Applied Sciences, INSPEC, Library and Information Science Abstracts, Library, Information Science & Technology Abstracts (LISTA)
Keywords: Meal delivery, Courier assignment, Reinforcement learning, DQN, DDQN
Affiliated with İstanbul Teknik Üniversitesi: Yes

Abstract

We consider a meal delivery service fulfilling dynamic customer requests given a set of couriers over the course of a day. A courier's duty is to pick up an order from a restaurant and deliver it to a customer. We model this service as a Markov decision process and use deep reinforcement learning as the solution approach. We evaluate the resulting policies on synthetic and real-world datasets and compare them with baseline policies. We also examine courier utilization for different numbers of couriers. In our analysis, we specifically focus on the impact of limited available resources in the meal delivery problem. Furthermore, we investigate the effect of intelligent order rejection and re-positioning of the couriers. Our numerical experiments show that, by incorporating the geographical locations of the restaurants, customers, and the depot, our model significantly improves the overall service quality as characterized by the expected total reward and the delivery times. Our results present valuable insights into both the courier assignment process and the optimal number of couriers for different order frequencies on a given day. The proposed model also shows robust performance under a variety of scenarios for real-world implementation. © 2022 Elsevier B.V. All rights reserved.
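The keywords name DQN and DDQN as the deep reinforcement learning methods. As a rough illustration of how the two differ, the sketch below computes the textbook bootstrap targets for a toy batch of transitions. It is not the authors' implementation: the function names, the three-action toy batch, and the discount factor of 0.9 are all assumptions made for this example, and the record does not describe the paper's state, action, or reward design.

```python
import numpy as np

def dqn_target(reward, next_q_target, done, gamma=0.99):
    """Generic DQN target: the target network both selects and scores the next action."""
    return reward + gamma * (1.0 - done) * next_q_target.max(axis=1)

def ddqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Generic Double DQN target: the online network selects, the target network scores."""
    best = next_q_online.argmax(axis=1)                  # action choice from online net
    chosen = next_q_target[np.arange(len(best)), best]   # value estimate from target net
    return reward + gamma * (1.0 - done) * chosen

# Hypothetical batch of two transitions with three actions each
# (e.g. candidate courier assignments; values are made up).
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # second transition ends the episode
next_q_online = np.array([[0.2, 0.9, 0.1], [0.5, 0.4, 0.3]])
next_q_target = np.array([[1.0, 0.5, 2.0], [0.7, 0.8, 0.6]])

y_dqn = dqn_target(reward, next_q_target, done, gamma=0.9)
y_ddqn = ddqn_target(reward, next_q_online, next_q_target, done, gamma=0.9)
print(y_dqn, y_ddqn)
```

In this toy batch the DQN target for the first transition bootstraps from the target network's largest value, while the Double DQN target uses the target network's value for the action the online network prefers, which is the mechanism generally credited with reducing overestimation.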