"Scalable Truck-Drone Coordination with Reinforcement Learning: Toward Real-Time Last-Mile Delivery Optimization"


Ali Abdul Razzaq Taresh, Asghar A. Asgharian Sardroud, Saman Tajbakhsh

Abstract

This study presents a new reinforcement learning framework for the traveling salesman problem with a drone (TSP-D), which poses challenges in last-mile delivery logistics. The proposed method uses Proximal Policy Optimization (PPO) combined with a deep residual feedforward neural network architecture to coordinate truck-drone operations effectively. A key contribution is an extended state representation that integrates the in-transit positions of the vehicles, the remaining travel time, and the feasible drone-reachable nodes, which are computed with Dijkstra's algorithm. The model was evaluated under two scenarios: unlimited and limited drone range. The results show better routing efficiency, solution quality, and computational performance than benchmark algorithms. PPO ensures policy stability and effective learning in complex urban environments. The residual architecture mitigates the vanishing gradient problem, which enables the training of deep networks. This approach supports scalable and intelligent decision-making for drone-assisted logistics. The study helps to advance real-time, data-driven delivery systems for the last mile.
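
The drone-reachable nodes mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the graph format, the function names `shortest_distances` and `drone_reachable`, and the one-way range check are all assumptions made for illustration. Passing `drone_range=None` stands in for the unlimited-range scenario.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra's algorithm; graph is assumed to be {node: {neighbor: weight}}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def drone_reachable(graph, launch, drone_range=None):
    """Nodes reachable from the launch node within drone_range.

    Assumption: only the one-way distance is checked; a full model
    would also account for the return leg to the truck.
    """
    dist = shortest_distances(graph, launch)
    return {
        node
        for node, d in dist.items()
        if node != launch and (drone_range is None or d <= drone_range)
    }
```

In a state representation of the kind the abstract describes, a set like this could be encoded as a binary mask over the customer nodes, although the exact encoding used in the study is not stated here.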
