A Detailed Survey on Pedestrian Trajectory Prediction in Smart Cities for Real-Time Surveillance
Abstract
Pedestrian trajectory prediction has become a foundational capability for intelligent systems operating in dynamic human environments, such as autonomous vehicles, service robots, and smart surveillance platforms. This paper systematically examines recent advances in pedestrian trajectory prediction, with a particular focus on spatiotemporal deep learning architectures designed for real-time and edge-based deployment. Recurrent and probabilistic approaches model temporal dependencies, but they often struggle with computational inefficiency and limited spatial context modeling. In contrast, transformer-based networks and 3D convolutional neural networks (3D CNNs) provide richer spatiotemporal representations but face challenges in scalability and deployment on embedded systems. The reviewed literature is categorized into four primary methodological frameworks: recurrent, convolutional, hybrid, and transformer-based models. These frameworks have driven major advances in three areas: social interaction modeling, multimodal prediction, and contextual scene understanding. Evaluations on standard benchmarks such as ETH/UCY and the Stanford Drone Dataset reveal important trade-offs among prediction accuracy, inference latency, and model generalization. Open challenges remain, including long-horizon prediction, the handling of rare behaviors, and robust real-world performance. The paper provides a roadmap for developing lightweight 3D CNN architectures that use temporal residual connections to support efficient spatiotemporal reasoning on resource-constrained edge devices.