DTS-Swarm: Cross-Modality Policy Distillation for Robust Multi-UAV Target Tracking under Degraded Sensing


Mohamed El Amine Ameur, Iyad Ameur, Tahar Allaoui

Abstract

Learned multi-UAV tracking policies often perform well in simulation but degrade when deployed under noisy, range-limited sensing. This paper presents DTS-Swarm, a distilled teacher-student transfer framework for robust multi-UAV target tracking. The teacher observes privileged simulator state, whereas the deployable student acts from a horizontal probabilistic occupancy map, its own kinematic state, and compact teammate-relative features. The method combines privileged teacher training, partial decoder-layer transfer, temperature-annealed Kullback-Leibler policy distillation, and a weak V-formation auxiliary loss. Across five degraded-sensing scenarios, evaluated over 30 episodes per seed and 5 random seeds with seed-level confidence intervals, DTS-Swarm reduces nominal target-wise tracking error by 32.6% relative to a no-transfer student and reduces high-noise tracking error by 44.1%. The main empirical finding is that hidden-layer transfer can produce negative transfer when action-distribution alignment is removed. In this cross-modality setting, copied teacher weights become useful only when KL distillation aligns the teacher and student action distributions during noisy-map fine-tuning.
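To make the distillation term concrete, the following is a minimal sketch of a temperature-annealed KL loss between teacher and student action distributions. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the action space, the KL direction, or the annealing schedule, so the discrete action logits, the KL(teacher || student) direction, the linear schedule, and the names `log_softmax`, `annealed_temperature`, `kl_distillation_loss`, `t_start`, and `t_end` are all hypothetical.

```python
import math


def log_softmax(logits, temperature):
    """Log-probabilities of a temperature-scaled softmax (numerically stable)."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return [s - log_sum_exp for s in scaled]


def annealed_temperature(step, total_steps, t_start=4.0, t_end=1.0):
    """Linear anneal from t_start to t_end.

    Assumption: the paper's actual schedule and endpoints are not given
    in the abstract; these defaults are placeholders.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return t_start + frac * (t_end - t_start)


def kl_distillation_loss(teacher_logits, student_logits, temperature):
    """KL(teacher || student) over temperature-softened action distributions.

    Assumption: discrete actions and this KL direction are illustrative
    choices, not confirmed by the source.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]   # hypothetical teacher action logits
    student = [0.1, 0.3, 0.2]    # hypothetical student action logits
    for step in (0, 50, 100):
        temp = annealed_temperature(step, total_steps=100)
        loss = kl_distillation_loss(teacher, student, temp)
        print(f"step={step} temperature={temp:.2f} kl={loss:.4f}")
```

A high initial temperature softens both distributions, so early in fine-tuning the student is pulled toward the teacher's relative action preferences rather than only its top action; as the temperature approaches 1, the loss compares the unsoftened policies. Many distillation setups also scale the loss by the squared temperature to keep gradient magnitudes comparable across the schedule; that scaling is omitted here.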