Optimizing Deep Reinforcement Learning and Computer Vision for Drone Navigation
by Nnadi Kingsley Ifeanyi, Oleka Chioma Violet
Published: March 31, 2026 • DOI: 10.51584/IJRIAS.2026.11030020
Abstract
The rapid development of autonomous aerial systems has highlighted the need for intelligent navigation methods capable of operating in complex, dynamic, and unstructured environments. This paper addresses the optimization of Deep Reinforcement Learning (DRL) combined with Computer Vision (CV) for autonomous drone navigation, focusing on simulation-based evaluation of model performance before practical deployment. The study follows a simulation-based approach combining the Proximal Policy Optimization (PPO) reinforcement learning algorithm with convolutional neural network computer vision models (ResNet50 and YOLOv5). The simulation environment recreates diverse scenarios, including indoor, urban, forested, and open-field settings, across 500 flight episodes in which UAV performance is measured by key metrics: target-reaching success rate, number of collisions, cumulative reward, navigation accuracy, DRL convergence rate, and power consumption. The most important results show that the integrated DRL-CV model achieved a target-reaching success rate of 96 percent with an average of 1.33 collisions per episode. PPO converged faster than DQN, A3C, and SAC, reaching optimal policies in an average of 177.5 episodes. CNN-based visual perception identified obstacles with 94 percent accuracy and low false positive (3 percent) and false negative (2 percent) rates, enabling safe navigation in dynamic environments. The average cumulative reward was 1847 units, and energy consumption was optimized to 1184.7 Joules, demonstrating efficient use of resources.
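To make the described pipeline concrete, the sketch below shows the general shape of such a setup: PPO with a CNN policy trained on camera-image observations in a simulated environment. It is illustrative only, not the authors' implementation; it assumes the stable-baselines3 and gymnasium libraries, and DroneNavEnv is a hypothetical stand-in for the paper's drone simulator.

# Minimal sketch (not the authors' code): PPO over image observations.
# Assumes stable-baselines3 and gymnasium; DroneNavEnv is a hypothetical
# stand-in for the drone simulation environment described in the abstract.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO

class DroneNavEnv(gym.Env):
    """Toy image-observation environment standing in for the drone simulator."""
    def __init__(self):
        super().__init__()
        # 84x84 RGB camera frame as the observation the CNN policy perceives.
        self.observation_space = spaces.Box(0, 255, (84, 84, 3), dtype=np.uint8)
        # Discrete headings: forward, left, right, up, down.
        self.action_space = spaces.Discrete(5)
        self.steps = 0

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.steps = 0
        return self._frame(), {}

    def step(self, action):
        self.steps += 1
        # Placeholder reward: the real environment would reward progress
        # toward the target and penalize collisions and energy use.
        reward = 1.0 if action == 0 else -0.1
        terminated = self.steps >= 200
        return self._frame(), reward, terminated, False, {}

    def _frame(self):
        # Random pixels here; the simulator would render the onboard camera.
        return self.np_random.integers(0, 256, (84, 84, 3), dtype=np.uint8)

env = DroneNavEnv()
model = PPO("CnnPolicy", env, verbose=0)  # CNN feature extractor + PPO updates
model.learn(total_timesteps=10_000)       # the paper trains over 500 episodes

In the paper's full setup, the visual front end additionally runs ResNet50/YOLOv5-based obstacle detection feeding the navigation policy; that stage is omitted here for brevity.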