Super-resolution with generative adversarial networks for improved object detection in aerial images


Haykir A. A., Öksüz İ.

INFORMATION DISCOVERY AND DELIVERY, 2022 (ESCI)

  • Publication Type: Article
  • Publication Date: 2022
  • Doi Number: 10.1108/idd-05-2022-0048
  • Journal Name: INFORMATION DISCOVERY AND DELIVERY
  • Journal Indexes: Emerging Sources Citation Index (ESCI), Scopus, FRANCIS, ABI/INFORM, Aerospace Database, Communication Abstracts, Information Science and Technology Abstracts, INSPEC, Library and Information Science Abstracts, Library Literature and Information Science, Library, Information Science & Technology Abstracts (LISTA), Metadex, DIALNET, Civil Engineering Abstracts
  • Keywords: Data quality, Aerial images, Super-resolution, Object detection, Generative adversarial networks, Perceptual quality
  • Istanbul Technical University Affiliated: Yes

Abstract

Purpose: Data quality and data resolution are essential for computer vision tasks such as medical image processing, object detection and pattern recognition. Super-resolution is a way to increase image resolution, and super-resolved images contain more information than their low-resolution counterparts. The purpose of this study is to analyze the effects of previously trained super-resolution models on object detection in aerial images.

Design/methodology/approach: Two different models were trained using the Super-Resolution Generative Adversarial Network (SRGAN) architecture on two aerial image data sets, xView and the Dataset for Object deTection in Aerial images (DOTA). This study uses these models to increase the resolution of aerial images in order to improve object detection performance. It analyzes in detail the effects on object detection of the model with the best perceptual index (PI) and the model with the best root-mean-square error (RMSE).

Findings: Super-resolution improves object detection quality, as expected. However, the super-resolution model with better perceptual quality achieves lower mean average precision than the model with better RMSE. This suggests that the model with the better PI is more meaningful to human perception but less meaningful to computer vision.

Originality/value: The contributions of the authors to the literature are threefold. First, they conduct a broad analysis of SRGAN results for aerial image super-resolution on the task of object detection. Second, they compare the super-resolution models with the best PI and the best RMSE to showcase the differences in object detection performance as a downstream task, for the first time in the literature. Finally, they use a transfer learning approach for super-resolution to improve the performance of object detection.
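
The two model-selection criteria named in the abstract, RMSE and PI, can be illustrated with a minimal sketch. This is not the authors' code. The function names `rmse` and `perceptual_index` are hypothetical, images are assumed to be grayscale and given as lists of pixel rows, and the PI formula is assumed to be the one commonly used in the PIRM 2018 challenge (lower is better, computed from precomputed Ma and NIQE scores); the abstract does not state which definition the study uses.

```python
import math


def rmse(reference, estimate):
    """Root-mean-square error between two equally sized grayscale images.

    Both images are lists of rows of numeric pixel values (an assumption
    made for this sketch, not something the abstract specifies).
    """
    ref_pixels = [p for row in reference for p in row]
    est_pixels = [p for row in estimate for p in row]
    if not ref_pixels or len(ref_pixels) != len(est_pixels):
        raise ValueError("images must be non-empty and have the same size")
    squared_error = sum((r - e) ** 2 for r, e in zip(ref_pixels, est_pixels))
    return math.sqrt(squared_error / len(ref_pixels))


def perceptual_index(ma_score, niqe_score):
    """Perceptual index as assumed here: 0.5 * ((10 - Ma) + NIQE).

    The Ma and NIQE scores are no-reference quality measures that must be
    computed elsewhere; this helper only combines them.
    """
    return 0.5 * ((10.0 - ma_score) + niqe_score)


if __name__ == "__main__":
    ground_truth = [[0, 0], [0, 0]]
    super_resolved = [[3, 3], [3, 3]]
    print(rmse(ground_truth, super_resolved))
    print(perceptual_index(8.0, 4.0))
```

RMSE needs a ground-truth high-resolution image, whereas PI is computed from the super-resolved image alone, which is one reason the two criteria can select different models.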