DEEP LEARNING-BASED DOOR AND WINDOW DETECTION FROM BUILDING FAÇADE


Creative Commons License

Sezen G., Çakır M. A., Atik M. E., Duran Z.

2022 24th ISPRS Congress on Imaging Today, Foreseeing Tomorrow, Commission IV, Nice, France, 6 - 11 June 2022, vol.43, pp.315-320

  • Publication Type: Conference Paper / Full Text
  • Volume: 43
  • DOI Number: 10.5194/isprs-archives-xliii-b4-2022-315-2022
  • City: Nice
  • Country: France
  • Page Numbers: pp.315-320
  • Keywords: Building Façade Elements, Deep learning, Faster R-CNN, Object Detection, YOLO
  • Istanbul Technical University Affiliated: Yes

Abstract

© 2022 G. Sezen et al. Detecting building façade elements is an important computer vision problem for image interpretation, and it plays a significant role in Building Information Modeling (BIM) studies. BIM maintains a digital representation of all aspects of building information, so it can store almost any data about a given structure, both geometric and non-geometric. Façade segmentation was first studied in the 1970s using hand-crafted expertise. Later, detection and segmentation studies emerged based on object shapes and parametric rules. As technology has advanced, object detection studies have increasingly adopted deep learning approaches, which allow the desired analyses to be performed faster. However, deep learning methods require large amounts of training data, and algorithms that handle varied conditions and suit real-world scenarios are still being developed, a need that persists in the literature. In this study, door and window detection was carried out with deep learning on an original data set. The algorithms used are YOLOv3, YOLOv4, YOLOv5, and Faster R-CNN. Precision, recall, and mean average precision (mAP) are used as evaluation metrics. YOLOv5 achieved precision, recall, and mAP values of 0.85, 0.72, and 0.79, respectively. Faster R-CNN, which had the lowest performance, achieved precision, recall, and mAP values of 0.54, 0.63, and 0.54, respectively.
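
To make the evaluation metrics named in the abstract concrete, the following is a minimal Python sketch of how precision and recall are commonly computed for box detections on a single image. It is not the authors' evaluation code. The box format, the greedy matching strategy, the 0.5 IoU threshold, and the function names `iou` and `precision_recall` are all assumptions chosen for illustration; the abstract does not state which threshold or matching protocol was used.

```python
from typing import List, Tuple

# Assumed box format (not specified in the abstract): (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to the best unmatched ground-truth box.

    A prediction is a true positive if its best IoU is at least `thr`
    (0.5 here is an illustrative assumption). Unmatched predictions are
    false positives; unmatched ground-truth boxes are false negatives.
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp
    fn = len(truths) - tp
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall
```

Calling `precision_recall(predicted_boxes, ground_truth_boxes)` returns a `(precision, recall)` pair for one image and one class. mAP additionally requires ranking detections by confidence score and averaging the resulting precision-recall curve per class, which this sketch leaves out.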