Cross-dataset person re-identification using deep convolutional neural networks: effects of context and domain adaptation


Genc A., Ekenel H. K.

MULTIMEDIA TOOLS AND APPLICATIONS, vol. 78, no. 5, pp. 5843-5861, 2019 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 78 Issue: 5
  • Publication Date: 2019
  • DOI: 10.1007/s11042-018-6409-3
  • Journal Name: MULTIMEDIA TOOLS AND APPLICATIONS
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
  • Pages: pp. 5843-5861
  • Affiliated with Istanbul Technical University: Yes

Abstract

Over the past years, the impact of surveillance systems on public safety has increased dramatically. One significant challenge in this domain is person re-identification, which aims to determine whether a person has already been captured by another camera in the surveillance network. Most of the work that has been conducted on the person re-identification problem uses a single dataset, in which the training and test data come from the same source. However, as we have shown in this work, there is a strong bias among person re-identification datasets; therefore, a method that has been trained and optimized on a specific person re-identification dataset may not generalize well or perform successfully on other datasets. This is a problem for many real-world applications, since it is not feasible to collect and annotate a sufficient amount of data from the target application to train or fine-tune a deep convolutional neural network model. Taking this issue into account, in this work we have focused on the cross-dataset person re-identification problem and first explored and analyzed in detail the use of state-of-the-art deep convolutional neural network (CNN) architectures, namely AlexNet, VGGNet, GoogLeNet, ResNet, and DenseNet, which were developed for the generic image classification task. These deep CNN models have been adapted to the person re-identification domain by fine-tuning them for each human body part separately, as well as on the entire body, with two relatively large person re-identification datasets: CUHK03 and Market-1501. Then, the performance of each adapted model has been evaluated on two different publicly available datasets: VIPeR and PRID2011. We have shown that even domain adaptation alone leads to results comparable to those of the state-of-the-art cross-dataset approaches. Another point that we have addressed in this paper is context adaptation. It is known that person re-identification approaches implicitly utilize background as context information. Therefore, to have a consistent background across different camera views, we have employed the cycle-consistent generative adversarial network. We have shown that this further improves the performance.