Citizens' Spatial Footprint on Twitter-Anomaly, Trend and Bias Investigation in Istanbul

Gülnerman Gengeç A. G., Karaman H., Pekaslan D., Bilgi S.

ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, vol.9, 2020 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 9
  • Publication Date: 2020
  • Doi Number: 10.3390/ijgi9040222
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Agricultural & Environmental Science Database, CAB Abstracts, INSPEC, Veterinary Science Database, Directory of Open Access Journals
  • Istanbul Technical University Affiliated: Yes


Social media (SM) can be an invaluable resource for understanding and managing the effects of catastrophic disasters. Before SM platforms are used for public participatory (PP) mapping of emergency management activities, the data for the study area (urban, regional, national, etc.) should undergo a bias investigation to determine its spatial dynamics. Such an investigation shows how SM can be used and interpreted in terms of PP. In this study, the city of Istanbul was chosen as the research area for social media data, as it is one of the most crowded cities in the world and is expected to experience a major earthquake. The methodology for the data investigation is: (1) obtain the data and sample it; (2) identify the representation and temporal biases in the data and normalize it in response to representation bias; (3) identify general anomalies and spatial anomalies; (4) manipulate the trend of the dataset by discretizing the anomalies; and (5) examine the spatiotemporal bias. Using this bias investigation methodology, citizen footprint dynamics in the city were determined and reference maps (most likely regional anomaly maps, representation maps, time-space bias maps, etc.) were produced. The outcomes of the study can be summarized in four points. First, highly active users generate the majority of the data, so removing their data as a general pseudo-cleaning step conceals a large amount of data. Second, normalizing the data by activity level changes the anomaly outcome that results from users' diverse representation levels. Third, spatiotemporally normalized data show a strong tendency toward spatial anomalies in some parts of the central area. Fourth, trend data are dense in the central area, and the spatiotemporal bias assessments show that data density varies with the time of day, day of the week and season of the year.
The methodology proposed in this study can be used to extract unbiased daily routines from the social media data of a region on normal days, and these routines can serve as a reference for detecting changes or impacts during emergencies or unexpected events.
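
The abstract does not specify how the normalization or the anomaly detection is computed, so the following is only a minimal illustrative sketch and not the authors' implementation. It assumes each post is a hypothetical `(user, cell)` pair, where `cell` is a grid-cell label, weights every post by the inverse of its user's post count so that each user contributes equally, and flags cells whose normalized activity lies more than an assumed z-score threshold from the mean.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def user_weights(posts):
    """Return a weight of 1 / (post count) for every user.

    `posts` is an iterable of (user, cell) pairs; this layout is an
    assumption made for illustration, not taken from the paper.
    """
    counts = Counter(user for user, _cell in posts)
    return {user: 1.0 / n for user, n in counts.items()}


def normalized_cell_activity(posts):
    """Sum the per-user weights in each cell, so each user adds up to 1."""
    weights = user_weights(posts)
    activity = defaultdict(float)
    for user, cell in posts:
        activity[cell] += weights[user]
    return dict(activity)


def anomalous_cells(activity, threshold=2.0):
    """Return cells whose activity is more than `threshold` population
    standard deviations from the mean (an assumed, simple criterion)."""
    values = list(activity.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # all cells equally active: nothing stands out
    return sorted(c for c, v in activity.items() if abs(v - mu) / sigma > threshold)


# One highly active user (u1) dominates the raw counts in cell "A".
posts = [("u1", "A")] * 50 + [("u2", "A"), ("u2", "B"), ("u3", "B"), ("u4", "C")]
activity = normalized_cell_activity(posts)
```

In this toy data the raw post counts are 51, 2 and 1 for cells A, B and C, whereas the normalized activity is roughly 1.5, 1.5 and 1.0. This mirrors the abstract's point that normalizing by activity level changes which areas appear anomalous, without discarding the highly active user's data.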