IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, vol.12, no.4, pp.1609-1623, 2011 (SCI-Expanded)
This paper describes a comprehensive, collaborative project to collect large amounts of on-road driving data for use in a wide range of vehicle-related research centered on driving behavior. Unlike previous data collection efforts, the corpora collected here contain both human and vehicle sensor data, together with rich and continuous transcriptions. While most in-vehicle research efforts are confined to individual countries, this effort links a collaborative team from three diverse regions (i.e., Asia, America, and Europe). Details of the data collection paradigm, such as sensors, driver information, routes, and transcription protocols, are discussed, and a preliminary analysis of the data across the three collection sites in the U.S. (Dallas), Japan (Nagoya), and Turkey (Istanbul) is provided. The usability of the corpora has been experimentally verified: transcription reliability reached a Cohen's kappa coefficient of 0.74, and the corpora have been successfully exploited in several in-vehicle applications. Most importantly, the corpora are publicly available for research use and represent one of the first multinational efforts to share resources and understand driver characteristics. Future work on distributing the corpora to the wider research community is also discussed.
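For readers unfamiliar with the reliability measure cited above, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators who label the same segments. It uses the standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name `cohens_kappa` and the example labels are hypothetical illustrations; they are not taken from the corpora or from the authors' transcription protocol.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")

    n = len(labels_a)

    # Observed agreement: fraction of items on which both annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each category, the product of the two
    # annotators' marginal proportions, summed over categories.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    # Degenerate case: both annotators used one identical category throughout.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for four segments (illustration only).
annotator_1 = ["talk", "talk", "silent", "silent"]
annotator_2 = ["talk", "silent", "silent", "silent"]
print(cohens_kappa(annotator_1, annotator_2))  # 0.5
```

In this toy case the annotators agree on three of four segments (p_o = 0.75) while chance agreement is 0.5, giving kappa = 0.5. A value of 0.74, as reported for the corpora, indicates agreement well above chance.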