Validity and reliability of naturalistic driving scene categorization judgments from crowdsourcing

Christopher D. Cabrall
Zhenji Lu
Miltos Kyriakidis
Laura Manca
Chris Dijksterhuis
Riender Happee
Joost de Winter

A common challenge in processing naturalistic driving data, across many driving research interests and applications, is that humans may need to categorize large volumes of recorded visual information until automated algorithms can be trained to do so alone.

This study, by means of the online platform CrowdFlower, investigated the potential of crowdsourcing to provide content identification categorizations of driving scene features (e.g., presence of another vehicle, straight road segments, etc.) at greater scale than a single person or a small team of researchers would be capable of. The validity and reliability of CrowdFlower results were examined, both with and without employing a set of randomly embedded controlled questions (Gold Test Questions) intermixed with experimental questions (Work Mode). In total, 200 workers from 46 countries participated in this study, and the collection of data lasted one and a half days.

By employing Gold Test Questions, we found significantly more accurate and consistent responses from external workers at both a smaller and a larger scale of video segment categorizations for the identification of common driving scene elements (e.g., position and behavior of other vehicles, road and signage characteristics, etc.). In terms of validity, at the small scale, an average accuracy of 91% on paired items was found with the controlled questions compared to 78% without. The error bias also differed: without Gold Test Questions, external workers returned more false positives than false negatives, whereas the opposite was true with Gold Test Questions. At the large scale (making use of the controlled questions), a random subset of categorizations returned a similar level of accuracy (95%) and a similar pattern of error bias. In terms of reliability, at the small scale, where segments were rated in triplicate redundancy, the percentage of unanimous agreement was significantly higher with controlled questions (90%) than without them (65%). Across the small scale of internally validated answers, more than two-thirds of any correct categorization were returned unanimously, and 86% or more of any correct categorization were returned by a majority vote. At the large scale, where it would be infeasible to validate every response for accuracy, similar voting reliability results were found.
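The validity and reliability measures described above (accuracy against validated answers, false positives versus false negatives, and unanimous or majority agreement across triplicate ratings) can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the segment identifiers, and the example votes are all hypothetical, and accuracy is computed here on the majority vote per segment, which may differ from how the study scored individual responses.

```python
from collections import Counter


def summarize_votes(votes):
    """Return (majority_label, is_unanimous) for an odd number of boolean votes."""
    counts = Counter(votes)
    label, top = counts.most_common(1)[0]
    return label, top == len(votes)


def score(segments, truth):
    """Compare crowd votes against reference labels.

    segments: dict mapping segment id -> list of boolean worker votes
    truth:    dict mapping segment id -> reference boolean label
    """
    n = len(segments)
    unanimous = correct = false_pos = false_neg = 0
    for seg_id, votes in segments.items():
        label, is_unanimous = summarize_votes(votes)
        unanimous += is_unanimous
        if label == truth[seg_id]:
            correct += 1
        elif label:
            false_pos += 1  # crowd said "present", reference says "absent"
        else:
            false_neg += 1  # crowd said "absent", reference says "present"
    return {
        "unanimous_rate": unanimous / n,
        "majority_accuracy": correct / n,
        "false_positives": false_pos,
        "false_negatives": false_neg,
    }


# Hypothetical data: "is another vehicle present?" rated by three workers each.
segments = {
    "s1": [True, True, True],
    "s2": [True, False, True],
    "s3": [False, False, False],
    "s4": [True, True, False],
}
truth = {"s1": True, "s2": True, "s3": False, "s4": False}

result = score(segments, truth)
print(result)
```

With this made-up input, two of the four segments are rated unanimously and the majority vote matches the reference label for three of them, the single error being a false positive.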

Overall, the results provide compelling evidence that CrowdFlower can yield valid and reliable crowdsourced categorizations of naturalistic driving scene contents in a short period of time, making it a potentially powerful and as-yet under-utilized resource in the toolbox of driving research and driving automation development.
