Investigating the Impact of Development Task on External Quality in Test-Driven Development: An Industry Experiment

Tosun Kühn A., Dieste O., Vegas S., Pfahl D., Rungi K., Juristo N.

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, vol.47, no.11, pp.2438-2456, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 47 Issue: 11
  • Publication Date: 2021
  • Doi Number: 10.1109/tse.2019.2949811
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE, Metadex, zbMATH, Civil Engineering Abstracts
  • Page Numbers: pp.2438-2456
  • Keywords: Task analysis, Industries, Bibliographies, Productivity, Programming profession, Organizations, Test-driven development, industry experiment, experimental task, incremental test-last development, external quality
  • Istanbul Technical University Affiliated: Yes


Reviews of test-driven development (TDD) studies suggest that the conflicting results reported in the literature are due to unobserved factors, such as the tasks used in the experiments, and highlight that very few industry experiments have been conducted with professionals. The goal of this study is to investigate the impact of a new factor, the chosen task, together with the development approach, on external quality in an industrial experimental setting with 17 professionals. The participants range from junior to senior Java developers, are beginners to novices in unit testing and JUnit, and have no prior experience with TDD. The experimental design is a 2 x 2 cross-over, i.e., we use two tasks for each of the two approaches, namely TDD and incremental test-last development (ITLD). Our results reveal that both development approach and task are significant factors with regard to the external quality achieved by the participants. More specifically, the participants produce higher-quality code during ITLD, in which the activities of splitting user stories into subtasks, coding, and testing are followed, than during TDD. The results also indicate that the participants produce higher-quality code when implementing Bowling Score Keeper than when implementing Mars Rover API, although they perceived both tasks as similarly complex. An interaction between the development approach and task could not be observed in this experiment. We conclude that variables that have rarely been explored, such as the extent to which the task is specified in terms of smaller subtasks and developers' unit testing experience, might be critical factors in TDD experiments. The real-world application of TDD and its implications for external quality will remain challenging unless these uncontrolled and unconsidered factors are further investigated by researchers in both academic and industrial settings.