Investigating the performance of personalized models for software defect prediction

Eken B., Tosun Kühn A.

JOURNAL OF SYSTEMS AND SOFTWARE, vol.181, 2021 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 181
  • Publication Date: 2021
  • Doi Number: 10.1016/j.jss.2021.111038
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, Aerospace Database, Applied Science & Technology Source, Business Source Elite, Business Source Premier, Communication Abstracts, Computer & Applied Sciences, INSPEC, Metadex, Civil Engineering Abstracts
  • Keywords: Personalized, Change-level, Defect prediction, Software recommendation systems, FRAMEWORK, IMPACT, SIZE
  • Istanbul Technical University Affiliated: Yes


Software defect predictors exploring the developer perspective reveal that code changes made by separate developers tend to have different defect patterns. Personalized defect prediction also contributes to this view and gives promising results. We aim to investigate the performance of personalized defect predictors compared to that of traditional models. We conduct an empirical study on six open-source projects covering 222 developers. Personalized and traditional defect predictors are built using two algorithms and cross-validation on the historical commit data, and are assessed via seven performance measures and statistical tests. Our results show that personalized models (PMs) achieve an increase of up to 24% in recall for 83% of developers, while causing higher false alarm rates for 77% of developers. PMs perform better for developers who contribute to modules with many prior contributors. Although size metrics contribute to the performance of the majority of the PMs, the PMs differ significantly in the information gained from experience, diffusion, and history metrics. The decision of whether a PM should be chosen over a traditional model depends on a set of factors, i.e., the selected algorithm, the model validation strategy, or the performance measures, and hence PM performance differs significantly with respect to these factors. © 2021 Elsevier Inc. All rights reserved.
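
The trade-off between recall and false alarm rate reported in the abstract can be illustrated with a minimal Python sketch. It assumes the common definitions recall = TP / (TP + FN) and false alarm rate = FP / (FP + TN); the abstract does not spell out its formulas, and the function name, variable names, and toy labels below are hypothetical, not taken from the study.

```python
def recall_and_false_alarm(y_true, y_pred):
    """Return (recall, false_alarm_rate) for binary labels.

    Assumed convention: 1 = defect-inducing change, 0 = clean change.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard against empty classes so the sketch never divides by zero.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, false_alarm


# Hypothetical toy data for one developer's commits (not from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
personalized = [1, 1, 1, 1, 1, 0, 0, 0]  # predictions of a personalized model
traditional = [1, 0, 0, 0, 1, 0, 0, 0]   # predictions of a traditional model

pm_recall, pm_fa = recall_and_false_alarm(y_true, personalized)
tm_recall, tm_fa = recall_and_false_alarm(y_true, traditional)
print(f"personalized: recall={pm_recall:.2f}, false alarm={pm_fa:.2f}")
print(f"traditional:  recall={tm_recall:.2f}, false alarm={tm_fa:.2f}")
```

In this toy case the personalized predictions catch more defect-inducing changes but also flag more clean ones, which is the kind of per-developer trade-off the abstract describes.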