An empirical study on the effect of community smells on bug prediction


Eken B., Palma F., Ayse B., Ayse T.

SOFTWARE QUALITY JOURNAL, cilt.29, sa.1, ss.159-194, 2021 (SCI-Expanded) identifier identifier

  • Yayın Türü: Makale / Tam Makale
  • Cilt numarası: 29 Sayı: 1
  • Basım Tarihi: 2021
  • Doi Numarası: 10.1007/s11219-020-09538-7
  • Dergi Adı: SOFTWARE QUALITY JOURNAL
  • Derginin Tarandığı İndeksler: Science Citation Index Expanded (SCI-EXPANDED), Scopus, ABI/INFORM, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC
  • Sayfa Sayıları: ss.159-194
  • Anahtar Kelimeler: Community smells, Bug prediction, Mining software repositories
  • İstanbul Teknik Üniversitesi Adresli: Evet

Özet

Community-aware metrics through socio-technical developer networks or organizational structures have already been studied in the software bug prediction field. Community smells are also proposed to identify communication and collaboration patterns in developer communities. Prior work reports a statistical association between community smells and code smells identified in software modules. We investigate the contribution of community smells on predicting bug-prone classes and compare their contribution with that of code smell-related information and state-of-the-art process metrics. We conduct our empirical analysis on ten open-source projects with varying sizes, buggy and smelly class ratios. We build seven different bug prediction models to answer three RQs: a baseline model including a state-of-the-art metric set used, three models incorporating a particular metric set, namely community smells, code smells, code smell intensity, into the baseline, and three models incorporating a combination of smell-related metrics into the baseline. The performance of these models is reported in terms of recall, false positive rates, F-measure and AUC and statistically compared using Scott-Knott ESD tests. Community smells improve the prediction performance of a baseline model by up to 3% in terms of AUC, while code smell intensity improves the baseline models by up to 40% in terms of F-measure and up to 17% in terms of AUC. The conclusions are significantly influenced by the validation strategies used, algorithms and the selected projects' data characteristics. While the code smell intensity metric captures the most information about technical flaws in predicting bug-prone classes, the community smells also contribute to bug prediction models by revealing communication and collaboration flaws in software development teams. Future research is needed to capture the communication patterns through multiple channels and to understand whether socio-technical flaws could be used in a cross-project bug prediction setting.