An industrial case study of classifier ensembles for locating software defects

Misirli, Ayşe; Bener, Ayse; Turhan, Burak

doi:10.1007/s11219-010-9128-1

Misirli A., Bener A. B., Turhan B.

SOFTWARE QUALITY JOURNAL, vol.19, no.3, pp.515-536, 2011 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 19 Issue: 3
Publication Date: 2011
DOI: 10.1007/s11219-010-9128-1
Journal Name: SOFTWARE QUALITY JOURNAL
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Pages: pp.515-536
Affiliated with Istanbul Technical University: No

Abstract

As the application layer in embedded systems dominates over the hardware, ensuring software quality becomes a real challenge. Software testing is the most time-consuming and costly project phase, especially in the embedded software domain. Misclassifying safe code as defective increases project costs and hence leads to low margins. In this research, we present a defect prediction model based on an ensemble of classifiers. We collaborated with an industrial partner from the embedded systems domain and applied our generic defect prediction models to data from embedded projects. The embedded systems domain is similar to mission-critical software in that the goal is to catch as many defects as possible. Therefore, a predictor is expected to achieve a very high probability of detection (pd). On the other hand, most embedded systems in practice are commercial products, and companies would like to lower their costs to remain competitive in their market by keeping their false alarm (pf) rates as low as possible and improving their precision rates. In our experiments, we used data collected from our industry partners as well as publicly available data. Our results reveal that an ensemble of classifiers significantly decreases pf to 15% while increasing precision by 43%, thereby keeping balance rates at 74%. The cost-benefit analysis of the proposed model shows that it is enough to inspect 23% of the code on local datasets to detect around 70% of defects.
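
The abstract reports pd, pf, precision, and balance without defining them. As a rough illustration, the sketch below computes these measures from confusion-matrix counts. It assumes the definitions commonly used in the defect prediction literature; in particular, the balance formula (normalized distance from the ideal point pd = 1, pf = 0) is an assumption, since the abstract does not state it. The function name `defect_metrics` and the example counts are hypothetical and are not taken from the paper's data.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Compute pd, pf, precision and balance from confusion-matrix counts.

    tp: defective modules flagged as defective
    fp: defect-free modules flagged as defective (false alarms)
    tn: defect-free modules flagged as defect-free
    fn: defective modules flagged as defect-free (missed defects)
    """
    # Probability of detection (recall on the defective class).
    pd = tp / (tp + fn) if (tp + fn) else 0.0
    # Probability of false alarm.
    pf = fp / (fp + tn) if (fp + tn) else 0.0
    # Share of flagged modules that are actually defective.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # ASSUMPTION: balance is taken as 1 minus the normalized Euclidean
    # distance from the ideal point (pd = 1, pf = 0).
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return {"pd": pd, "pf": pf, "precision": precision, "balance": balance}


# Hypothetical counts, chosen only to exercise the function.
metrics = defect_metrics(tp=30, fp=15, tn=85, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under this definition, balance rewards predictors that sit close to the top-left corner of the ROC plane, which is why lowering pf can keep balance steady even when pd drops slightly.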