On Software Fault Prediction by Mining Software Complexity Data with Dynamically Filtered Training Sets


Podgorelec V.

9th WSEAS International Conference on Simulation, Modelling and Optimization, Budapest, Hungary, 3-5 September 2009, pp.332-334

  • Publication Type: Conference Paper / Full Text
  • City: Budapest
  • Country: Hungary
  • Page Numbers: pp.332-334

Abstract

Software fault prediction methods are well suited to improving software reliability. Research on estimation models, metrics, and methods for measuring and improving processes and products has led to the creation of large empirical databases of software projects, and intelligent mining of these datasets can contribute substantially to software reliability. In this paper we present a study on using decision tree classifiers to predict software faults. We present a new training set filtering method intended to improve classification performance when mining software complexity measures data. The improvement is expected to come from removing identified outliers from the training set. We argue that a classifier trained on a filtered dataset captures a more general knowledge model and should therefore also perform better on unseen cases. The proposed method is applied to a real-world software reliability analysis dataset, and the obtained results are discussed.
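The abstract does not say how outliers are identified, so the following is only a minimal sketch of the general idea of filtering a training set before fitting a tree-based classifier. It assumes a cross-validated misclassification filter (a sample is treated as an outlier if a model trained on the other folds gets it wrong) and uses a one-level decision tree (a stump) as a stand-in for a full decision tree. The function names, the two complexity metrics, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical sample: ([cyclomatic complexity, lines of code], faulty? 1/0)
Sample = Tuple[List[float], int]
Stump = Tuple[int, float, int]  # (feature index, threshold, label if above)


def predict(stump: Stump, x: List[float]) -> int:
    feature, threshold, above = stump
    return above if x[feature] > threshold else 1 - above


def train_stump(data: List[Sample]) -> Stump:
    """Fit a one-level decision tree by exhaustive search over midpoints."""
    best = None
    for f in range(len(data[0][0])):
        values = sorted({x[f] for x, _ in data})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for above in (0, 1):
                errors = sum(predict((f, t, above), x) != y for x, y in data)
                if best is None or errors < best[0]:
                    best = (errors, f, t, above)
    if best is None:
        raise ValueError("need at least two distinct values in some feature")
    return best[1:]


def filter_training_set(data: List[Sample], folds: int = 3) -> List[Sample]:
    """Assumed filter: drop samples misclassified by a model that was
    trained on the remaining folds."""
    kept = []
    for k in range(folds):
        held_out = data[k::folds]
        rest = [s for i, s in enumerate(data) if i % folds != k]
        stump = train_stump(rest)
        kept.extend(s for s in held_out if predict(stump, s[0]) == s[1])
    return kept


# Toy data (invented): simple modules are fault-free, complex ones faulty,
# plus one suspicious record, a tiny module labelled as faulty.
data: List[Sample] = [
    ([2, 40], 0), ([15, 400], 1), ([3, 55], 0), ([18, 520], 1),
    ([1, 20], 1), ([20, 610], 1), ([5, 80], 0), ([22, 700], 1),
    ([6, 90], 0), ([25, 800], 1), ([3, 50], 0), ([17, 450], 1),
]

filtered = filter_training_set(data)
model = train_stump(filtered)  # final classifier, trained on filtered data
```

On this toy data the filter removes only the suspicious record, and the final classifier is then fitted to the remaining samples. In practice the choice of base classifier, number of folds, and outlier criterion would all need to follow the method described in the full paper.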