Improvement of effort estimation accuracy in software projects using a feature selection approach
Subject Areas: Data Mining
Zahra Shahpar 1, Vahid Khatibi 2, Asma Tanavar 3, Rahil Sarikhani 4
1 - Department of Computer Engineering, Kerman Branch, Islamic Azad University, Kerman, Iran.
2 - Faculty Member of Islamic Azad University, Kerman Branch, Kerman, Iran.
3 - Department of Computer, Kerman Branch, Islamic Azad University
4 - Department of Computer, Kerman Branch, Islamic Azad University, Iran
Keywords: dimensionality reduction, feature selection, genetic algorithm, software effort estimation
Abstract:
In recent years, feature selection techniques have become an essential part of data processing and model construction in many scientific areas. In software project effort estimation, dimensionality reduction and feature selection have likewise become necessary. The high volume of data, the cost and time needed to gather it, and the complexity of the models used for effort estimation all motivate these methods. This article therefore uses a genetic algorithm for feature selection in software project effort estimation. The technique was tested on well-known data sets. The results indicate that the selected feature subset yields more accurate effort estimates than the original data set. This article shows that genetic algorithms are well suited to selecting a subset of features and improving effort estimation accuracy.
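The general idea of genetic-algorithm feature selection can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the chromosome encoding, the genetic operators, the fitness function, or the estimator. The sketch assumes a binary mask per feature, tournament selection, one-point crossover, bit-flip mutation, and a fitness equal to the leave-one-out mean magnitude of relative error (MMRE) of a 1-nearest-neighbour (analogy-style) estimator. The function names, parameter values, and the toy data are all hypothetical.

```python
import random


def make_toy_projects(n_projects=30, n_features=6, seed=1):
    """Hypothetical data: only features 0 and 1 actually drive effort."""
    rng = random.Random(seed)
    rows = [[rng.uniform(0, 10) for _ in range(n_features)]
            for _ in range(n_projects)]
    efforts = [10 + 5 * r[0] + 3 * r[1] + rng.gauss(0, 1) for r in rows]
    return rows, efforts


def loo_mmre(rows, efforts, mask):
    """Leave-one-out MMRE of a 1-nearest-neighbour estimator that only
    looks at the features switched on in `mask` (assumed fitness)."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("inf")  # an empty subset cannot estimate anything
    total = 0.0
    for i, target in enumerate(rows):
        best_dist, best_effort = float("inf"), 0.0
        for k, other in enumerate(rows):
            if k == i:
                continue
            dist = sum((target[j] - other[j]) ** 2 for j in cols)
            if dist < best_dist:
                best_dist, best_effort = dist, efforts[k]
        total += abs(efforts[i] - best_effort) / abs(efforts[i])
    return total / len(rows)


def ga_select(rows, efforts, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a feature mask with low MMRE (lower is better)."""
    rng = random.Random(seed)
    n = len(rows[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = loo_mmre(rows, efforts, mask)
        return cache[key]

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) <= fitness(b) else b

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    pop[0] = [1] * n  # include the full feature set as a baseline individual
    best = min(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n - 1)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # bit-flip mutation
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        pop = children
        best = min(pop, key=fitness)
    return best, fitness(best)


rows, efforts = make_toy_projects()
mask, score = ga_select(rows, efforts)
full_score = loo_mmre(rows, efforts, [1] * len(rows[0]))
print("selected mask:", mask)
print("MMRE with selected subset:", round(score, 3))
print("MMRE with all features:  ", round(full_score, 3))
```

Because the full feature set is placed in the initial population and the best individual is always carried forward, the returned subset can never score worse than the full set on this particular fitness measure. On real data the fitness would normally be computed on held-out projects, since a subset tuned on the same data it is scored on may overfit.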