Fuzzy-Provenance Architecture for Effort Metric Data Quality Assessment
Abstract
Software companies rely on stored metric data to track and manage their projects by analyzing, monitoring, and estimating software metrics. If managers cannot trust the metric data, the product under development is at serious risk of failure. Currently, the quality of software effort data is assessed subjectively, mainly from managers' assumptions, which is an inherently error-prone process. We present an architecture for assessing the data quality of software effort metrics, based on data provenance combined with a logical inference mechanism (fuzzy logic). The contribution is an assessment that identifies evidence-backed reasons for low quality, so that the metrics can be used with sufficient reliability.
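The abstract does not give the architecture's rule base, so the following is only a minimal sketch of how provenance attributes could feed a fuzzy inference step. The provenance fields (`delay_hours`, `source_trust`), the membership breakpoints, and the four rules are illustrative assumptions, not the authors' design. The sketch uses zero-order Sugeno inference (min as AND, weighted average as output) and returns the dominant rule as the stated reason for the score.

```python
from dataclasses import dataclass


@dataclass
class EffortProvenance:
    """Hypothetical provenance attributes for one recorded effort value."""
    delay_hours: float   # time between doing the work and recording the effort
    source_trust: float  # 0.0 (manual recollection) .. 1.0 (automated capture)


def ramp_up(x: float, lo: float, hi: float) -> float:
    """Membership that is 0 at or below lo, 1 at or above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


def assess(p: EffortProvenance) -> tuple[float, str]:
    """Return (quality score in [0, 1], reason taken from the strongest rule)."""
    # Fuzzification; the breakpoints below are assumed for illustration.
    late = ramp_up(p.delay_hours, 4.0, 48.0)
    prompt = 1.0 - late
    trusted = ramp_up(p.source_trust, 0.3, 0.8)
    untrusted = 1.0 - trusted

    # Each rule: (firing strength with min as AND, crisp quality, reason).
    rules = [
        (min(prompt, trusted), 1.0, "recorded promptly by a trusted source"),
        (min(late, trusted), 0.6, "recorded late by a trusted source"),
        (min(prompt, untrusted), 0.4, "recorded promptly by a low-trust source"),
        (min(late, untrusted), 0.0, "recorded late by a low-trust source"),
    ]

    # The memberships are complementary, so at least one rule fires with
    # strength >= 0.5 and the total is never zero.
    total = sum(weight for weight, _, _ in rules)
    score = sum(weight * quality for weight, quality, _ in rules) / total
    reason = max(rules, key=lambda rule: rule[0])[2]
    return score, reason


if __name__ == "__main__":
    score, reason = assess(EffortProvenance(delay_hours=72.0, source_trust=0.1))
    print(f"quality={score:.2f} ({reason})")
```

Returning the reason alongside the score reflects the stated goal of explaining why a metric value is rated low, rather than only flagging it.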