Fuzzy-Provenance Architecture for Effort Metric Data Quality Assessment

  • Rita Cristina Galarraga Berardi PUCRS
  • Duncan Dubugras Alcoba Ruiz PUCRS


Software companies rely on stored metric data to track and manage their projects by analyzing, monitoring, and estimating software metrics. If managers cannot trust the metric data, the product under development is likely to fail. Currently, the assessment of software effort is subjective and derived mainly from managers' assumptions, which makes it a fundamentally error-prone process. We present an architecture for assessing the data quality of software effort metrics, based on data provenance combined with a logical inference mechanism (fuzzy logic). The contribution is an assessment that searches for evidence of the reasons behind low quality, so that the metrics can be used with sufficient reliability.
Keywords: Fuzzy-Provenance Architecture, Data Quality, Fuzzy Logic


How to Cite

BERARDI, Rita Cristina Galarraga; RUIZ, Duncan Dubugras Alcoba. Fuzzy-Provenance Architecture for Effort Metric Data Quality Assessment. In: SIMPÓSIO BRASILEIRO DE QUALIDADE DE SOFTWARE (SBQS), 8., 2009, Ouro Preto. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2009. p. 1-15. DOI: https://doi.org/10.5753/sbqs.2009.15500.