Summary-based Comparison of Data Quality across Public MAGE-ML Genomic Datasets

Authors

  • Lorena Etcheverry, Instituto de Computación, Facultad de Ingeniería, Universidad de la República
  • Mariano P. Consens, University of Toronto

DOI:

https://doi.org/10.5753/jidm.2011.1379

Keywords:

XML, data quality, MAGE-ML, functional genomic data standards and public collections, schema evolution

Abstract

Extensive microarray experimental data is available online, which facilitates independent evaluation of experimental
conclusions and enables reuse. Many microarray experiment datasets are published using the MAGE-ML
XML schema, but assessing the quality of published experiments remains challenging because microarray users
have not agreed on a framework for measuring dataset quality.
In this paper, we apply techniques based on DescribeX to analyze public MAGE-ML collections quantitatively
and qualitatively, gaining insights into schema evolution. Our case study shows that DescribeX is a useful tool for
evaluating microarray experiment data quality, and that it enhances the understanding of the instance-level structure of
MAGE-ML datasets and its evolution.

Author Biography

Lorena Etcheverry, Instituto de Computación, Facultad de Ingeniería, Universidad de la República

Teaching assistant, Instituto de Computación

Published

2011-08-12

How to Cite

Etcheverry, L., & Consens, M. P. (2011). Summary-based Comparison of Data Quality across Public MAGE-ML Genomic Datasets. Journal of Information and Data Management, 2(1), 3. https://doi.org/10.5753/jidm.2011.1379

Section

SBBD 2010 Short Papers