Metrics for Schema Analysis in Document-Oriented NoSQL Databases

Abstract


The data schema is a crucial part of software development and directly affects the final product. Finding the ideal schema is challenging because of the large number of alternatives. Metrics proposed so far to help choose a schema each focus on a single aspect, either query evaluation or schema evaluation, and do not address both at once. This article proposes a metric that considers both queries and schemas, including subschemas. It also accounts for relationships and attributes, with a weighting coefficient for each. The results show that the metric can identify complex schemas: it assigns them higher scores, while simpler schemas receive lower scores.
Keywords: data modeling, NoSQL, documents, metrics

Published
2024-10-14
VERA-OLIVERA, Harley; HOLANDA, Maristela. Metrics for Schema Analysis in Document-Oriented NoSQL Databases. In: BRAZILIAN SYMPOSIUM ON DATABASES (SBBD), 39., 2024, Florianópolis/SC. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2024. p. 381-393. ISSN 2763-8979. DOI: https://doi.org/10.5753/sbbd.2024.240646.