Big Data Architectures for FAIR-compliant Repositories: A Systematic Review


The FAIR Principles state that scientific data should be Findable, Accessible, Interoperable, and Reusable, in line with the Open Science movement. However, designing a FAIR-compliant repository is challenging because it must manage a large volume and variety of research data and metadata, which may also be generated at high velocity. This complexity calls for a Software Reference Architecture (SRA) to guide data engineers during implementation. In this paper, we conduct a systematic review of research efforts on architectural solutions for implementing FAIR-compliant repositories. We analyze 323 references from Scopus, ACM, IEEE Xplore, and specialists' recommendations. From this analysis, we identify 7 studies that describe general-purpose big data SRAs, 13 pipelines that implement the FAIR Principles in specific contexts, and 3 FAIR-compliant big data SRAs. We describe their key characteristics and discuss their limitations, highlighting trends and research opportunities.

Keywords: Open Science, FAIR Principles, Big Data, Software Reference Architecture, SRA


