Data collection and visualization of researchers' scientific production

Abstract


Data that reflect the scientific production of researchers are valuable for many applications. Several repositories, such as DBLP, ResearchGate, and Google Scholar, index articles and make them available for consultation. Although these data are publicly available, collecting them and persisting them locally can benefit specific applications. This article proposes a collector that gathers data from three public repositories (DBLP, ResearchGate, and Google Scholar) and persists them in a relational database. A visualization interface for the collected data is also presented.
Keywords: Crawlers, scientific production, data extraction
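
As an illustration of the persistence step described in the abstract, the following is a minimal sketch of how collected records might be stored in a relational database. It is not the schema or code used in the paper: the table names, column names, the `open_store` and `save_record` helpers, and the shape of the `record` dictionary are all assumptions made for this example, and SQLite stands in for whatever database engine the authors used.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an illustrative
# assumption, not the schema described in the paper.
SCHEMA = """
CREATE TABLE IF NOT EXISTS researcher (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS publication (
    id     INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    year   INTEGER NOT NULL,
    source TEXT NOT NULL,              -- e.g. 'DBLP', 'Google Scholar'
    UNIQUE (title, year, source)
);
CREATE TABLE IF NOT EXISTS authorship (
    researcher_id  INTEGER NOT NULL REFERENCES researcher(id),
    publication_id INTEGER NOT NULL REFERENCES publication(id),
    PRIMARY KEY (researcher_id, publication_id)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def save_record(conn, record):
    """Persist one collected record; saving the same record twice is a no-op.

    `record` is assumed to be a dict with the keys 'title', 'year',
    'source', and 'authors' (a list of names); all four are required.
    """
    key = (record["title"], record["year"], record["source"])
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO publication (title, year, source) "
            "VALUES (?, ?, ?)",
            key,
        )
        pub_id = conn.execute(
            "SELECT id FROM publication "
            "WHERE title = ? AND year = ? AND source = ?",
            key,
        ).fetchone()[0]
        for name in record["authors"]:
            conn.execute(
                "INSERT OR IGNORE INTO researcher (name) VALUES (?)", (name,)
            )
            res_id = conn.execute(
                "SELECT id FROM researcher WHERE name = ?", (name,)
            ).fetchone()[0]
            conn.execute(
                "INSERT OR IGNORE INTO authorship "
                "(researcher_id, publication_id) VALUES (?, ?)",
                (res_id, pub_id),
            )


if __name__ == "__main__":
    conn = open_store()
    # Placeholder values, not real collected data.
    save_record(conn, {
        "title": "Example title",
        "year": 2021,
        "source": "DBLP",
        "authors": ["Author A", "Author B"],
    })
    for row in conn.execute(
        "SELECT source, COUNT(*) FROM publication GROUP BY source"
    ):
        print(row)
```

Keeping authors and publications in separate tables joined by an authorship table is one common way to let the same researcher appear on many papers without duplicating names. In this sketch the uniqueness constraints keep a repeated crawl of the same source from creating duplicate rows, although matching the same paper across different repositories would need additional logic.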

Published
2021-10-04
BRANCO, Arthur M.; DORNELES, Carina F. Data collection and visualization of researchers' scientific production. In: DATASET SHOWCASE WORKSHOP (DSW), 3., 2021, Rio de Janeiro. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2021. p. 101-106. DOI: https://doi.org/10.5753/dsw.2021.17418.