CiênciaBrasil - The Brazilian Portal of Science and Technology
Abstract
Research social networks are a potentially useful resource for studying science and technology indicators of specific communities (e.g., a country). However, building and analyzing such networks poses challenges beyond those of regular social networks, since data about actors and their relationships are usually dispersed across various sources. In this paper, we present a research social network built from an individual perspective by gathering data from a Brazilian curricula vitae repository. We describe its architecture and the solutions adopted for data collection, extraction, and deduplication, and for materializing and visualizing the network.