Identifying named entity from researcher curricula
Abstract
Named Entity Recognition (NER) is an essential task for recognizing real-world entities scattered throughout a document. It is widely used to detect people, institutions, and places. In a repository of researchers' curricula, NER can help clarify the context of a given document; for example, it can identify which people and institutions appear in a given researcher's curriculum. This information is fundamental to identifying experts for a project and to finding collaborations among researchers. In this paper, we evaluate the effectiveness of entity extraction methods, both vocabulary-based and model-based, for identifying entities in scientific publications. We analyze existing NER tools and propose a procedure for applying NER to curricula from the Brazilian Lattes Curricula platform.
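To make the vocabulary-based family of methods concrete, the following is a minimal sketch of dictionary (gazetteer) lookup over curriculum text. It is not the procedure evaluated in the paper: the vocabulary entries, the label names, and the `extract_entities` helper are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical vocabulary (gazetteer); entries and labels are illustrative
# assumptions, not data taken from the Lattes platform.
VOCABULARY = {
    "PERSON": ["Maria Silva"],
    "INSTITUTION": ["Universidade de São Paulo", "CNPq"],
    "PLACE": ["Brasília"],
}


def extract_entities(text, vocabulary=VOCABULARY):
    """Return (surface form, label, start offset) for each vocabulary match."""
    matches = []
    for label, names in vocabulary.items():
        for name in names:
            # Lookarounds keep a name from matching inside a longer word.
            pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                matches.append((m.group(0), label, m.start()))
    # Report matches in the order they appear in the text.
    return sorted(matches, key=lambda item: item[2])


sample = "Maria Silva holds a CNPq grant at the Universidade de São Paulo."
entities = extract_entities(sample)
```

A lookup of this kind only finds names already present in the vocabulary, which is the usual motivation for comparing it against model-based extractors that can generalize to unseen names.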