Graph Algorithms for Word Sense Disambiguation in Biomedicine
Abstract
Word Sense Disambiguation (WSD) is an important task in biomedical text mining. Supervised WSD methods achieve the best results, but they are complex and their testing cost is too high. This work presents an experiment on WSD using graph-based approaches, which are unsupervised. Three algorithms were tested and compared with the state of the art. The results indicate that similar performance can be reached at different levels of complexity, which may point to a new approach to this problem.