Coupling for Coreference Resolution in a Never-ending Learning System

Authors

  • F. Quécole Universidade Federal de São Carlos
  • M.C. Duarte Bradesco Bank S.A.
  • E.R. Hruschka Universidade Federal de São Carlos

DOI:

https://doi.org/10.5753/jidm.2018.2048

Keywords:

coreference, coupling, machine learning, never-ending learning, ensemble

Abstract

Never-Ending Language Learning (NELL) is a system that attempts to learn how to learn from the Web every day, autonomously. Maintaining high precision is key to keeping NELL’s learning active and improving day by day. One of the challenges for NELL is to correctly identify different noun phrases that denote the same concept, in order to maintain the cohesion of the knowledge base. This article investigates coupling as an approach to improving coreference resolution in NELL. To that end, several coupled algorithms and simple ensemble methods that consider semantic and morphological features were compared with results previously obtained without coupling. The results presented in this article confirm empirically that the coupling strategy is a useful approach for achieving better coverage and accuracy in NELL’s knowledge base.
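
As a rough illustration of the idea in the abstract, the sketch below couples a morphological view and a semantic view of a noun-phrase pair and accepts the pair as coreferent only when both views agree. It is not the authors' algorithm: the function names, the string-similarity measure, the category sets, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def morphological_score(a: str, b: str) -> float:
    """Surface similarity of two noun phrases (assumed measure: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_score(cats_a: set, cats_b: set) -> float:
    """Jaccard overlap of the (hypothetical) category sets of two noun phrases."""
    union = cats_a | cats_b
    return len(cats_a & cats_b) / len(union) if union else 0.0


def coupled_coreferent(a: str, b: str, cats_a: set, cats_b: set,
                       threshold: float = 0.5) -> bool:
    """Accept the pair only if both views pass the (assumed) threshold."""
    return (morphological_score(a, b) >= threshold
            and semantic_score(cats_a, cats_b) >= threshold)


# Both views agree: similar strings and the same category.
print(coupled_coreferent("New York City", "New York", {"city"}, {"city"}))
# Identical strings, but the semantic view vetoes the match.
print(coupled_coreferent("Apple", "Apple", {"company"}, {"fruit"}))
```

In this toy setting the semantic view can veto a match that looks right on the surface, which is the kind of mutual constraint that coupling is meant to provide.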

Published

2018-10-01

How to Cite

Quécole, F., Duarte, M., & Hruschka, E. (2018). Coupling for Coreference Resolution in a Never-ending Learning System. Journal of Information and Data Management, 9(2), 124. https://doi.org/10.5753/jidm.2018.2048

Section

KDMILE 2017