Joint Event Extraction with Contextualized Word Embeddings for the Portuguese Language

  • Conference paper
Intelligent Systems (BRACIS 2021)

Abstract

Event Extraction (EE) is the task of identifying mentions of particular event types and their arguments in text, and it is an important and challenging task within Information Extraction (IE). However, very little work has been conducted on this topic for the Portuguese language. In this paper, we propose a neural method for EE, as well as a data resource, to mitigate this research gap. We also present a data augmentation strategy for EE that employs an Open Information Extraction (OIE) system, aiming to overcome the shortage of annotated data for the problem in Portuguese. Our experimental results show that our method is able to predict event types and arguments automatically, and that the proposed data augmentation method, in one of the two evaluated samples, contributes to the performance of the tested models in the subtask of argument role prediction. Further, an implementation of our method is available to the community, as are the models trained in our experiments (https://github.com/FORMAS/TEFE).
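To make the task definition concrete, the following is a minimal sketch of how one extracted event (a trigger, an event type, and role-labelled arguments) might be represented. It is an illustration only, not the representation used in the paper or in the TEFE repository: the class names, the helper `span_text`, and the labels `Closing_price`, `Item`, and `Price` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Span = Tuple[int, int]  # token offsets, end-exclusive


@dataclass
class Argument:
    role: str   # hypothetical role label
    span: Span  # tokens that fill the role


@dataclass
class EventMention:
    event_type: str  # hypothetical event type label
    trigger: Span    # tokens that evoke the event
    arguments: List[Argument] = field(default_factory=list)


def span_text(tokens: List[str], span: Span) -> str:
    """Return the text covered by an end-exclusive token span."""
    start, end = span
    return " ".join(tokens[start:end])


# Whitespace tokenization is a simplification for this sketch.
tokens = "Citadel shares closed at $45.75".split()

event = EventMention(
    event_type="Closing_price",  # hypothetical label
    trigger=(2, 3),              # "closed"
    arguments=[
        Argument(role="Item", span=(0, 2)),   # "Citadel shares"
        Argument(role="Price", span=(4, 5)),  # "$45.75"
    ],
)

for arg in event.arguments:
    print(arg.role, "->", span_text(tokens, arg.span))
```

Under this view, predicting the event type corresponds to labelling the trigger span, and argument role prediction corresponds to assigning a role to each candidate span relative to that trigger.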

Notes

  1. s1: “... the Arabs would have to support Iraq in a fight against their common Israeli enemy.”

  2. s2: “Both companies are allies of Navigation Mixte in its fight against a hostile takeover bid ...”

  3. “said”, “referred”, “announced”, “added”.

  4. “Meridian National Corp. said it sold 750,000 shares of its common stock to the McAlpine family interests, for $1 million, or $1.35 a share.”

  5. https://github.com/kraiyani/Automated-Event-Extraction-Model-for-Multiple-Linked-Portuguese-Documents

  6. “In American Stock Exchange composite trading, Citadel shares closed yesterday at $45.75, down 25 cents”.

  7. Available at: https://github.com/FORMAS/DptOIE

  8. “Citadel shares closed at $45.75”.

  9. “BBC correspondent Karyn Coleman reports from Kosovo.”

  10. “U.S. officials claim they already see signs Saddam Hussein is getting nervous.”

Acknowledgments

Anderson da Silva Brito Sacramento would like to thank the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) for financial support (88887.467864/2019-00).

Author information

Corresponding author

Correspondence to Marlo Souza.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Sacramento, A.d.S.B., Souza, M. (2021). Joint Event Extraction with Contextualized Word Embeddings for the Portuguese Language. In: Britto, A., Valdivia Delgado, K. (eds) Intelligent Systems. BRACIS 2021. Lecture Notes in Computer Science, vol 13074. Springer, Cham. https://doi.org/10.1007/978-3-030-91699-2_34

  • DOI: https://doi.org/10.1007/978-3-030-91699-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-91698-5

  • Online ISBN: 978-3-030-91699-2

  • eBook Packages: Computer Science, Computer Science (R0)