Generalizing over data sets: a preliminary study with BERT for Natural Language Inference

  • Rubem G. Nanclarez USP
  • Norton T. Roman USP
  • Fernando J. V. da Silva N2VEC Tecnologia


Natural language inference is the task of automatically identifying whether a given text (the premise) implies another (the hypothesis). Among its many possible applications, it is especially relevant in the legal field, where recognizing textual entailment between legal sentences has been the focus of recent research efforts. In this work, we evaluated BERT for natural language inference by running experiments on two corpora, a larger one with texts from multiple domains and a smaller one of legal sentences, and comparing the results. We also conducted a cross-experiment, training on the larger corpus and testing on the legal corpus. We obtained a mean accuracy of 88.91% on the multi-domain corpus, a value comparable to related work. However, the same technique yielded lower scores on the legal corpus and in the cross-experiment.
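
The evaluation described above can be illustrated with a minimal sketch. It is not the authors' code: the word-overlap predictor merely stands in for a fine-tuned BERT classifier, the sentence pairs and label names are made up for illustration, and averaging accuracy over repeated runs is an assumption, since the abstract does not say how the mean was computed.

```python
from statistics import mean


def accuracy(gold, predicted):
    """Fraction of examples whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def evaluate(predict, test_set):
    """Score a predictor on a list of (premise, hypothesis, label) triples."""
    gold = [label for _, _, label in test_set]
    predicted = [predict(premise, hypothesis) for premise, hypothesis, _ in test_set]
    return accuracy(gold, predicted)


def mean_accuracy(per_run_accuracies):
    """Average accuracy over several runs (assumed protocol, see lead-in)."""
    return mean(per_run_accuracies)


def overlap_baseline(premise, hypothesis):
    """Hypothetical stand-in for a trained model: predicts entailment only
    when every hypothesis word also occurs in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    return "entailment" if hypothesis_words <= premise_words else "not_entailment"


# Toy, made-up examples standing in for the multi-domain and legal corpora.
general_test = [
    ("A man is playing a guitar on stage", "A man is playing a guitar", "entailment"),
    ("A dog runs in the park", "A cat sleeps on the sofa", "not_entailment"),
]
legal_test = [
    ("The seller must deliver the goods within thirty days",
     "The buyer may cancel the contract", "not_entailment"),
    ("A contract signed under duress is voidable",
     "A contract signed under duress can be voided", "entailment"),
]

# The same predictor is scored on both test sets; in the cross-experiment the
# predictor would be a model trained only on the larger corpus.
general_score = evaluate(overlap_baseline, general_test)
legal_score = evaluate(overlap_baseline, legal_test)
print(f"multi-domain: {general_score:.2f}, legal: {legal_score:.2f}")
```

Keeping the scoring function independent of the predictor means the in-domain and cross-domain settings differ only in which training data produced the model being passed in.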


NANCLAREZ, Rubem G.; ROMAN, Norton T.; SILVA, Fernando J. V. da. Generalizing over data sets: a preliminary study with BERT for Natural Language Inference. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 19., 2022, Campinas/SP. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2022. p. 602-611. ISSN 2763-9061. DOI: