Less is More? Investigating Meta-Learning’s Suitability in Sentence Compression for Low-Resource Data

Abstract


The sentence compression task is essential to the text summarization process. Unfortunately, the lack of labeled data for specific domains restricts the effective training of deep learning models for this problem. In this paper, we present an approach that uses the meta-learning algorithm MAML (Model-Agnostic Meta-Learning) to tackle this issue, and we assess the viability of the technique for the task, with particular emphasis on a comparison against a fine-tuned BERT model. Our experiments reveal that the simpler approach of fine-tuning a language model such as BERT can indeed be more effective in low-resource scenarios, consistently outperforming the meta-learning techniques for this particular task.
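For readers unfamiliar with MAML (Finn et al., 2017), the sketch below illustrates its inner/outer optimization loop on a toy keep/delete token classifier (the usual deletion-based framing of sentence compression). It is a minimal illustration only: the model, task sampler, dimensions, and hyperparameters are assumptions for the example, not the setup used in the paper.

```python
# Minimal MAML sketch: a toy linear keep/delete classifier stands in for the
# actual compression model. All names, shapes, and hyperparameters here are
# illustrative assumptions, not the authors' experimental configuration.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

INPUT_DIM, NUM_LABELS = 16, 2           # e.g. token features -> keep/delete
INNER_LR, OUTER_LR, INNER_STEPS = 0.1, 0.01, 1

# Meta-parameters shared across tasks (the initialization MAML learns).
params = {
    "w": torch.randn(INPUT_DIM, NUM_LABELS, requires_grad=True),
    "b": torch.zeros(NUM_LABELS, requires_grad=True),
}
meta_opt = torch.optim.Adam(params.values(), lr=OUTER_LR)

def forward(p, x):
    return x @ p["w"] + p["b"]

def sample_task():
    # Stand-in for sampling one low-resource "task" (e.g. a domain):
    # a small support set for adaptation and a query set for evaluation.
    x_s, y_s = torch.randn(8, INPUT_DIM), torch.randint(0, NUM_LABELS, (8,))
    x_q, y_q = torch.randn(8, INPUT_DIM), torch.randint(0, NUM_LABELS, (8,))
    return (x_s, y_s), (x_q, y_q)

for step in range(100):                  # outer (meta) loop
    meta_opt.zero_grad()
    meta_loss = 0.0
    for _ in range(4):                   # a meta-batch of tasks
        (x_s, y_s), (x_q, y_q) = sample_task()
        fast = dict(params)
        for _ in range(INNER_STEPS):     # inner loop: adapt on the support set
            loss = F.cross_entropy(forward(fast, x_s), y_s)
            grads = torch.autograd.grad(loss, list(fast.values()), create_graph=True)
            fast = {k: v - INNER_LR * g for (k, v), g in zip(fast.items(), grads)}
        # Query loss of the adapted weights drives the update of the initialization.
        meta_loss = meta_loss + F.cross_entropy(forward(fast, x_q), y_q)
    meta_loss.backward()
    meta_opt.step()
```

The key idea the paper evaluates is visible in the loop structure: the inner step adapts a copy of the shared initialization on a task's small support set, while the query loss of the adapted copy is backpropagated through that adaptation to update the initialization itself. Fine-tuning BERT, by contrast, skips the meta-level entirely and simply continues gradient descent from the pretrained weights on the target-domain data.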
Keywords: sentence compression, meta-learning, low-resource data

Published
25/09/2023
DO R., L. Gustavo Coutinho; DE MACÊDO, José Antônio F.; DA SILVA, Ticiana L. Coelho. Less is More? Investigating Meta-Learning’s Suitability in Sentence Compression for Low-Resource Data. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 1-10. DOI: https://doi.org/10.5753/stil.2023.233175.