Less is More? Investigating Meta-Learning’s Suitability in Sentence Compression for Low-Resource Data
Abstract
Sentence compression is an essential step in text summarization. Unfortunately, the scarcity of labeled data in specific domains restricts the effective training of deep learning models for this problem. In this paper, we present an approach based on Model-Agnostic Meta-Learning (MAML) to tackle this issue and assess the technique's viability for the task, with particular emphasis on a comparison against a fine-tuned BERT model. Our experiments reveal that the simpler approach of fine-tuning a pre-trained language model such as BERT can indeed be more effective in low-resource scenarios, consistently outperforming the meta-learning techniques on this task.
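As a concrete illustration of the approach under comparison, the sketch below implements the MAML inner/outer loop of Finn et al. (2017) for sentence compression framed as per-token keep/delete classification. It is a minimal sketch, not the paper's implementation: the tiny embedding tagger stands in for a BERT encoder, and the random tensors, task counts, and learning rates are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM = 1000, 32             # toy vocabulary/embedding sizes (assumed)
INNER_LR, OUTER_LR = 1e-2, 1e-3   # inner-/outer-loop learning rates (assumed)
INNER_STEPS, TASKS = 1, 4         # adaptation steps and tasks per meta-batch

class TokenTagger(nn.Module):
    """Stand-in encoder producing per-token keep/delete logits."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, DIM)
        self.head = nn.Linear(DIM, 2)

    def forward(self, ids, params=None):
        # Functional forward pass so we can run with adapted "fast" weights.
        if params is None:
            params = dict(self.named_parameters())
        h = F.embedding(ids, params["emb.weight"])
        return F.linear(h, params["head.weight"], params["head.bias"])

def token_loss(logits, labels):
    # Cross-entropy over all token positions in the batch.
    return F.cross_entropy(logits.flatten(0, 1), labels.flatten())

def inner_adapt(model, ids, labels):
    # A few gradient steps on the task's support set, keeping the graph
    # so the outer update can differentiate through the adaptation.
    p = dict(model.named_parameters())
    for _ in range(INNER_STEPS):
        grads = torch.autograd.grad(
            token_loss(model(ids, p), labels),
            list(p.values()), create_graph=True)
        p = {k: v - INNER_LR * g for (k, v), g in zip(p.items(), grads)}
    return p

model = TokenTagger()
meta_opt = torch.optim.Adam(model.parameters(), lr=OUTER_LR)

for step in range(100):            # meta-training iterations
    meta_opt.zero_grad()
    for _ in range(TASKS):
        # Random stand-ins for one task's support/query sentences and
        # keep(1)/delete(0) labels; real tasks would come from source domains.
        s_ids = torch.randint(0, VOCAB, (8, 16)); s_y = torch.randint(0, 2, (8, 16))
        q_ids = torch.randint(0, VOCAB, (8, 16)); q_y = torch.randint(0, 2, (8, 16))
        fast = inner_adapt(model, s_ids, s_y)
        # Query loss through the adapted weights drives the meta-gradient.
        token_loss(model(q_ids, fast), q_y).backward()
    meta_opt.step()

The fine-tuned baseline that the experiments favor skips this meta-learning machinery entirely: the same token-classification head is placed on a pre-trained BERT encoder and trained directly on the small in-domain dataset.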
Keywords:
sentence compression, meta-learning, low-resource data
References
Bansal, T., Gunasekaran, K. P., Wang, T., Munkhdalai, T., and McCallum, A. (2021). Diverse distributions of self-supervised tasks for meta-learning in NLP. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5812–5824, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cer, D., Yang, Y., Kong, S.-y., Hua, N., Limtiaco, N., John, R. S., Constant, N., Guajardo-Cespedes, M., Yuan, S., Tar, C., et al. (2018). Universal sentence encoder. arXiv preprint arXiv:1803.11175.
Filippova, K., Alfonseca, E., Colmenares, C. A., Kaiser, L., and Vinyals, O. (2015). Sentence compression by deletion with LSTMs. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 360–368, Lisbon, Portugal. Association for Computational Linguistics. https://doi.org/10.18653/v1/D15-1042
Filippova, K. and Altun, Y. (2013). Overcoming the lack of parallel data in sentence compression. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 1481–1491, Seattle, Washington, USA. Association for Computational Linguistics.
Finn, C., Abbeel, P., and Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning (ICML), pages 1126–1135. PMLR. https://doi.org/10.48550/arXiv.1703.03400
Gu, J., Wang, Y., Chen, Y., Li, V. O. K., and Cho, K. (2018). Meta-learning for low-resource neural machine translation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3622–3631, Brussels, Belgium. Association for Computational Linguistics. https://doi.org/10.48550/arXiv.1808.08437
Kamigaito, H., Hayashi, K., Hirao, T., and Nagata, M. (2018). Higher-order syntactic attention network for longer sentence compression. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1716–1726, New Orleans, Louisiana. Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-1155
Kamigaito, H. and Okumura, M. (2020). Syntactically look-ahead attention network for sentence compression. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 8050–8057. https://doi.org/10.48550/arXiv.2002.01145
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pages 4171–4186. https://doi.org/10.18653/v1/N19-1423
Lee, H.-y., Li, S.-W., and Vu, T. (2022). Meta learning for natural language processing: A survey. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 666–684, Seattle, United States. Association for Computational Linguistics. https://doi.org/10.48550/arXiv.2205.01500
Li, J., Shang, S., and Shao, L. (2020). MetaNER: Named entity recognition with meta-learning. In Proceedings of The Web Conference 2020, pages 429–440. https://doi.org/10.1145/3366423.3380127
Ma, X., Xu, P., Wang, Z., Nallapati, R., and Xiang, B. (2019). Domain adaptation with BERT-based domain classification and data selection. In Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo), pages 76–83. https://doi.org/10.18653/v1/D19-6109
Mi, F., Huang, M., Zhang, J., and Faltings, B. (2019). Meta-learning for low-resource natural language generation in task-oriented dialogue systems. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), pages 3151–3157. https://doi.org/10.48550/arXiv.1905.05644
Qian, K. and Yu, Z. (2019). Domain adaptive dialog generation via meta learning. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 2639–2649, Florence, Italy. Association for Computational Linguistics. https://doi.org/10.18653/v1/P19-1253
Soares, F. M., da Silva, T. L. C., and de Macêdo, J. F. (2020). Sentence compression on domains with restricted labeled data. In Proceedings of the 12th International Conference on Agents and Artificial Intelligence (ICAART), pages 130–140. https://doi.org/10.5220/0008958301300140
Song, Y., Liu, Z., Bi, W., Yan, R., and Zhang, M. (2019). Learning to customize language model for generation-based dialog systems. CoRR, abs/1910.14326. https://doi.org/10.48550/arXiv.1910.14326
Tas, O. and Kiyani, F. (2017). A survey automatic text summarization. PressAcademia Procedia, 5(1):205–213.
Vinyals, O., Blundell, C., Lillicrap, T., Wierstra, D., et al. (2016). Matching networks for one shot learning. In Advances in Neural Information Processing Systems, volume 29. https://doi.org/10.48550/arXiv.1606.04080
Yu, M., Guo, X., Yi, J., Chang, S., Potdar, S., Cheng, Y., Tesauro, G., Wang, H., and Zhou, B. (2018). Diverse few-shot text classification with multiple metrics. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pages 1206–1215, New Orleans, Louisiana. Association for Computational Linguistics.
Published
25/09/2023
How to Cite
DO R., L. Gustavo Coutinho; DE MACÊDO, José Antônio F.; DA SILVA, Ticiana L. Coelho. Less is More? Investigating Meta-Learning’s Suitability in Sentence Compression for Low-Resource Data. In: SIMPÓSIO BRASILEIRO DE TECNOLOGIA DA INFORMAÇÃO E DA LINGUAGEM HUMANA (STIL), 14., 2023, Belo Horizonte/MG. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 1-10. DOI: https://doi.org/10.5753/stil.2023.233175.