Text Representation through Multimodal Variational Autoencoder for One-Class Learning

Marcos Paulo Silva Gôlo; Ricardo Marcondes Marcacini

doi:10.5753/webmedia_estendido.2023.233158

Marcos Paulo Silva Gôlo USP http://orcid.org/0000-0002-9093-8195
Ricardo Marcondes Marcacini USP

Abstract

Multi-class learning (MCL) methods for Automatic Text Classification (ATC) require labeled examples for every class. MCL therefore fails when class information is not well defined, and it demands considerable labeling effort. One-Class Learning (OCL) can mitigate these limitations because training uses instances from only one class (the interest class), which reduces the labeling effort and makes ATC more suitable for open-domain applications. However, OCL is more challenging because of the lack of counterexamples. Moreover, most studies use unimodal representations, even though different domains contain other types of information (modalities). This study therefore proposes the Multimodal Variational Autoencoder (MVAE) for OCL. MVAE is a multimodal method that learns a new representation from more than one modality and adequately captures the characteristics of the interest class. MVAE explores semantic, density, linguistic, and spatial information modalities. The main contribution is a new multimodal method for representation learning in OCL scenarios that trains on few instances and achieves state-of-the-art results in three domains.

Keywords: Text Classification, One-Class Learning, Multimodal Variational Autoencoder
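The one-class setting described in the abstract can be illustrated with a minimal sketch. It is not the MVAE itself: plain concatenation (early fusion) stands in for the learned multimodal representation, and a centroid-plus-radius rule stands in for the one-class classifier. The names `fuse` and `CentroidOneClass`, the 0.95 quantile, and the synthetic "semantic" and "density" arrays are all assumptions made for illustration.

```python
import numpy as np


def fuse(modalities):
    """Early fusion: L2-normalise each modality row-wise, then concatenate.

    Assumption: this simple fusion replaces the representation that the
    MVAE would learn from the modalities.
    """
    normed = []
    for m in modalities:
        norms = np.linalg.norm(m, axis=1, keepdims=True)
        normed.append(m / np.maximum(norms, 1e-12))
    return np.hstack(normed)


class CentroidOneClass:
    """Hypothetical one-class model: a centroid and a distance threshold."""

    def __init__(self, quantile=0.95):
        self.quantile = quantile

    def fit(self, X):
        # Only interest-class instances are seen here: no counterexamples.
        self.center_ = X.mean(axis=0)
        dists = np.linalg.norm(X - self.center_, axis=1)
        self.radius_ = np.quantile(dists, self.quantile)
        return self

    def predict(self, X):
        # +1 = interest class, -1 = not the interest class.
        dists = np.linalg.norm(X - self.center_, axis=1)
        return np.where(dists <= self.radius_, 1, -1)


rng = np.random.default_rng(0)

# Synthetic stand-ins for two modalities of the interest class.
semantic_train = rng.normal(loc=1.0, scale=0.1, size=(50, 8))
density_train = rng.normal(loc=1.0, scale=0.1, size=(50, 3))
X_train = fuse([semantic_train, density_train])
model = CentroidOneClass().fit(X_train)

# Unseen interest-class instances and instances from elsewhere.
X_in = fuse([rng.normal(1.0, 0.1, size=(20, 8)),
             rng.normal(1.0, 0.1, size=(20, 3))])
X_out = fuse([rng.normal(-1.0, 0.1, size=(20, 8)),
              rng.normal(-1.0, 0.1, size=(20, 3))])

pred_train = model.predict(X_train)
pred_in = model.predict(X_in)
pred_out = model.predict(X_out)
print("accepted interest-class fraction:", (pred_in == 1).mean())
print("rejected outside fraction:", (pred_out == -1).mean())
```

The point of the sketch is the training signature: `fit` receives only interest-class data, and the decision boundary is derived from that class alone.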
