Text Representation through Multimodal Variational Autoencoder for One-Class Learning

  • Marcos Paulo Silva Gôlo USP
  • Ricardo Marcondes Marcacini USP


Multi-class learning (MCL) methods for Automatic Text Classification (ATC) require labeled instances for every class. MCL fails when the classes are not well defined, and labeling instances for all of them demands great effort. One-Class Learning (OCL) can mitigate these limitations because training uses only instances from one class (the interest class), which reduces the labeling effort and makes ATC more appropriate for open-domain applications. However, OCL is more challenging because no counterexamples are available for model training, so it requires more robust representations. Even so, most studies use unimodal representations, although different domains contain other information that can be used as modalities. Thus, this study proposes the Multimodal Variational Autoencoder (MVAE) for OCL. MVAE is a multimodal method that learns a new representation from more than one modality and adequately captures the characteristics of the interest class. MVAE explores semantic, density, linguistic, and spatial information modalities. The main contributions are: (i) a multimodal method for ATC through OCL; (ii) MVAE for fake news detection; (iii) detection of relevant reviews via MVAE; and (iv) event sensing through MVAE.
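
The one-class setting described above can be illustrated with a minimal sketch. It is not the MVAE itself: plain concatenation stands in for the learned multimodal representation, and a centroid-plus-distance-threshold rule stands in for an OCL algorithm. The function names (`fuse`, `fit_one_class`, `predict`), the `quantile` parameter, and the toy feature values are all hypothetical and chosen only for illustration.

```python
import math


def fuse(modalities):
    """Concatenate one feature vector per modality (assumed stand-in for MVAE)."""
    fused = []
    for vec in modalities:
        fused.extend(vec)
    return fused


def fit_one_class(train_vectors, quantile=0.95):
    """Fit a centroid and a distance threshold from interest-class data only."""
    n = len(train_vectors)
    dim = len(train_vectors[0])
    centroid = [sum(v[i] for v in train_vectors) / n for i in range(dim)]
    dists = sorted(math.dist(v, centroid) for v in train_vectors)
    # Threshold: the distance within which `quantile` of training instances fall.
    idx = min(n - 1, max(0, math.ceil(quantile * n) - 1))
    return centroid, dists[idx]


def predict(vector, centroid, threshold):
    """Return True if the instance is assigned to the interest class."""
    return math.dist(vector, centroid) <= threshold


# Toy data: each instance has a hypothetical 2-d "semantic" vector and a
# hypothetical 1-d "density" feature. Only interest-class instances are used.
train = [
    fuse([[0.9, 1.0], [0.10]]),
    fuse([[1.0, 1.1], [0.20]]),
    fuse([[1.1, 0.9], [0.10]]),
    fuse([[1.0, 1.0], [0.15]]),
]
centroid, threshold = fit_one_class(train)

near = fuse([[1.0, 1.0], [0.12]])   # resembles the training instances
far = fuse([[3.0, -2.0], [0.90]])   # far from every training instance
print(predict(near, centroid, threshold), predict(far, centroid, threshold))
```

No counterexamples are needed at training time; the decision boundary is derived solely from the interest class, which is what makes the quality of the fused representation so important in OCL.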


GÔLO, Marcos Paulo Silva; MARCACINI, Ricardo Marcondes. Text Representation through Multimodal Variational Autoencoder for One-Class Learning. In: CONCURSO DE TESES E DISSERTAÇÕES (CTD), 36., 2023, João Pessoa/PB. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2023. p. 148-157. ISSN 2763-8820. DOI: https://doi.org/10.5753/ctd.2023.229471.