Triple-VAE: A Triple Variational Autoencoder to Represent Events in One-Class Event Detection
Abstract
Events are phenomena that occur at a specific time and place. Detecting them can benefit society, since knowledge can be extracted from these events. Event detection is a multimodal task because events have textual, geographical, and temporal components. Most multimodal approaches in the literature represent events by concatenating these components. These approaches rely on multi-class or binary learning to detect events of interest, which increases the labeling effort: the user must label event classes even when there is no interest in detecting them. In this paper, we present Triple-VAE, an approach that learns a unified representation from textual, spatial, and density modalities through a variational autoencoder, a state-of-the-art technique in representation learning. Triple-VAE obtains event representations suitable for one-class classification, in which users provide labels only for events of interest, thereby reducing the labeling effort. We carried out an experimental evaluation with ten real-world event datasets, four multimodal representation methods, and five evaluation metrics. Triple-VAE outperforms the other three representation methods on all datasets, with statistically significant differences. Triple-VAE is therefore a promising way to represent events in the one-class event detection scenario.
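To make the one-class setting concrete, the following minimal sketch fits a classifier using only events of interest. It is an illustration under stated assumptions, not the classifier evaluated in this paper: the centroid-plus-radius rule, the function names `fit_one_class` and `is_event_of_interest`, the `quantile` parameter, and the toy two-dimensional vectors (standing in for learned event representations) are all hypothetical.

```python
import math


def fit_one_class(positives, quantile=0.9):
    """Fit a centroid and a distance threshold from events of interest only.

    Hypothetical illustration: `positives` is a list of equal-length
    vectors standing in for learned event representations.
    """
    dim = len(positives[0])
    centroid = [sum(v[i] for v in positives) / len(positives) for i in range(dim)]
    # Distances of the training events to the centroid, in ascending order.
    dists = sorted(math.dist(v, centroid) for v in positives)
    # Radius that covers roughly `quantile` of the training events.
    idx = min(len(dists) - 1, int(quantile * len(dists)))
    return centroid, dists[idx]


def is_event_of_interest(vector, centroid, radius):
    """Accept a new event if it lies within the learned radius."""
    return math.dist(vector, centroid) <= radius


# Toy usage: only events of interest are labeled; no negative class is needed.
train = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
centroid, radius = fit_one_class(train)
near = is_event_of_interest([0.5, 0.5], centroid, radius)
far = is_event_of_interest([5.0, 5.0], centroid, radius)
```

The point of the sketch is that no negative examples are needed at training time: the decision boundary is derived solely from the labeled events of interest, which is what reduces the labeling effort.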