research-article

On the Use of Early Fusion Operators on Heterogeneous Graph Neural Networks for One-Class Learning

Published: 23 October 2023

ABSTRACT

Multimodal data fusion generates robust, unified representations by combining supplementary and complementary information from different modalities, such as audio, image, and text. Data fusion strategies have been explored for decades, ranging from simple concatenation of the modalities' features to vector fusion operators (sum, average, subtraction, multiplication, etc.) applied between feature vectors in the latent space of each modality. However, existing studies do not investigate multimodal fusion operators for heterogeneous graphs, which are powerful representations for modeling real-world data through a structure that captures the relations between different node types. These representations are suited to important multimedia-related tasks, such as classification, recommendation, summarization, web sensing, and content-based retrieval. This paper presents a Graph Neural Network (GNN) method for heterogeneous graphs that explores different types of early fusion operators to deal with multiple modalities. We evaluate the proposal's performance with different early fusion operators in one-class learning, a popular learning approach for real-world applications. A statistical analysis of the experimental results shows that early fusion operators improve the F1-score of GNNs on heterogeneous graphs. The subtraction, multiplication, and minimum operators outperform the other operators. Thus, we argue that our proposal of early fusion operators in heterogeneous graph neural networks leads to improved performance and is a competitive alternative to the commonly used concatenation technique and to costly manual approaches for combining modalities.
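
To make the operators named in the abstract concrete, the following is a minimal sketch of element-wise early fusion between two modality feature vectors. It is an illustration only, not the paper's implementation: the function name `fuse`, the `FUSION_OPERATORS` table, and the example vectors are all hypothetical, and the sketch assumes both modalities have already been projected to vectors of the same length (concatenation is the only operator here that does not need that).

```python
from typing import Callable, Dict, List

Vector = List[float]

# Element-wise operators named in the abstract. The table itself is an
# illustrative assumption, not code from the paper.
FUSION_OPERATORS: Dict[str, Callable[[float, float], float]] = {
    "sum": lambda a, b: a + b,
    "average": lambda a, b: (a + b) / 2.0,
    "subtraction": lambda a, b: a - b,
    "multiplication": lambda a, b: a * b,
    "minimum": min,
}


def fuse(x: Vector, y: Vector, operator: str) -> Vector:
    """Fuse two modality feature vectors into a single vector."""
    if operator == "concatenation":
        # Baseline: output length is len(x) + len(y).
        return list(x) + list(y)
    if operator not in FUSION_OPERATORS:
        raise ValueError(f"unknown fusion operator: {operator}")
    if len(x) != len(y):
        raise ValueError("element-wise fusion needs vectors of equal length")
    op = FUSION_OPERATORS[operator]
    # Element-wise fusion keeps the original dimensionality.
    return [op(a, b) for a, b in zip(x, y)]


# Hypothetical text and image embeddings for one graph node.
text_vec = [0.5, 2.0, -1.0]
image_vec = [1.5, 1.0, 3.0]

fused_sub = fuse(text_vec, image_vec, "subtraction")
fused_mul = fuse(text_vec, image_vec, "multiplication")
fused_min = fuse(text_vec, image_vec, "minimum")
fused_cat = fuse(text_vec, image_vec, "concatenation")
```

One practical difference this sketch shows: the element-wise operators keep the fused vector at the size of a single modality, while concatenation grows it with every modality added. In a pipeline like the one the abstract describes, a fused vector of this kind would presumably serve as the initial node feature passed to the GNN.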


Published in

WebMedia '23: Proceedings of the 29th Brazilian Symposium on Multimedia and the Web
October 2023
285 pages
ISBN: 9798400709081
DOI: 10.1145/3617023

Copyright © 2023 ACM

Publisher

Association for Computing Machinery, New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

Overall Acceptance Rate: 270 of 873 submissions, 31%