Balancing Known and Unknown in Open-Set Domain Adaptation via Attention

  • André Sacilotti USP
  • Jurandy Almeida USP

Abstract


Open-set unsupervised domain adaptation (OS-UDA) for video action recognition is a critical yet largely underexplored problem. Real-world applications must be robust to distribution shifts between training and deployment data (the domain gap) and capable of identifying actions not seen during training (the open set). We introduce OS-DTAB, a lightweight, plug-and-play Vision Transformer block designed to enhance existing OS-UDA frameworks. OS-DTAB replaces the standard clip-aggregation mechanism with a domain-transferable attention module, compelling the model to focus on spatio-temporal cues that are simultaneously transferable and open-set aware. Our experiments on the HMDB↔UCF benchmark show that OS-DTAB sets a new state of the art, achieving a Harmonic Open-Set (HOS) score that surpasses that of models built on stronger backbones.
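To make the core idea concrete, the sketch below shows what replacing plain clip aggregation (e.g., mean pooling) with an attention module might look like. This is a minimal, illustrative single-head implementation in NumPy: the function name, the learnable video-query design, and all dimensions are assumptions for exposition, not the paper's actual OS-DTAB implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_aggregate(clips, query, Wk, Wv):
    """Aggregate T per-clip features (T, D) into one video-level
    feature (D,) with a learnable query attending over the clips.
    Hypothetical single-head stand-in for a clip-aggregation block."""
    keys = clips @ Wk                              # (T, D) projected keys
    vals = clips @ Wv                              # (T, D) projected values
    scores = keys @ query / np.sqrt(len(query))    # (T,) scaled dot-product
    attn = softmax(scores)                         # attention weights over clips
    return attn @ vals                             # (D,) weighted aggregation

# toy usage: 8 clips, 16-dim features (randomly initialized weights)
T, D = 8, 16
clips = rng.normal(size=(T, D))
query = rng.normal(size=D)                         # learnable video token
Wk = rng.normal(size=(D, D))
Wv = rng.normal(size=(D, D))
video_feat = attention_aggregate(clips, query, Wk, Wv)
```

In contrast to uniform mean pooling, the attention weights let gradients from a downstream domain-adaptation or open-set objective reweight which clips contribute to the video-level representation.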

Published
30/09/2025
SACILOTTI, André; ALMEIDA, Jurandy. Balancing Known and Unknown in Open-Set Domain Adaptation via Attention. In: WORKSHOP DE TRABALHOS DA GRADUAÇÃO - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 38., 2025, Salvador/BA. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 283-286.