Self-supervised learning for fully unsupervised re-identification in real-world applications

  • Gabriel Bertocco UNICAMP
  • Fernanda Andaló UNICAMP
  • Anderson Rocha UNICAMP

Abstract


Re-Identification (ReID) is vital for real-world applications such as AI-powered security, event understanding, and smart city development. It aims to retrieve all instances of a given person or object across a network of non-overlapping cameras, based solely on visual appearance. The task is challenging due to occlusions, viewpoint changes, and background similarities. Supervised methods perform well but rely on costly, biased annotations, which limits their scalability. To address this, we propose self-supervised learning algorithms for Unsupervised ReID that extend to other modalities, such as Text Authorship Verification, which are likewise marked by high intra-class variation and low inter-class distinction. Our work introduces three fully unsupervised ReID methods: one using camera labels, one without side information, and one scalable to larger datasets. We also present a fourth, hybrid method for long-range recognition under distortions. These solutions enhance generalization and enable real-world applications in forensics and biometrics. We have released open-source code and demonstrated practical impact, including a consultancy project for the Public Prosecutor’s Office of the State of São Paulo (MPSP) and the House of Representatives (Brazilian Federal Chamber and Senate). This research was recognized by the Brazilian Computing Society (SBC) as the best Ph.D. thesis defended in Brazil in 2024.
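
To make the retrieval task described above concrete, the following is a minimal sketch of ranking a gallery against one query by appearance features. It is an illustration under stated assumptions, not the methods proposed in this work: the feature vectors are assumed to come from some already trained encoder, cosine similarity is assumed as the matching score, and the names `cosine_similarity` and `rank_gallery` and the toy camera identifiers are hypothetical. Skipping gallery items from the query's own camera is likewise an assumption that mirrors the cross-camera setting.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no appearance information
    return dot / (norm_a * norm_b)


def rank_gallery(query_feat, query_cam, gallery):
    """Return gallery indices ordered from most to least similar to the query.

    `gallery` is a list of (feature_vector, camera_id) pairs. Items captured
    by the query's own camera are skipped (an assumption for this sketch).
    """
    scored = [
        (cosine_similarity(query_feat, feat), idx)
        for idx, (feat, cam) in enumerate(gallery)
        if cam != query_cam
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored]


# Toy data: three-dimensional features stand in for encoder outputs.
gallery = [
    ([0.9, 0.1, 0.0], "cam_b"),  # index 0: close to the query
    ([0.0, 1.0, 0.0], "cam_b"),  # index 1: orthogonal to the query
    ([1.0, 0.0, 0.0], "cam_a"),  # index 2: same camera as the query, skipped
    ([0.7, 0.7, 0.0], "cam_c"),  # index 3: partially similar
]
ranking = rank_gallery([1.0, 0.0, 0.0], "cam_a", gallery)
print(ranking)
```

In a fully unsupervised setting, no identity labels are available to train the encoder that produces these features, which is the gap the self-supervised methods summarized above aim to close.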

Published
30 September 2025
BERTOCCO, Gabriel; ANDALÓ, Fernanda; ROCHA, Anderson. Self-supervised learning for fully unsupervised re-identification in real-world applications. In: WORKSHOP DE TESES E DISSERTAÇÕES - CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI), 38., 2025, Salvador/BA. Proceedings [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 15-21.