Self-supervised learning for fully unsupervised re-identification in real-world applications
Abstract
Re-Identification (ReID) enables real-world applications such as AI-powered surveillance, criminal identification, event understanding, and smart city development. However, it remains challenging due to occlusions, viewpoint changes, and background similarities. Supervised methods perform well but rely on costly, biased annotations, limiting scalability. To address this, we propose self-supervised algorithms for Unsupervised ReID (U-ReID), extendable to modalities such as Text Authorship Verification, tackling high intra-class variation and low inter-class distinction. Our work introduces three fully unsupervised ReID methods: one using camera labels, one without side information, and one scalable to large datasets. We also present a fourth hybrid method for long-range recognition under distortions. These solutions enhance generalization and enable real-world applications in forensics and biometrics.
Published
2025-07-20
How to Cite
BERTOCCO, Gabriel C.; ANDALÓ, Fernanda A.; ROCHA, Anderson. Self-supervised learning for fully unsupervised re-identification in real-world applications. In: THESIS AND DISSERTATION CONTEST (CTD), 38., 2025, Maceió/AL. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 65-74. ISSN 2763-8820. DOI: https://doi.org/10.5753/ctd.2025.8371.
