Which Pretext Tasks Matter? A Comparative Study of Self-Supervised ECG-Based Emotion Recognition

  • Kevin Gustavo Montero Quispe (UFAM)
  • Daniel Mitsuaki da Silva Utyiama (UFAM)
  • Eduardo James Pereira Souto (UFAM)

Abstract


The need for large volumes of labeled data limits the scalability of automatic emotion recognition systems based on biosignals. This study proposes a multi-task self-supervised learning (SSL) approach applied to electrocardiogram signals, aiming to reduce the dependence on extensive manual annotation. A convolutional network was pretrained to simultaneously discriminate six synthetic transformations and was subsequently adjusted via supervised fine-tuning with reduced fractions (5%, 25%, 50%) of labels on four public datasets. The results show that combining four or five auxiliary tasks is sufficient to produce effective representations, with performance gains of up to 19 percentage points in the classification of valence, arousal, and stress. With only 25% of the labels, the model reaches near-maximum performance, demonstrating the viability of the proposal in scenarios with limited labeling.
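The abstract describes the approach at a high level: a 1D convolutional encoder is pretrained on a multi-task pretext objective (recognizing which synthetic transformations were applied to an ECG segment) and is then fine-tuned with a small fraction of emotion labels. Below is a minimal PyTorch sketch of such a multi-task pretext model. The transformation names, network depth, and the equal-weighted sum of per-task losses are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a multi-task self-supervised pretext model for ECG.
# Assumptions: a shared 1D-CNN encoder, one binary head per transformation
# ("was this transformation applied?"), and equal-weighted per-task losses.
import torch
import torch.nn as nn

# Assumed set of six pretext transformations (illustrative only).
PRETEXT_TASKS = ["noise", "scaling", "negation", "flipping", "permutation", "time_warp"]

class SharedEncoder(nn.Module):
    """1D convolutional encoder shared by all pretext heads."""
    def __init__(self, in_channels: int = 1, feat_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=32, stride=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=16, stride=2), nn.ReLU(),
            nn.Conv1d(64, feat_dim, kernel_size=8, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )

    def forward(self, x):                # x: (batch, channels, samples)
        return self.net(x).squeeze(-1)   # (batch, feat_dim)

class MultiTaskPretextModel(nn.Module):
    """Shared encoder plus one classification head per pretext transformation."""
    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.encoder = SharedEncoder(feat_dim=feat_dim)
        self.heads = nn.ModuleDict({t: nn.Linear(feat_dim, 2) for t in PRETEXT_TASKS})

    def forward(self, x):
        z = self.encoder(x)
        return {t: head(z) for t, head in self.heads.items()}

def pretext_loss(logits: dict, labels: dict) -> torch.Tensor:
    """Sum of per-task cross-entropy losses (equal weighting assumed)."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(logits[t], labels[t]) for t in PRETEXT_TASKS)

# Toy usage: random signals and random "transformation applied?" labels.
model = MultiTaskPretextModel()
x = torch.randn(8, 1, 2560)                               # e.g. 10 s of ECG at 256 Hz
y = {t: torch.randint(0, 2, (8,)) for t in PRETEXT_TASKS}
loss = pretext_loss(model(x), y)
loss.backward()
```

In a fine-tuning stage along the lines described in the abstract, the pretext heads would be discarded and a small emotion classifier (e.g., for valence, arousal, or stress) would be trained on top of the pretrained encoder using only 5%, 25%, or 50% of the labels.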

Published
29/09/2025
QUISPE, Kevin Gustavo Montero; UTYIAMA, Daniel Mitsuaki da Silva; SOUTO, Eduardo James Pereira. Which Pretext Tasks Matter? A Comparative Study of Self-Supervised ECG-Based Emotion Recognition. In: ENCONTRO NACIONAL DE INTELIGÊNCIA ARTIFICIAL E COMPUTACIONAL (ENIAC), 22., 2025, Fortaleza/CE. Anais [...]. Porto Alegre: Sociedade Brasileira de Computação, 2025. p. 439-450. ISSN 2763-9061. DOI: https://doi.org/10.5753/eniac.2025.13623.