An Ensemble Approach to Facial Deepfake Detection Using Self-Supervised Features
Abstract
Substantial effort has been dedicated to developing methods for detecting deepfake content, especially since the creation of large and diverse datasets with both higher image quality and demographic features. In this scenario, CNN-based approaches achieved good initial success, and their results later improved when they were combined with Vision Transformers. More recently, Foundation Models (FMs) have emerged, improving performance across many visual tasks, including deepfake detection. In particular, combining self-supervised features generated by FMs with CNN-based classifiers has resulted in significant performance gains. However, taking advantage of multiple maps of self-supervised features is not as straightforward as adding more input channels to the classifier. This work therefore explores ensemble techniques that make effective use of these diverse self-supervised feature maps for realistic facial deepfake detection. Our experiments indicate that combining the outputs of different classifiers, each one using a single map of self-supervised features, leads to significant performance improvements. Several committee approaches consistently outperform the individual classifiers, demonstrating the potential of these methods to enhance deepfake detection accuracy.
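The committee idea described above can be illustrated with a minimal sketch. The abstract does not say which combination rules were evaluated, so the two rules below (averaging the predicted probabilities and majority voting) are assumptions chosen only for illustration. The function names, the 0.5 threshold, and the sample scores are likewise hypothetical and do not come from the paper.

```python
from statistics import mean


def soft_vote(probs, threshold=0.5):
    """Average the per-classifier fake probabilities and threshold the mean."""
    return mean(probs) >= threshold


def hard_vote(probs, threshold=0.5):
    """Threshold each classifier separately, then take the strict majority label."""
    votes = sum(p >= threshold for p in probs)
    return votes * 2 > len(probs)


# Hypothetical fake probabilities for one face crop, one score per classifier.
# Each classifier is assumed to consume a different self-supervised feature map.
scores = [0.9, 0.4, 0.45]

is_fake_soft = soft_vote(scores)  # the mean is above 0.5, so the committee says "fake"
is_fake_hard = hard_vote(scores)  # only one of three members votes "fake"
```

With these sample scores the two rules disagree: one confident member is enough to pull the average over the threshold, while majority voting ignores how confident each member is. This is one reason a study might compare several committee rules rather than fix a single one.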