Detecting faces in specific scenarios: Systematic Literature Review

Bruno Gonçalves Dias; Victor Soares Ivamoto; Clodoaldo Aparecido de Moraes Lima

doi:10.5753/eniac.2023.234273

Bruno Gonçalves Dias, Universidade de São Paulo
Victor Soares Ivamoto, Universidade de São Paulo
Clodoaldo Aparecido de Moraes Lima, Universidade de São Paulo

Abstract

Face detection is a core component of many applications in biometrics, surveillance, human-robot interaction, and other fields. Although the field has made significant progress over the past decade, gaps remain, particularly in specific scenarios such as partial occlusion and variations in lighting, pose, and scale. This work evaluates recent studies on face detection in the wild through a systematic literature review, with a focus on how scenario-specific information is used. A total of forty-five papers were analyzed to provide an overview of the field, including the scenarios each study addresses.

Keywords: Face detection, Review, Scenarios, Convolutional Neural Network
