Facial Expressions Classification with Ensembles of Convolutional Neural Networks and Smart Voting
Abstract
Facial expressions are an important factor in human social interaction. Technologies that automatically interpret and respond to facial expressions already have a wide range of applications, from antidepressant drug testing to fatigue analysis of drivers and pilots. In this context, this work presents a model for automatic facial expression classification trained on the FER2013 dataset from the Challenges in Representation Learning contest, which consists of spontaneous facial expressions captured in uncontrolled environments. The proposed method is an ensemble of Convolutional Neural Networks whose outputs are combined by a non-trivial voting system based on a learned model, eXtreme Gradient Boosting (XGBoost). To validate the proposed model, K-fold cross-validation and the micro-averaged F1 score were used to support the robustness and reliability of the results, which are competitive with state-of-the-art works.
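As a rough illustration of the voting and evaluation ideas named in the abstract, the following standard-library Python sketch shows a trivial majority vote (the baseline that a learned voter replaces), how per-network class probabilities could be concatenated into one input row for a meta-classifier, a contiguous K-fold split, and the micro-averaged F1 score. All function names are hypothetical and do not come from the paper. Training the XGBoost meta-classifier itself is deliberately left out, and the paper's actual features and fold setup may differ.

```python
from collections import Counter


def majority_vote(votes):
    """Trivial voting baseline: the most common label among the CNNs' predictions."""
    return Counter(votes).most_common(1)[0][0]


def stack_features(per_model_probs):
    """Concatenate each CNN's class-probability vector into one meta-feature row.

    Assumption: a learned voter (e.g. a gradient-boosted model) would be trained
    on rows like this instead of counting votes.
    """
    return [p for probs in per_model_probs for p in probs]


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of (train, validation) indices."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP, FP and FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in set(y_true) | set(y_pred):
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label:
                fp += 1
            elif t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Three hypothetical CNNs vote on one image; two of them agree on class 3.
    print(majority_vote([3, 3, 5]))
    # Two hypothetical CNNs, two classes each: one four-value meta-feature row.
    print(stack_features([[0.7, 0.3], [0.4, 0.6]]))
    # Ten samples, three folds: validation folds of size 4, 3 and 3.
    print([len(val) for _, val in k_fold_indices(10, 3)])
    print(micro_f1([0, 1, 2, 2], [0, 2, 2, 2]))
```

For single-label multiclass problems such as this one, every misclassification adds exactly one false positive and one false negative, so the micro-averaged F1 score coincides with plain accuracy.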