RetailNet: A deep learning approach for people counting and hot spots detection in retail stores
Customer behavior analysis is essential for retailers: it allows them to optimize store performance, enhance customer experience, reduce operational costs, and consequently increase profitability. Nevertheless, little attention has been given to computer vision approaches that automatically extract from images the information that could be of great value to retailers. In this paper, we present a low-cost deep learning approach that estimates the number of people in a retail store in real time and detects and visualizes hot spots. Only an inexpensive RGB camera, such as a surveillance camera, is required. To solve the people counting problem, we employ a supervised learning approach based on a Convolutional Neural Network (CNN) regression model. We also present a four-channel image representation, named the RGBP image, composed of the conventional RGB image and an extra binary image P that indicates, for each pixel, whether a person is visible there. To extract this information, we developed a foreground/background detection method that takes into account the peculiarities of how people behave in retail stores. The P image is also used to detect the hot spots of the store, which can later be analyzed visually. Several experiments were conducted to validate, evaluate, and compare our approach using a dataset of videos collected from a surveillance camera placed in a real shoe retail store. The results show that our approach is sufficiently robust to be used in real-world situations and that it outperforms straightforward CNN approaches.
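The RGBP representation and the use of P for hot spots can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `make_rgbp` and `hot_spot_map` are hypothetical, the binary person mask is assumed to be given (the foreground/background detection method is not reproduced here), the scaling of RGB values to [0, 1] is an assumption, and treating a hot spot score as the fraction of frames in which a pixel is occupied is one plausible reading of the description.

```python
import numpy as np


def make_rgbp(rgb, person_mask):
    """Stack an RGB frame (H, W, 3) and a binary mask (H, W) into (H, W, 4).

    Assumption: RGB values are 8-bit and are scaled to [0, 1];
    the P channel is 1.0 where a person is visible and 0.0 elsewhere.
    """
    rgb = np.asarray(rgb, dtype=np.float32) / 255.0
    p = (np.asarray(person_mask) > 0).astype(np.float32)
    if rgb.ndim != 3 or rgb.shape[2] != 3 or rgb.shape[:2] != p.shape:
        raise ValueError("expected rgb of shape (H, W, 3) and mask of shape (H, W)")
    # Append P as a fourth channel after R, G and B.
    return np.concatenate([rgb, p[..., np.newaxis]], axis=-1)


def hot_spot_map(person_masks):
    """Return, per pixel, the fraction of frames in which a person was visible.

    Assumption: a hot spot is an area with a high occupancy fraction.
    """
    stack = np.stack([np.asarray(m) > 0 for m in person_masks], axis=0)
    return stack.mean(axis=0)


if __name__ == "__main__":
    frame = np.full((2, 3, 3), 255, dtype=np.uint8)  # toy white frame
    mask = np.array([[0, 1, 0], [0, 0, 1]])          # toy person mask
    print(make_rgbp(frame, mask).shape)
    print(hot_spot_map([mask, np.zeros((2, 3))]))
```

A four-channel array of this kind could be fed to a CNN regressor whose input layer accepts four channels instead of three, and the occupancy map could be rendered as a heat map over the store image.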