Mice Tracking Using The YOLO Algorithm

Richardson Santiago Teles de Menezes; John Victor Alves Luiz; Aron Miranda Henrique-Alves; Rossana Moreno Santa Cruz; Helton Maia

doi:10.5753/semish.2020.11326

Richardson Santiago Teles de Menezes UFRN
John Victor Alves Luiz UFRN
Aron Miranda Henrique-Alves UFRN
Rossana Moreno Santa Cruz IFPB
Helton Maia UFRN

DOI: https://doi.org/10.5753/semish.2020.11326

Abstract

The computational tool developed in this study is based on convolutional neural networks and the You Only Look Once (YOLO) algorithm, and it detects and tracks mice in videos recorded during behavioral neuroscience experiments. We analyzed a dataset of 13,622 images drawn from the behavioral videos of three important studies in this area. The dataset was split into 50% of the images for training, 25% for validation, and 25% for testing. The mean Average Precision (mAP) reached by the developed system was 90.79% for the Full version of YOLO and 90.75% for the Tiny version. Given this high accuracy, the tool allows experimentalists to track mice reliably and non-invasively.

Keywords: convolutional neural networks, YOLO algorithm, neuroscience experiments
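
The 50% / 25% / 25% partition mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `split_dataset`, the file-name pattern, the fixed seed, and the choice of a random per-image shuffle are all assumptions made for illustration, since the abstract does not say how images were assigned to each subset.

```python
import random


def split_dataset(items, train_frac=0.50, val_frac=0.25, seed=0):
    """Shuffle items and split them into train / validation / test lists.

    Hypothetical helper: the fractions follow the abstract, but the
    random per-image assignment is an assumption, not the paper's method.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(items)

    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)

    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Placeholder file names standing in for the 13,622 annotated images.
image_ids = [f"frame_{i:05d}.jpg" for i in range(13622)]
train, val, test = split_dataset(image_ids)
print(len(train), len(val), len(test))  # 6811 3405 3406
```

Because 13,622 is not divisible by four, the sketch assigns the leftover image to the test subset. Consecutive frames from the same video can be nearly identical, so a per-video split may be preferable when the goal is to estimate performance on unseen recordings.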
