A methodology for detection and localization of fruits in apple orchards from aerial images
Abstract
Computer vision methods based on convolutional neural networks (CNNs) have shown promising results for image-based fruit detection at ground level in different crops. However, the integration of detections found in different images, which is needed for accurate fruit counting and yield prediction, has received less attention. This work presents a methodology for automated fruit counting from aerial images. It includes algorithms based on multiple view geometry to perform fruit tracking, which not only avoids double counting but also locates the fruits in 3-D space. Preliminary assessments show correlations above 0.8 between fruit counts and true yield for apples. The annotated dataset used for CNN training is publicly available.
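To illustrate how multiple view geometry can place a detected fruit in 3-D space, the sketch below applies standard linear (DLT) triangulation to one fruit observed in two images. It is a minimal illustration under stated assumptions, not the authors' implementation: the intrinsics `K`, the camera matrices `P1` and `P2`, the 0.5 m baseline, and the fruit position are all hypothetical values chosen for the example, and the pixel observations are synthesized by projecting that known point.

```python
import numpy as np


def triangulate_dlt(projections, pixels):
    """Linear (DLT) triangulation of one 3-D point from two or more views.

    projections: list of 3x4 camera projection matrices.
    pixels: list of (u, v) observations of the same point, one per view.
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3-D point into pixel coordinates with camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Hypothetical pinhole intrinsics (focal length and principal point in pixels).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])

# Hypothetical poses: the second camera is shifted 0.5 m along the x axis.
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

# Hypothetical fruit position (metres) and its synthetic detections in both views.
fruit = np.array([0.2, -0.1, 4.0])
observations = [project(P1, fruit), project(P2, fruit)]

estimate = triangulate_dlt([P1, P2], observations)
print(estimate)
```

With noise-free observations the estimate matches the original point up to floating-point error. In practice, detections of the same fruit would first have to be associated across images, and camera poses would have to be estimated, before a step like this could be applied.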