Handling Pedestrians in Self-Driving Cars using Image Tracking and Alternative Path Generation with Frenet Frames

Renan Sarcinelli; Vinicius B. Cardoso; Pedro Azevedo; Claudine Badue; Thiago M. Paixão; Rodrigo F. Berriel; Thiago Oliveira-Santos; Rânik Guidolini; Alberto F. de Souza

doi:10.5753/sibgrapi.2019.9821

Renan Sarcinelli, Federal University of Espirito Santo
Vinicius B. Cardoso, Federal University of Espirito Santo
Pedro Azevedo, Federal University of Espirito Santo
Claudine Badue, Federal University of Espirito Santo
Thiago M. Paixão, Federal University of Espirito Santo
Rodrigo F. Berriel, Federal University of Espirito Santo
Thiago Oliveira-Santos, Federal University of Espirito Santo
Rânik Guidolini, Federal University of Espirito Santo
Alberto F. de Souza, Federal University of Espirito Santo


Abstract

The development of intelligent autonomous cars is of great interest. A particularly challenging problem is handling pedestrians, for example, those crossing or walking along the road. Since pedestrians are among the most vulnerable elements in traffic, a reliable pedestrian detection and handling system is mandatory. The current pedestrian handling system of our autonomous cars suffers from the limitation of purely detection-based systems: it restricts the autonomous car to making decisions based only on the present moment. This work improves the pedestrian handling system by incorporating an object tracker that aims to predict the pedestrian's behavior. With this knowledge, the autonomous car can better decide when to stop and when to start moving, providing a more comfortable, efficient, and safer driving experience. The proposed method was augmented with a path generator based on Frenet frames and incorporated into our self-driving car to enable better decision making and the overtaking of pedestrians. The behavior of our self-driving car was evaluated in both simulated and real-world scenarios. Results showed that the proposed system is safer and more efficient than the system without tracking functionality, owing to its capability for early decision making.

Keywords: Pedestrian tracking, Crosswalk, Convolutional neural networks, Deep learning, Self-driving car
