Batch Reinforcement Learning of Feasible Trajectories in a Ship Maneuvering Simulator

José Amendola; Eduardo A. Tannuri; Fabio G. Cozman; Anna H. Reali

doi:10.5753/eniac.2018.4422

José Amendola USP
Eduardo A. Tannuri USP
Fabio G. Cozman USP
Anna H. Reali USP

Abstract

Ship control in port channels is a challenging problem that has so far resisted automated solutions. This paper applies reinforcement learning to generate the control signals that steer a ship through its maneuvers. Learning uses fitted Q iteration, a batch method, on data from a Ship Maneuvering Simulator. Domain knowledge is used to design a compact state-space model, and we show that this model, combined with the learning process, yields feasible maneuvers under difficult conditions.
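To make the fitted Q iteration idea concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. Everything in it is an assumption made for illustration: the 11-cell "channel offset" state, the three rudder-like actions, the reward, and the lookup-table averager that stands in for whatever regressor (for example a neural network or tree ensemble) would be fitted in practice. It only shows the batch loop: collect transitions once, then repeatedly regress Q onto bootstrapped targets.

```python
import random
from collections import defaultdict

# Hypothetical toy setup (not from the paper): the state is a discretized
# cross-track offset in a channel and the action nudges the ship sideways.
ACTIONS = (-1, 0, 1)      # move toward port, hold, move toward starboard
N_STATES = 11             # offsets 0..10
GOAL = 5                  # index of the channel centerline
GAMMA = 0.9               # discount factor


def step(state, action):
    """Toy transition: shift the offset by the action, clipped to the channel."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward


def collect_batch(n_samples, seed=0):
    """Gather a fixed batch of (s, a, r, s') tuples from random exploration."""
    rng = random.Random(seed)
    batch = []
    for _ in range(n_samples):
        s = rng.randrange(N_STATES)
        a = rng.choice(ACTIONS)
        s2, r = step(s, a)
        batch.append((s, a, r, s2))
    return batch


def fit_averager(inputs, targets):
    """Regressor stand-in: predict the mean target seen for each (s, a) pair."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y in zip(inputs, targets):
        sums[x] += y
        counts[x] += 1
    table = {x: sums[x] / counts[x] for x in sums}
    return lambda s, a: table.get((s, a), 0.0)


def fitted_q_iteration(batch, n_iterations=50):
    """Repeatedly regress Q onto r + gamma * max_b Q(s', b) over the batch."""
    q = lambda s, a: 0.0  # Q_0 is identically zero
    inputs = [(s, a) for s, a, _, _ in batch]
    for _ in range(n_iterations):
        targets = [r + GAMMA * max(q(s2, b) for b in ACTIONS)
                   for _, _, r, s2 in batch]
        q = fit_averager(inputs, targets)
    return q


def greedy_action(q, state):
    """Pick the action with the highest estimated Q-value in this state."""
    return max(ACTIONS, key=lambda a: q(state, a))


if __name__ == "__main__":
    q = fitted_q_iteration(collect_batch(2000))
    print([greedy_action(q, s) for s in range(N_STATES)])
```

No new transitions are gathered inside the loop, which is what makes the method "batch": the simulator (here the toy `step` function) is only queried while the data set is being collected.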
