On the performance of different Deep Reinforcement Learning based controllers for the path-following of a ship

Date Issued
15-10-2023
Author(s)
Sivaraj, Sivaraman
Dubey, Awanish
Rajendran, Suresh
Indian Institute of Technology, Madras
DOI
10.1016/j.oceaneng.2023.115607
Abstract
A set of continuous state-action-space deep reinforcement learning algorithms is used for the path following of a ship in calm water and in waves. The ship dynamics are represented by a mathematical model of the KVLCC2 tanker, which includes the hull force, rudder force, propulsion force, and external wave forces. A look-ahead distance-based guidance algorithm, Line of Sight (LOS), is used to compute the Cross Track Error (CTE) and Heading Error (HE), and the reward function is designed from HE and CTE. The environment is trained with four different Deep Reinforcement Learning (DRL) agents: Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin-Delayed Deep Deterministic Policy Gradient (TD3), and Soft Actor-Critic (SAC). A common neural network architecture is used for all four agents: yaw rate, HE, and CTE serve as inputs, and the rudder deflection rate (δ°) forms the action space (output). Computation time, average cross-track error, and rudder actuation are computed and compared for the path-following scenarios. DDPG achieves the lowest average CTE in all simulated cases, whereas SAC requires the least rudder control effort to accomplish the tasks. Finally, the trained agents are validated through Hardware In-Loop (HIL) simulation.
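
For illustration only, the sketch below shows how a look-ahead-distance LOS guidance law can compute the CTE and HE described in the abstract, together with a hypothetical CTE/HE-based reward. The function names, look-ahead distance, and reward weights are assumptions for this sketch, not the authors' implementation.

# Illustrative sketch only (not the paper's code): a minimal Line-of-Sight (LOS)
# guidance computation of Cross Track Error (CTE) and Heading Error (HE) for a
# straight path segment, plus a hypothetical reward that penalises both errors.
# The look-ahead distance and reward weights are assumed values for illustration.
import math

def los_guidance(x, y, psi, wp_prev, wp_next, lookahead=100.0):
    """Return (CTE, HE) for a ship at (x, y) with heading psi [rad],
    tracking the segment from waypoint wp_prev to wp_next."""
    x0, y0 = wp_prev
    x1, y1 = wp_next
    path_angle = math.atan2(y1 - y0, x1 - x0)
    # Signed lateral offset from the path segment (cross-track error).
    cte = -(x - x0) * math.sin(path_angle) + (y - y0) * math.cos(path_angle)
    # LOS law: steer toward a point one look-ahead distance along the path.
    psi_des = path_angle + math.atan2(-cte, lookahead)
    # Heading error wrapped to [-pi, pi].
    he = math.atan2(math.sin(psi_des - psi), math.cos(psi_des - psi))
    return cte, he

def reward(cte, he, w_cte=1.0, w_he=1.0):
    # Hypothetical shaping: the closer both errors are to zero, the higher the reward.
    return -(w_cte * abs(cte) + w_he * abs(he))

if __name__ == "__main__":
    cte, he = los_guidance(x=50.0, y=20.0, psi=0.1,
                           wp_prev=(0.0, 0.0), wp_next=(500.0, 0.0))
    print(f"CTE = {cte:.1f} m, HE = {math.degrees(he):.1f} deg, "
          f"reward = {reward(cte, he):.3f}")
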
Volume
286
Subjects
  • Deep Reinforcement Learning
  • Hardware In-Loop
  • Heading control
  • KVLCC2
  • Path following
  • Waves