CORDIS - EU research results

Artificial Intelligence methods for Underwater target Tracking

Periodic Reporting for period 2 - AIforUTracking (Artificial Intelligence methods for Underwater target Tracking)

Reporting period: 2022-10-01 to 2023-09-30

The Artificial Intelligence methods for Underwater target Tracking (AIforUTracking) project has brought to the scientific community new tools for underwater target tracking by Autonomous Underwater Vehicles (AUVs) using Reinforcement Learning (RL) techniques. This project was at the forefront of research and directly addressed some of the main challenges and needs of the latest Marine Strategy Framework Directive of the European Parliament, in particular establishing a framework for community action in the field of marine environmental policy. Specifically, it contributed to: (a) designing and developing optimization algorithms that leverage new RL approaches, such as Partially Observable Markov Decision Process (POMDP) and Multi-Agent Reinforcement Learning (MARL) methods. These Artificial Intelligence (AI) tools have increased the autonomy of the AUVs while improving the accuracy of the estimated target position; and (b) demonstrating the effectiveness and application of the path optimization technique using POMDP and MARL methods by conducting real tests in the ocean, i.e. different targets have been tracked using a single AUV or multiple AUVs as a proof of concept. These innovative technologies, together with Range-Only and Single-Beacon (ROSB) methods, are more competitive and offer greater autonomy than the traditional Long BaseLine (LBL) array-based methods.
In conclusion, AIforUTracking has demonstrated for the first time that RL and MARL algorithms can be used to improve the localization and tracking of underwater targets using autonomous vehicles, and the results obtained have been published in top-ranked journals. The studies conducted within this project have demonstrated the capabilities of vehicle path-planning systems based on machine-learning techniques. This opens a new line of research to improve the study of the ocean and its inhabitants.
During the outgoing phase (periodic report: FIRST), a deep RL environment was designed to train an agent to localise an underwater target using range-only methods. This environment has been published in a GitHub repository to make it publicly available: https://github.com/imasmitja/RLforUTracking. Additionally, different deep RL algorithms have been implemented to test their performance in the designed environment. Specifically, the following algorithms have been implemented:
•DDPG: Deep Deterministic Policy Gradient
•TD3: Twin Delayed DDPG
•SAC: Soft Actor-Critic
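
To illustrate the range-only idea these agents build on, the following is a minimal sketch, not the project's implementation: it estimates a static 2D target position from ranges measured at known vehicle positions using a linearised least-squares fit. The function name `localise_range_only`, the noise-free ranges, and the circular vehicle path are all assumptions made for this example; the published environment may use a different estimator.

```python
import math


def localise_range_only(positions, ranges):
    """Estimate a static 2D target from ranges taken at known vehicle positions.

    Hypothetical helper for illustration: subtracting the first range equation
    from the others removes the quadratic term, leaving a linear system that is
    solved here through its 2x2 normal equations.
    """
    if len(positions) != len(ranges) or len(positions) < 3:
        raise ValueError("need at least three position/range pairs")
    (x0, y0), r0 = positions[0], ranges[0]
    s_xx = s_xy = s_yy = s_xb = s_yb = 0.0
    for (x, y), r in zip(positions[1:], ranges[1:]):
        # Row of the linear system: 2 (p_i - p_0) . q = |p_i|^2 - |p_0|^2 - r_i^2 + r_0^2
        ax, ay = 2.0 * (x - x0), 2.0 * (y - y0)
        b = (x * x + y * y) - (x0 * x0 + y0 * y0) - r * r + r0 * r0
        s_xx += ax * ax
        s_xy += ax * ay
        s_yy += ay * ay
        s_xb += ax * b
        s_yb += ay * b
    det = s_xx * s_yy - s_xy * s_xy
    if abs(det) < 1e-9:
        # All measurement points lie on one line, so the fit is ill-conditioned
        raise ValueError("vehicle positions are collinear; target is not observable")
    tx = (s_yy * s_xb - s_xy * s_yb) / det
    ty = (s_xx * s_yb - s_xy * s_xb) / det
    return tx, ty


if __name__ == "__main__":
    # Assumed scenario: the vehicle circles the origin and measures noise-free ranges
    true_target = (30.0, -20.0)
    path = [(100.0 * math.cos(k * math.pi / 4), 100.0 * math.sin(k * math.pi / 4))
            for k in range(8)]
    measured = [math.dist(p, true_target) for p in path]
    print(localise_range_only(path, measured))
```

The collinearity check hints at why the vehicle's path matters: a straight-line trajectory leaves the target position ambiguous, which is the kind of path-planning decision the RL agent is trained to make.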

Additionally, several field tests were conducted during the reporting period:
•Tracking a static and a moving target at Monterey Bay (California, USA): In this test, we used a Wave Glider to localise an underwater docking station and an LRAUV.
•Tracking a static target in the harbor of Sant Feliu (Catalonia, Spain): In this test, a Sparus II was able to localise a standalone acoustic modem deployed in the middle of the harbor. This test was of special importance because it demonstrated that the algorithms (and strategy) are platform-independent (i.e. they can be deployed on a variety of vehicles, from a glider to a conventional AUV).

The results have been presented at numerous meetings and conferences through both oral and poster presentations. Examples include the IEEE 18th International Conference on Automation Science and Engineering (CASE2022), held in Mexico City (Mexico) and virtually from August 20th to August 24th, 2022; the Deep Learning Barcelona Symposium (DLBCN2021), in December 2022; and the Ocean Sciences Meeting (OSM2020), held virtually in February 2022, to present the AIforUTracking objectives and first achievements. I also attended the Machine Learning on Monterey Bay (MLonMB2021) workshop, held at the University of California Santa Cruz (California, USA) on November 10th, 2021.

During the incoming phase, both the environments and the algorithms were modified to support multi-agent reinforcement learning (MARL). With these modifications, different MARL algorithms were implemented, trained, and tested to cooperatively localize and track an underwater target using acoustic localization techniques. In addition to standard state-of-the-art algorithms, a novel MARL algorithm called TransfQMix was developed during this phase. This new architecture uses transformers to update the QMix algorithm. It outperforms state-of-the-art algorithms in different scenarios, such as Spread in the Particle environment (from OpenAI) and StarCraft II. Both environments are well-known benchmarks within the community and therefore highlight the performance obtained with the TransfQMix algorithm.

Additionally, the RL algorithms and the field tests conducted at Monterey Bay were post-processed during the incoming phase, and a manuscript was written and published in the top-ranked journal Science Robotics. This was a major milestone for the AIforUTracking project, as it shared the main outputs and results with the community. The project was also presented at the Science Is Wonderful! event, a science fair organized by the European Commission in Brussels in 2023, reaching almost 4000 children over the course of two days.
Different deep RL and MARL algorithms have been implemented to solve the AUV's path planning for range-only target tracking. These algorithms (DDPG, TD3, SAC, and TransfQMix) were initially tested in the single-agent scenario but are also capable of solving multi-agent scenarios. They have been implemented, trained, and tested for the first time in a real marine application to solve the path optimization problem in range-only target tracking. With the proposed methodology, state-of-the-art performance has been achieved in localising and tracking underwater targets with autonomous vehicles. Additionally, the learned policy outperformed traditional methods at the beginning of the tracking by finding better trajectories, which yielded a better target estimate. The quality of the results is supported by several presentations at international conferences and by a journal paper published in a top-ranking journal.

Therefore, the work carried out is at the forefront of innovation, as it uses cutting-edge machine-learning techniques to advance oceanographic research with autonomous vehicles. The project aims to make these vehicles more autonomous in monitoring the ocean. In doing so, it can also help address issues related to climate change and support better policies to restore our oceans, which will benefit the whole of society.

The results of this project may have an impact on how we monitor different marine species. If we improve our knowledge of their movements and behaviours, we could apply better conservation strategies, which could in turn influence future policy making. Potential users include, on the one hand, machine learning researchers who want to improve and test their algorithms on real marine challenges and, on the other hand, scientists and specialists in movement ecology and restoration who could apply the algorithms developed here to improve the tracking capabilities of current methods. The results have been presented at international conferences to share the research with the community.
Wave Glider utilized to test the developed reinforcement learning algorithms to track targets