Grants and Contributions:


Title:

Multi-Agent Reinforcement Learning for Autonomous Vehicles

Agreement Number:

RGPIN

Agreement Value:

$125,000.00

Agreement Date:

May 10, 2017 -

Organization:

Natural Sciences and Engineering Research Council of Canada

Location:

Ontario, Other, CA

Reference Number:

GC-2017-Q1-03223

Agreement Type:

Grant

Report Type:

Grants and Contributions

Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:

Schwartz, Howard (Carleton University)

Program:

Discovery Grants Program - Individual

Program Purpose:

The long-term objective of this research is to create a system of machines and devices that can effectively learn how to work together in a changing environment. We propose such systems for multi-robot applications. The idea is to have many unmanned vehicles and sensors working together and learning to adapt to their environment. Applications include the security of industrial facilities and border regions, and teams of autonomous vehicles that can secure territory without endangering human life. In these cases, combinations of vision systems and various types of unmanned vehicles will learn how to work together to secure the region and address any dangers. Combining artificial intelligence with the actions of multiple vehicles and devices will be a major step forward in the development of artificial intelligence. This work is especially important for the security and defence industries, the robotics industry, and those developing unmanned aerial vehicles (drones) and self-driving cars. The industrial impact of this work will be far-reaching.
This research will focus on the adaptation and learning aspects of multi-robot systems. We will investigate the pursuer-evader game and the guarding-a-territory game. The unique aspect of this research is the development of learning algorithms that give the robots the ability to learn how to play these games, so that teams of robots can learn how to play together and how to compete. In the first case, a number of robots will be defined as guards commanded to guard against invasion by another set of robots. The goal of the guarding robots is to intercept the invading robots as far as possible from the “target region”, while the invading robots try to get as close as possible to the “target region”. In the second case, the robots will learn how to play the pursuer-evader game. The objective of the learning algorithms is to find the “optimal” strategy for all the players. The robots will also learn to take into account their own capabilities and those of the other robots.
We have made significant progress in developing learning algorithms for the case of one evader and one pursuer and the case of one guard and one invader, each having constant speed. Experimental work shows how our algorithms adapt to real-time situations and take advantage of another robot’s poor performance. We then progressed to the case of multiple pursuers chasing a higher-speed evader and multiple guards defending against a higher-speed invader.
We have developed an experimental facility that uses up to three mobile robots working together. Furthermore, we are collaborating with researchers at the Royal Military College in Kingston, and we have access to their experimental mobile robots as well.