Grants and Contributions:

Title:
Learning good representations for and with reinforcement learning
Agreement Number:
RGPIN
Agreement Value:
$355,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Québec, Other, CA
Reference Number:
GC-2017-Q1-03493
Agreement Type:
grant
Report Type:
Grants and Contributions
Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient Legal Name:
Precup, Doina (McGill University)
Program:
Discovery Grants Program - Individual
Program Purpose:

Artificial intelligence (AI) has made great progress in isolating different aspects of intelligence and proposing flexible representations and powerful algorithms that lead to competence in specific tasks. For example, AI agents are better than humans at playing games like Go, a feat once considered impossible. However, the sort of flexible, robust, and autonomous competence routinely exhibited by humans, or even animals, remains elusive. The best AI systems are still tuned to specific problems. Our main research goal is to develop general AI methodology that relies, at its core, on reinforcement learning. Reinforcement learning is an approach to learning from interaction with an environment, inspired by animal learning theory. This proposal aims to design algorithms that automatically create representations for reinforcement learning agents, allowing them to model the world and to act at multiple time scales. We aim to provide new optimization criteria that formally describe what constitutes a good set of abstract representations, provide gradient-based learning algorithms to learn such models, and demonstrate their effectiveness through empirical evaluations in simulated domains, game playing, and real time-series prediction datasets. We will tackle the crucial problem of exploration by characterizing how an agent should move about its environment in order to optimize its learning speed. Finally, we will leverage these methods inside other algorithms that can benefit from multiple time scales, such as the training of deep, recurrent neural networks.
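The reinforcement-learning framing described above (an agent learning from interaction with an environment) can be illustrated with a minimal tabular Q-learning sketch. The toy chain environment, state and action spaces, and hyperparameters below are illustrative assumptions for exposition only, not part of the funded project's methodology.

```python
import random

# Minimal tabular Q-learning sketch (illustrative assumptions throughout).
# Environment: a 1-D chain of N states; the agent steps left or right and
# receives reward 1 only upon reaching the rightmost (goal) state.

N_STATES = 5                        # assumed chain length
ACTIONS = [-1, +1]                  # step left / step right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # assumed learning rate, discount, exploration

def step(state, action):
    """Apply an action; the episode ends with reward 1 at the goal state."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = (nxt == N_STATES - 1)
    return nxt, (1.0 if done else 0.0), done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection: explore with probability EPS.
            if rng.random() < EPS:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda a_: q[(s, a_)])
            s2, r, done = step(s, a)
            # One-step temporal-difference (Q-learning) update.
            best_next = 0.0 if done else max(q[(s2, a_)] for a_ in ACTIONS)
            q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])
            s = s2
    return q

q = train()
# The greedy policy derived from the learned values moves right toward the goal.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

The discount factor GAMMA is what ties this sketch back to the proposal's theme of time scales: it controls how far into the future the agent's value estimates look, and learning models at multiple time scales generalizes this single fixed horizon.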