Subventions et des contributions :

Retour à la page de recherche

Titre :

Novel Algorithms to Approximate the Future Consequence of Sequential Decisions

Numéro de l’entente :

RGPIN

Valeur d'entente :

100 000,00 $

Date d'entente :

10 mai 2017 -

Organisation :

Conseil de recherches en sciences naturelles et en génie du Canada

Location :

Alberta, Autre, CA

Numéro de référence :

GC-2017-Q1-02287

Type d'entente :

subvention

Type de rapport :

Subventions et des contributions

Renseignements supplémentaires :

Subvention ou bourse octroyée s'appliquant à plus d'un exercice financier. (2017-2018 à 2022-2023)

Nom légal du bénéficiaire :

Sabouri Bagh Abbas, Alireza (University of Calgary)

Programme :

Programme de subventions à la découverte - individuelles

But du programme :

Many complex problems arising in business, health care, and transportation can be modelled as sequential decision making problems under uncertainty, meaning that a decision maker has to make decisions periodically while some random events unfold over time. For instance, an airline dynamically changes the fare for different flights over a network of cities without knowing the actual future demand, trying to maximize its revenue while managing the risk of unsold seats. These problems can be conveniently modelled in the form of dynamic programs, a method that finds the best decision by maximizing the sum of immediate reward and the expected future reward. Unfortunately, for many practical problems, the number of future scenarios that one should consider in order to calculate the expected future reward function is exponentially large, making exact calculation of this function intractable. In order to overcome this issue, approximate dynamic programming (ADP) methods have been developed to find an approximate optimal solution.

A cornerstone of many ADP algorithms is defining a set of basis functions (or an approximation architecture) for approximating the future consequence of present decisions (the expected future reward function). Currently, the choice of basis functions requires prior expert knowledge about the problem, and is usually considered as more of an art than a science. My research program aims to develop, study, and apply novel algorithms that automate generation of basis functions by efficiently selecting a subset of functions from a large pool of potential basis functions, and updating this set as more information becomes known about the problem. The benefit of such algorithms is twofold: first, it reduces the burden to come up with a well-informed set of basis functions that requires significant prior knowledge about the problem; second, since many potential candidates are considered for basis functions, it is expected that the quality of the approximation is improved. My short-term objective includes evaluating the performance of the proposed algorithms in a variety of application areas, such as perishable inventory management, patient scheduling, and revenue management.

ADP is a general method that is commonly used for solving many different problems in a variety of applications. As the quality of the policies generated by these algorithms is dependent on the quality of the basis functions chosen, it would be of great interest, both theoretically and practically, if the process of generating and selecting basis functions can be automated. Therefore, even a small improvement achieved by the findings of my research would have significant practical implications in multiple application areas.