Grants and contributions:
Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)
The objective of this program is to allow the automatic construction of complex, interpretable models directly from data, automating analysis previously done by experts.
I will achieve this by further developing tools for automatically building structured deep generative models.
Such models assert that the data has a hidden underlying structure, observed indirectly through complex, noisy processes.
This will require novel inference, training, and evaluation procedures, but will enable advances in a wide variety of machine learning applications, including automated chemical design and model-based deep reinforcement learning.
Specifically, we propose to develop a new class of models with arbitrarily complex latent-variable structure and deep nonlinear observation models. This research will extend recent lines of inquiry that combine the best aspects of probabilistic graphical models (PGMs) and neural networks. These new models combine the interpretability and tractability of structured graphical models (such as clustering, time-series, or topic models) with the flexibility of neural-network likelihood functions. This combination sidesteps the rigid modeling assumptions made by PGMs, such as Gaussianity. It also sidesteps the limitations of neural networks, namely their relative lack of interpretability and their need for very large datasets.
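To make the proposed combination concrete, the sketch below (my own illustration, not the proposal's implementation) pairs a structured latent variable, a discrete cluster assignment as in a Gaussian mixture, with a nonlinear observation model standing in for a neural-network likelihood. All names, sizes, and the fixed random two-layer network are assumptions made for illustration.

```python
# Illustrative sketch: structured latent (mixture) + nonlinear observation model.
import numpy as np

rng = np.random.default_rng(0)
K, D_latent, D_obs, N = 3, 2, 5, 100

# The PGM part: mixture weights and per-cluster Gaussian latent means.
weights = np.array([0.5, 0.3, 0.2])
means = rng.normal(size=(K, D_latent))

# The deep-learning part: a fixed random two-layer network plays the role
# of the flexible neural observation model (hypothetical architecture).
W1 = rng.normal(size=(D_latent, 16))
W2 = rng.normal(size=(16, D_obs))

def decode(z):
    """Nonlinear observation model: map latent variables to observation means."""
    return np.tanh(z @ W1) @ W2

# Ancestral sampling: cluster -> latent -> nonlinear likelihood -> noisy data.
clusters = rng.choice(K, size=N, p=weights)
latents = means[clusters] + 0.1 * rng.normal(size=(N, D_latent))
data = decode(latents) + 0.05 * rng.normal(size=(N, D_obs))

print(data.shape)  # (100, 5)
```

A purely Gaussian PGM would force `decode` to be linear; swapping in the nonlinearity is exactly the flexibility the proposal describes, while the discrete cluster variable keeps the latent structure interpretable.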
Such model classes have been proposed before, but their usefulness was limited by slow inference methods. We outline a concrete plan for exploiting the structure of the models themselves to develop efficient inference techniques. These methods extend recently developed variational autoencoders (VAEs) to produce structured variational autoencoders (SVAEs).
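The VAE machinery that SVAEs build on can be sketched in a few lines: a recognition network maps each data point to the parameters of an approximate posterior, so inference becomes a single forward pass rather than a per-datapoint optimization. The toy network weights, sizes, and diagonal-Gaussian posterior below are illustrative assumptions, not the proposal's design.

```python
# Minimal sketch of amortized (VAE-style) inference with a recognition network.
import numpy as np

rng = np.random.default_rng(1)
D_obs, D_latent = 5, 2

# Toy linear recognition network: data -> (mean, log-variance) of q(z | x).
W_mu = rng.normal(size=(D_obs, D_latent)) * 0.1
W_logvar = rng.normal(size=(D_obs, D_latent)) * 0.1

def recognize(x):
    """Amortized inference: one forward pass gives posterior parameters."""
    return x @ W_mu, x @ W_logvar

def sample_posterior(x):
    """Reparameterization trick: z = mu + sigma * eps keeps sampling differentiable."""
    mu, logvar = recognize(x)
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def kl_to_standard_normal(mu, logvar):
    """Closed-form KL( q(z|x) || N(0, I) ) for diagonal Gaussians; a term of the ELBO."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

x = rng.normal(size=(4, D_obs))
z = sample_posterior(x)
mu, logvar = recognize(x)
print(z.shape, kl_to_standard_normal(mu, logvar).shape)  # (4, 2) (4,)
```

The structured extension replaces the standard-normal prior with a structured PGM prior (e.g., the mixture above), with the recognition network supplying per-datapoint evidence that the graphical-model machinery then combines.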
These inference methods have the potential to make automatic model-building frameworks practical. In particular, we plan to extend existing matrix decomposition grammars in two ways: first, to handle nonlinear observation models connecting different types of latent structure to each other and to the data; second, to scale to real-world datasets. This modeling framework already contains many standard machine learning and machine vision models as special cases. Equipping it with flexible neural-network observation models would mean that it also contains most deep learning models as special cases, along with a host of as-yet-undeveloped model classes. Using fast recognition networks for inference would make inference in this large model class scalable.
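The idea of a grammar over matrix decompositions can be illustrated with a toy enumerator (a deliberate simplification; the production rules and symbol names below are hypothetical, not the actual grammar the proposal extends). Each production rewrites a generic matrix `G` into a structured factorization, and repeated expansion automatically generates a space of compound models such as "clustered low-rank" or "sparse plus low-rank".

```python
# Toy grammar over matrix decompositions: expanding 'G' enumerates model structures.

# Hypothetical productions: G = generic matrix, M = cluster-assignment matrix,
# L = low-rank factor, S = sparse matrix.
RULES = {"G": ["M G", "L L", "S", "G + G"]}

def expand(expr, depth):
    """Enumerate all expressions reachable by at most `depth` rule applications."""
    if depth == 0:
        return {expr}
    results = {expr}
    tokens = expr.split()
    for i, tok in enumerate(tokens):
        if tok in RULES:
            for rhs in RULES[tok]:
                new = " ".join(tokens[:i] + [f"( {rhs} )"] + tokens[i + 1:])
                results |= expand(new, depth - 1)
    return results

models = expand("G", 2)
print(len(models))  # number of distinct model structures within two expansions
```

Attaching a nonlinear observation model to the leaves of such expressions, and running recognition-network inference over the resulting latent structure, is the combination the two proposed extensions aim at.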