Grants and contributions:
Grant or award applying to more than one fiscal year (2017-2018 to 2022-2023).
Modern machine learning methods, such as boosting, support vector machines, and neural networks, have had a great impact on statistical research and application, mostly through improved predictive and prognostic accuracy. Their enhanced ability to model complex interactions and non-linear effects could also be used to explain the underlying physical or physiological phenomena and to generate specific scientific hypotheses for further study. In applications that are not strictly predictive, however, the use of many modern methods is hampered by their black-box nature and by the lack of inferential tools that would provide statistical confidence measures for inferred relationships. The simplest form of statistical inference, universal in classical models, concerns statements about individual covariates. For example, is the covariate "Gender" an important factor in a model of disease progression? In classical models this question is answered by calculating inferential quantities (p-values, confidence intervals) for a parameter, or a small set of parameters, connected with "Gender" in the model. In contrast, machine learning methods take a non-parametric approach in which a covariate's influence on the outcome is not controlled by a small set of parameters. Hence the classical approach is not applicable, and the importance of any particular covariate in the model of the outcome is not easily tested. While many model-specific or approximate measures have been proposed, in particular the Variable Importance Metric in a Random Forest model, there is no universal, statistically coherent approach in the literature. We propose to develop, validate, apply and disseminate, in the form of freely available software packages, a set of tools for classical inference that will allow researchers to test the importance and influence of covariates of interest in non-parametric machine learning models of the outcome.
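To make the question of covariate importance in a non-parametric model concrete, the following is a minimal sketch of one generic heuristic: a permutation test on held-out prediction error. It is not the methodology proposed in this project, and it is not claimed to be statistically coherent in the sense discussed above. Everything in it is an assumption made for illustration: the simulated data, the k-nearest-neighbour regressor standing in for a black-box learner, and the hypothetical helper names `knn_predict`, `mse`, `permutation_p_value` and `simulate`. The outcome depends on the first covariate only through a symmetric, non-linear effect, which a single linear coefficient would largely miss.

```python
import random


def knn_predict(train_X, train_y, x, k=5):
    """Predict the outcome at x as the mean outcome of the k nearest training points."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), y)
        for row, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def mse(train_X, train_y, test_X, test_y):
    """Mean squared prediction error of the k-NN model on a held-out set."""
    errs = [(knn_predict(train_X, train_y, x) - y) ** 2
            for x, y in zip(test_X, test_y)]
    return sum(errs) / len(errs)


def permutation_p_value(train_X, train_y, test_X, test_y, j, n_perm=99, rng=None):
    """Permutation p-value for covariate j (illustrative heuristic only).

    Shuffles column j of the held-out set and counts how often the
    error is no worse than the error on the unshuffled data.
    """
    rng = rng or random.Random(0)
    observed = mse(train_X, train_y, test_X, test_y)
    column = [row[j] for row in test_X]
    count = 0
    for _ in range(n_perm):
        shuffled = column[:]
        rng.shuffle(shuffled)
        permuted_X = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(test_X, shuffled)]
        if mse(train_X, train_y, permuted_X, test_y) <= observed:
            count += 1
    # The +1 terms keep the p-value strictly positive.
    return (1 + count) / (1 + n_perm)


def simulate(n, rng):
    """Assumed toy data: y depends on covariate 0 only, through a quadratic effect."""
    X = [[rng.uniform(-2, 2), rng.uniform(-2, 2)] for _ in range(n)]
    y = [row[0] ** 2 + rng.gauss(0, 0.1) for row in X]
    return X, y


rng = random.Random(42)
train_X, train_y = simulate(150, rng)
test_X, test_y = simulate(60, rng)

p_relevant = permutation_p_value(train_X, train_y, test_X, test_y, j=0, rng=rng)
p_noise = permutation_p_value(train_X, train_y, test_X, test_y, j=1, rng=rng)
print("p-value, covariate 0 (drives the outcome):", p_relevant)
print("p-value, covariate 1 (pure noise):", p_noise)
```

In this sketch the p-value for covariate 0 should be small, since breaking its link to the outcome sharply increases the held-out error. The p-value for covariate 1 carries no such guarantee in either direction. A heuristic of this kind conditions on one fitted model and one data split and ignores dependence between covariates, which illustrates why a general, coherent inferential framework is still needed.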