Grants and Contributions:

Title:

Algorithms and Tools for Big Data Analysis and Automated Real Time Optimal or Near Optimal Decision Making for Industrial Systems

Agreement Number:

RGPIN

Agreement Value:

$140,000.00

Agreement Date:

May 10, 2017 -

Organization:

Natural Sciences and Engineering Research Council of Canada

Location:

Quebec, Other, CA

Reference Number:

GC-2017-Q1-02870

Agreement Type:

Grant

Report Type:

Grants and Contributions

Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient Legal Name:

Yacout, Soumaya (École Polytechnique de Montréal)

Program:

Discovery Grants Program - Individual

Program Purpose:

Data science and data engineering have arguably become some of the most important research fields in this century. These fields are based on fundamental branches of science and engineering, namely, information technology, sensor technology, statistics, operations research, optimization, artificial intelligence, data mining and machine learning.
Alongside human-centric applications, researchers now recommend some of these techniques for machine-centric applications, in which data are generated by machines and decisions are also made by machines, based on ‘Machine to Machine (M2M)’ learning.
An important current research question is how to exploit the available Big Data sets. By definition, they consist of large volumes of data, acquired at high velocity and in a variety of forms, for which traditional data-processing and analysis techniques become inadequate.
The objective of this proposal is to develop algorithms and tools designed specifically to analyze and extract knowledge from Big Data obtained from industrial systems. The extracted knowledge should lead to an understanding of how the various components of a complex system influence each other and interact with their environment, and of how an accurate prediction of system degradation can be obtained in a parallel computing framework.
The proposed methodology is based on Logical Analysis of Data (LAD), a data mining and machine learning approach founded on Boolean logical reasoning. It extracts knowledge in the form of patterns that distinguish and characterize sets of data, and that identify phenomena of interest. Different LAD algorithms used to extract patterns in supervised and unsupervised learning will be considered in parallel computing frameworks, namely enumeration techniques, mixed integer linear programming, and metaheuristic algorithms, mainly genetic algorithms and ant colony algorithms. The two parallel frameworks that will be used are Hadoop MapReduce and Spark; both are open source and therefore publicly available.
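To make the notion of a LAD pattern concrete, the following is a minimal sketch of pattern extraction by enumeration on already binarized data. It is an illustration under stated assumptions, not the algorithms proposed in this project: the function names (`covers`, `enumerate_positive_patterns`), the `max_degree` parameter, and the toy data are all hypothetical. A pattern is taken here to be a conjunction of literals that covers at least one positive observation and no negative observation.

```python
from itertools import combinations, product


def covers(pattern, row):
    """Return True when every literal (feature index -> required value) matches the row."""
    return all(row[i] == v for i, v in pattern.items())


def enumerate_positive_patterns(positives, negatives, max_degree=2):
    """Enumerate conjunctions of up to max_degree literals that cover at least
    one positive row and no negative row (hypothetical helper, for illustration).

    Returns a list of (pattern, number_of_positive_rows_covered) pairs.
    """
    n_features = len(positives[0])
    found = []
    for degree in range(1, max_degree + 1):
        for idxs in combinations(range(n_features), degree):
            for vals in product((0, 1), repeat=degree):
                pattern = dict(zip(idxs, vals))
                # Skip conjunctions that merely extend a pattern already found,
                # so only the shortest (prime) patterns are kept.
                if any(p.items() <= pattern.items() for p, _ in found):
                    continue
                # A positive pattern must not cover any negative observation.
                if any(covers(pattern, row) for row in negatives):
                    continue
                coverage = sum(covers(pattern, row) for row in positives)
                if coverage > 0:
                    found.append((pattern, coverage))
    return found


# Toy binarized observations (assumed data, three Boolean features each).
positives = [(1, 1, 0), (1, 0, 0)]
negatives = [(0, 1, 0), (0, 0, 1)]

for pattern, coverage in enumerate_positive_patterns(positives, negatives):
    print(pattern, "covers", coverage, "positive observation(s)")
```

Both the coverage count and the check against negative observations are computed row by row and then combined, so in principle they can be evaluated on separate partitions of the data and merged afterwards. This is one reason such enumeration could suit frameworks like MapReduce or Spark, although the sketch itself runs on a single machine.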
We intend to present scaled-up algorithms to the scientific community in an open source environment, so that any interested individual can use them, improve upon them and add to them. The impact of this research lies in the possibility of discovering and understanding complex physical phenomena that are not yet fully understood, and in exploiting this knowledge in decision making. Depending on the specific applications in which these algorithms are used, this knowledge can lead to increased safety and security, energy savings, protection of the environment, and more efficient consumption of natural resources. It will also lead to intelligent systems that can make the right decision at the right moment. Eventually, this will lead to self-sustaining and sustainable systems.