Grants and contributions:
Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)
One of the biggest challenges we face today is the need to handle large amounts of information. There is an urgent need to analyze data that is massive, highly interconnected and evolving, with the goal of obtaining knowledge that aids us in making important decisions in application domains such as business growth, public administration, health, defense and the environment. Graphs are a natural choice to represent highly interconnected data. This research proposal addresses challenges in three core areas of large graph analytics: data credibility, diversity, and privacy.
The first topic we study is credible graph analytics for social networks. Online social networks are an efficient medium for spreading information to millions of people in a short amount of time. Although the ease of information propagation through the network can be beneficial, the spread of misinformation at such a large scale can cause panic and have a disruptive effect. To ensure the credibility of the information received, it is important to design algorithms that detect and limit the spread of misinformation. Our work in data credibility will design scalable algorithms for combating the spread of misinformation.
The second topic we investigate is diversity-based graph analytics for product recommendation and social influence. Many problems in recommender systems, which assign items to users, can be modeled as graph matching problems. However, these models are too weak to capture diversity in recommendation lists, an important metric aimed at user satisfaction. Our work in data diversity will consider new ways to generalize known graph matching models in order to address diversity needs. We will also investigate how to capture diversity in social networks, with the goal of discovering more influential communities in those networks.
The third topic we explore is privacy-aware graph analytics. While the need for mining data such as e-health records has been widely recognized, it is important to ensure that privacy needs are met before the data is released. Anonymization methods achieve privacy by perturbing the data minimally, whereas differential privacy publishes only a statistical summary, using noise addition to ensure usefulness and privacy simultaneously. Our work in data privacy will tackle important open problems related to anonymization methods and differential privacy using copulas, an effective tool from statistical theory and practice.
The benefit of this innovative research program will be threefold: (1) The results we prove will contribute to the body of top-quality academic knowledge on each of these topics. (2) The program will train highly qualified personnel (HQP) for positions that require highly desired skills. (3) The fast, scalable algorithms we develop in this research program will bridge the gap between theory and practice and will be highly relevant to Canadian businesses and other organizations in the public and private sectors.