Grants and Contributions:

Title:
Reliable and efficient real-time tools for collecting and analyzing large health datasets
Agreement Number:
RGPIN
Agreement Value:
$100,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Ontario, Other, CA
Reference Number:
GC-2017-Q1-02613
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient Legal Name:
Mago, Vijay (Lakehead University)
Program:
Discovery Grants Program - Individual
Program Purpose:

Background
Research suggests that a growing number of people rely on online sources for health information, including symptoms, treatments and general health-related advice. Moreover, the behaviour of the millions of users currently active on social media demonstrates an openness to sharing facts about their current health status. Such data could be used to track and predict the spread of disease and other health concerns in real time, or to provide vital information about the effectiveness of the public health awareness strategies of health agencies such as Health Canada, the Centre for Disease Control or the World Health Organization. However, our current understanding of online health data produced through social networks is limited in three important ways: (a) existing databases are project-specific and data-gathering mechanisms are time-constrained; (b) existing health-tracking tools depend on single interfaces such as Facebook, Twitter or Instagram; and (c) there is a lack of capacity for real-time mapping of health care issues.
Specific Aims of Research Program
Data collection: We will design an infrastructure (software and hardware) that continuously collects data from the social media handles of health agencies and medical associations and stores it on network storage systems.
Social media strategy effectiveness: Data points created by health organizations to inform the public can be correlated with real effects in the general population. The influence of these organizations can be studied by observing how far their media content penetrates and the overall effort of their communication strategy, compared against the unfolding of public health events, without following general users’ social media accounts. This requires a new approach to understanding the effectiveness of such campaigns: measuring the attractiveness of social media content by the volume of data shared among different types of users (medical/health organizations, national laboratories, etc.).
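As a minimal illustration of the volume-based measure described above, the following Python sketch totals share counts per organization and per type of sharing user. The record layout (`organization`, `shares_by_type`) and the sample values are hypothetical and chosen only for illustration; they do not come from the research program.

```python
from collections import defaultdict


def share_volume_by_user_type(posts):
    """Total the share counts for each (organization, sharer type) pair."""
    totals = defaultdict(int)
    for post in posts:
        for sharer_type, count in post["shares_by_type"].items():
            totals[(post["organization"], sharer_type)] += count
    return dict(totals)


# Hypothetical sample records; the field names are assumptions.
sample_posts = [
    {"organization": "agency_a", "shares_by_type": {"health_org": 12, "laboratory": 3}},
    {"organization": "agency_a", "shares_by_type": {"health_org": 5}},
    {"organization": "agency_b", "shares_by_type": {"laboratory": 7}},
]

volumes = share_volume_by_user_type(sample_posts)
for (organization, sharer_type), total in sorted(volumes.items()):
    print(organization, sharer_type, total)
```

Comparing such totals across organizations or over time would be one simple way to rank content by how widely it is shared; the actual measure used in the research program may differ.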
Validating the predictive models: Social media data has been used to predict various healthcare behavioural issues and infectious diseases. The major challenge for these predictive models is defining the ground truth. One option is crowd-sourcing, but this limits the evaluation to one particular problem or model. To overcome this shortcoming, the proposed research program will develop ground-truth community algorithms using multiple social media datasets.
Real-time analysis of social media: The amount of data captured from various social media platforms can be daunting, and handling it requires large, scalable computational power that can accommodate variations in the volume of the data streams. Distributed computation will be necessary, so the proposed research program aims to use and build new algorithms that can be ported onto a Hadoop cluster, which is available through the High Performance Computing Lab.
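The kind of computation that ports to a Hadoop cluster can be sketched as a map step and a reduce step. The Python example below simulates both steps locally to count posts per hour. It is an assumption-laden sketch, not the program's algorithm: the post layout (`timestamp`, `text`), the function names and the sample data are all hypothetical, and on a real cluster the grouping step would be performed by the framework rather than by `shuffle`.

```python
from collections import defaultdict
from itertools import chain


def map_post(post):
    """Map step: emit one (hour bucket, 1) pair per post."""
    hour = post["timestamp"][:13]  # e.g. "2017-05-10T14"
    yield hour, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle phase."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: sum the counts emitted for one hour bucket."""
    return key, sum(values)


def run_job(posts):
    """Run map, shuffle and reduce in sequence on a local list of posts."""
    pairs = chain.from_iterable(map_post(post) for post in posts)
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample stream; the field names are assumptions.
stream = [
    {"timestamp": "2017-05-10T14:02:11", "text": "flu clinic opens today"},
    {"timestamp": "2017-05-10T14:40:00", "text": "vaccination reminder"},
    {"timestamp": "2017-05-10T15:05:30", "text": "heat advisory issued"},
]

hourly_counts = run_job(stream)
print(hourly_counts)
```

Because each post is mapped independently and each hour bucket is reduced independently, the work can be split across machines as the stream volume grows.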