Grants and contributions:
Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)
Data breaches and hacking can lead to the disclosure of important, sensitive, and confidential information, and can affect anyone, from celebrities to ordinary citizens. Health and clinical data breaches are especially problematic because they lead to reluctance to share existing data that may be instrumental in advancing healthcare, and they may discourage individuals from participating in important clinical research. For example, disclosure of an HIV patient's disease status would be detrimental to the patient's emotional, social, and psychological well-being and could discourage the patient from participating in important public health research. Thus, data breaches have far-reaching negative consequences.
Healthcare agencies worldwide have installed multiple safety nets to reduce hackers' opportunities for data breaches and to safeguard the health records of thousands of patients. One such security measure is the use of a distributed data system in which provincial and/or local healthcare agencies (also called nodes) securely collect and store data in situ rather than sharing participants' confidential health information with other provinces or a central agency. However, the use of a distributed data network with multiple nodes and an analytic center presents analytical challenges. Current approaches for confidential distributed data mimic meta-analysis, but this is flawed because assessments of subgroups, rare exposures, dose-response relationships, etc. are not possible. Existing Statistical Disclosure Control techniques can be used for privacy-preserving analysis; however, these methods are not well developed for distributed healthcare data, in particular where discrete and time-to-event outcomes are the norm.
Suitable methods for analyzing confidential distributed data are lacking. My DG research program focuses on innovative techniques to address the challenges associated with analyses of such data. In particular, I propose aggregating data to preserve the confidentiality of individual-level data and using the aggregate data for statistical analysis, so that combining data will NOT be required to extract information. This novel framework will focus on the non-normal disease outcomes prevalent in distributed data networks such as the Canadian Network for Observational Drug Effect Studies (CNODES).
Statistical analysis strategies for aggregate data are neither well developed nor well studied. The development of a complete framework for estimation and inference based on aggregate-level data will include non-regular asymptotic theory, computational aspects, and tools for creating aggregates, including recommendations for aggregate or group size and the development of statistical software. My research will open a new direction for using the untapped potential of rich healthcare data and will revolutionize the way information is extracted from confidential distributed data.