Grants and Contributions:

Title:
Deduplication-aware Systems for Cost-efficient Cloud Storage
Agreement Number:
DGDND
Agreement Value:
$120,000.00
Agreement Date:
Jan. 10, 2018 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Québec, Other, CA
Reference Number:
GC-2017-Q4-01091
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or award applying to more than one fiscal year (2017-2018 to 2020-2021).

Recipient's Legal Name:
Liu, Xue (McGill University)
Program:
DND/NSERC Discovery Grant Supplement
Program Purpose:

Cloud storage systems serve as important infrastructure for emerging applications, including big data analytics and the Internet of Things (IoT). A key challenge is handling massive amounts of data in real time and cost-efficiently. Explosive growth in the volume and complexity of data exacerbates this challenge. Further, many cloud computing systems are networked and distributed, which makes storage system management more complex and costly because of limited bandwidth. Data deduplication is an efficient data reduction approach that not only reduces storage space by eliminating duplicate data but also minimizes the transmission of redundant data, even in low-bandwidth environments. However, conventional deduplication schemes suffer from high computational complexity in chunking and large storage overhead for block indices, and thus fail to offer real-time, cost-efficient storage services. This research program aims to address the most important challenges in optimizing the performance of cloud storage systems for big data applications. We will conduct innovative research to overcome the current limitations of deduplication-based methodologies. To this end, we will investigate adaptive multi-granularity deduplication schemes to significantly reduce the amount of data to be processed and improve overall system performance. We plan to propose a new methodology for deduplication granularities to meet the needs of handling different data scales. Implementation techniques, including locality-aware hashing, deduplication and compression synergization, and pipeline scheduling, will be investigated, evaluated, and validated on a real cloud storage platform.
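
For readers unfamiliar with the technique, the following is a minimal illustrative sketch of conventional chunk-level deduplication, not the adaptive multi-granularity scheme this program proposes. It assumes fixed-size chunking and a SHA-256 fingerprint index; the function names (deduplicate, restore) and the 4096-byte chunk size are hypothetical choices made only for this example.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and store each distinct chunk once.

    Returns (store, recipe): store maps a SHA-256 digest to the chunk bytes,
    and recipe is the ordered list of digests needed to rebuild the input.
    Both names are illustrative, not taken from the funded project.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy of a chunk
        recipe.append(digest)            # one index entry per chunk, duplicate or not
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
    store, recipe = deduplicate(data)
    print(len(recipe), "chunks referenced,", len(store), "chunks stored")
    assert restore(store, recipe) == data
```

The sketch also hints at the trade-off the abstract describes: every chunk needs a fingerprint computation and an index entry, so smaller chunks tend to expose more duplicates but increase hashing work and index size, while larger chunks do the opposite.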