Grants and contributions:
Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)
Video data is often considered the biggest of big data. In recent years, enormous volumes of data have been generated at very high speed. According to Cisco Systems, video accounts for around 80% of Internet data in the United States. YouTube users also upload more than 300 hours of video every minute. Besides the large volume and velocity, the irregularities and ambiguities (the variety property) of video data make it difficult to compare, understand, and annotate the data using traditional technologies and systems. In the proposed research program, our objective is to systematically address the challenges in understanding the complexity of multimedia big data; to handle scalability; to make trade-offs between the efficiency and accuracy of various solutions; and to design and implement systems to organize, classify, search, and retrieve multimedia content.
Specifically, we will conduct research activities including, but not limited to: (1) the investigation of video-feature representations built on individual frame features while preserving the temporal and spatial correlation among frames; (2) the development of underlying operating systems and indexing structures to support parallel computation on multimedia data in multi-core and distributed environments; (3) the use of semi-supervised graphical models to reduce the dependency of state-of-the-art deep learning models on large, high-quality training sets by exploiting the joint distribution of labeled and unlabeled data; and (4) the use of a systems approach to iteratively optimize individual components and system integration, which will eventually lead to both local and global optimization.
My research group has been working on multimedia big data since 2013, and we have made great progress on building a testbed to measure the performance of various combinations of video features, developing a pipeline architecture based on Spark Streaming to support parallel multimedia content computation, and implementing a prototype for multimedia data storage and indexing on a single multi-core server. We will continue to apply multiple system-oriented strategies to optimize system performance, and extend the systems to GPU (Graphics Processing Unit) servers and distributed environments. Meanwhile, the testbed, the prototype system, and the models and theories that we have developed over the past few years will serve as the basis and tools for designing, implementing, and verifying the proposed research.
At the frontier of multimedia big data research, the proposed research program will also provide industrially relevant, up-to-date training for highly qualified personnel (HQP), helping to equip them with the skills and knowledge to make an impact in the rapidly growing big data industry in Canada.