Grants and Contributions:

Title:
Deep neural network-based speech enhancement for robust speech recognition in smart home device
Agreement number:
CRDPJ
Agreement value:
$277,500.00
Agreement date:
Feb. 7, 2018 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Québec, Other, CA
Reference number:
GC-2017-Q4-00307
Agreement type:
Grant
Report type:
Grants and Contributions
Additional information:

Grant or award applying to more than one fiscal year (2017-2018 to 2020-2021).

Recipient's legal name:
Champagne, Benoit (McGill University)
Program:
Collaborative Research and Development Grants - Project
Program purpose:

Human-machine interfaces based on natural speech have advanced to the stage where previously unthinkable applications are becoming part of our daily life. Speech interfaces not only facilitate human-machine interactions but also significantly enhance the efficiency of home automation, which is a key driver of the Internet of Things (IoT). Commercially available smart home devices (SHD) now allow users to control their home equipment remotely and access Web-based information sources. These intelligent assistants can respond in real time to human voice commands via automatic speech recognition (ASR). However, for SHD to operate satisfactorily under real-world conditions, they must be robust to acoustic noise and reverberation, a critical problem whose solution calls for new speech processing technologies. The long-term goal of the project is to develop an integrated speech enhancement (SE) system based on deep neural networks (DNN) to support two essential SHD functions: keyword spotting and cloud-based ASR. Over its 3-year duration, the project aims to achieve the following objectives: develop new feature sets for the representation of noisy speech; design improved DNN core engines better suited to the SE task; implement a complete DNN-based SE system that is robust to noise and reverberation; and finally, evaluate its performance within a multi-microphone SHD context. This proposed research is an extension of an ongoing NSERC CRD project with industrial partner Microsemi. During the past two years, our team has developed state-of-the-art SE algorithms that display excellent performance when tested on human listeners. However, these algorithms are not optimally designed for use as pre-processors to ASR, as needed in the new line of integrated circuits (IC) currently being developed by Microsemi for voice-driven SHD.
The proposed research will provide our sponsor with cost-effective and innovative SE solutions for use in its IC products, boosting its competitiveness in the marketplace. In addition to technology transfer, the project will promote research and the training of highly qualified personnel (HQP) in intelligent speech processing at McGill and Concordia.