Grants and Contributions:

Title:
Identifying General Product and Brand Names in Online Forums
Agreement number:
EGP
Agreement value:
$25,000.00
Agreement date:
March 7, 2018 -
Organization:
Natural Sciences and Engineering Research Council of Canada (Conseil de recherches en sciences naturelles et en génie du Canada)
Location:
Ontario, Other, CA
Reference number:
GC-2017-Q4-01161
Agreement type:
grant
Report type:
Grants and Contributions
Additional information:

Grant or award applying to more than one fiscal year (2017-2018 to 2018-2019).

Recipient's legal name:
Makrehchi, Masoud (University of Ontario Institute of Technology)
Program:
Engage Grants for universities
Program purpose:

One key data component for engaging and acquiring customers is the ability to identify product names in
online forums. Extracting product and brand names will therefore generate data that helps vendors and thus
increases overall business revenue. The traditional approach to extracting named entities requires time-consuming
and expensive manually labelled training sets. In addition, we deal with many product and service sectors,
which differ in size, language, and content. Fitting a supervised model based on human-annotated data
for each product or brand in each sector (vertical) can therefore be very expensive. The three major
challenges are: 1) the high cost of generating training data for each type of product, 2) covering a large and
divergent range of product types from all verticals, and 3) disambiguating different types of entities based on context.
When training data is absent or scarce, an alternative is to use semi-supervised learning algorithms
such as bootstrapping, and meta-learning methods such as self-training and co-training. With these methods, we
can train a model with few or no annotated examples. This research investigates a hybrid approach combining
transfer learning and semi-supervised learning to identify and extract named entities in our
domain of interest. In transfer learning, the lack of annotated data in the target domain is addressed by
adapting annotated data from other domains.
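
The self-training idea mentioned above can be illustrated with a minimal sketch. This is not the project's actual method: the toy forum contexts, the small naive Bayes classifier, the 0.75 confidence threshold, and all function names are assumptions chosen only to show how a model trained on a few labelled examples can label its own additional training data.

```python
import math
from collections import Counter

# Hypothetical toy data: each example is a tuple of context words around a
# candidate token; label 1 = product/brand mention, 0 = anything else.
LABELLED = [
    (("bought", "new", "engine", "dealer"), 1),
    (("installed", "new", "tires", "dealer"), 1),
    (("went", "last", "weekend", "trip"), 0),
    (("cold", "weather", "last", "night"), 0),
]
UNLABELLED = [
    ("bought", "engine", "tires", "upgrade", "exhaust"),
    ("trip", "weekend", "night", "long"),
    ("upgrade", "exhaust", "kit"),
]


def train(examples):
    """Fit a tiny naive Bayes model: per-class word counts and class counts."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for features, label in examples:
        class_counts[label] += 1
        word_counts[label].update(features)
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, features):
    """Return P(label = 1 | features) using add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        score = math.log((class_counts[label] + 1) / (total + 2))
        # The extra +1 reserves probability mass for words never seen in training.
        denom = sum(word_counts[label].values()) + len(vocab) + 1
        for word in features:
            score += math.log((word_counts[label][word] + 1) / denom)
        log_scores[label] = score
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    return exp[1] / (exp[0] + exp[1])


def self_train(labelled, unlabelled, threshold=0.75, max_rounds=5):
    """Repeatedly retrain, moving confidently predicted examples into the training set."""
    labelled = list(labelled)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        model = train(labelled)
        confident, remaining = [], []
        for features in pool:
            p = predict_proba(model, features)
            if p >= threshold:
                confident.append((features, 1))      # pseudo-label: product
            elif p <= 1 - threshold:
                confident.append((features, 0))      # pseudo-label: other
            else:
                remaining.append(features)           # still too uncertain
        if not confident:
            break                                    # nothing new to learn from
        labelled.extend(confident)
        pool = remaining
    return train(labelled), labelled


if __name__ == "__main__":
    model, final_set = self_train(LABELLED, UNLABELLED)
    for features, label in final_set[len(LABELLED):]:
        print(label, features)
```

In this toy data the third unlabelled context shares no words with the seed examples, so the seed-only model is undecided about it; it receives a product pseudo-label only after the first round adds the context containing "upgrade" and "exhaust". A real system would need a stronger classifier and safeguards against reinforcing early mistakes.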
VerticalScope (the industry partner) is a Canadian company that is becoming a leading player in data science
research and development for understanding user-generated content on the Internet in a variety of sectors. The
proposed project contributes to the growth of VerticalScope, and as a result to the Canadian economy, by
allowing the company to apply cutting-edge research to improve the user experience on its forums, and hence
the attractiveness of its service to businesses.