Grants and Contributions:

Title:
Knowledge Graph Mining for the Linked Open Data Cloud
Agreement Number:
RGPIN
Agreement Value:
$115,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Ontario, Other, CA
Reference Number:
GC-2017-Q1-01684
Agreement Type:
Grant
Report Type:
Grants and Contributions
Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:
Zouaq, Amal (University of Ottawa)
Program:
Discovery Grants Program - Individual
Program Purpose:

The development of structured data on the Web, through the linked data paradigm (also known as the Web of data), has received considerable attention in recent years, especially with the adoption of knowledge graphs by companies such as Google and Microsoft to enhance their search engines. The Linked Open Data Cloud (LOD), a set of interconnected datasets across domains, represents the latest development of the Web of data, and the number of published RDF datasets has grown rapidly. One of the main attractions of the LOD is its potential to offer semantic search capabilities, such as query answering that returns direct answers instead of sets of documents, and the ability to a) find answers from various knowledge sources and b) infer answers through reasoning mechanisms.

However, the quantity of published datasets poses new challenges, as there are no established mechanisms to ensure their quality. In particular, the deluge of RDF data without proper schemas and ontological models is of little use for efficient query answering. Another challenge is the representation and coverage of current Web content, especially for domain knowledge. Despite the growth of the Web of data, the majority of Web content is still represented in unstructured formats and text. There is therefore a need to expand the current LOD with data and schemas of good quality; to develop methods and tools that learn ontological schemas and knowledge bases from unstructured Web content; to measure and evaluate LOD quality and design correction and completion strategies for current structured Web data; and to develop use cases that concretely show the value of the Web of data for query answering.

In this discovery grant, we will focus on one particular Web resource: the cross-domain encyclopedia Wikipedia. In the context of the Web of data, Wikipedia is linked to one of the main resources on the LOD, DBpedia, which is the RDF representation of Wikipedia based on Wikipedia infoboxes and categories. As a hub of the LOD, DBpedia has become a central resource for several semantic analysis tasks. In particular, DBpedia is the backbone of several new industrial and academic semantic annotation services (e.g. IBM's Alchemy, DBpedia Spotlight), which are used to tag Web content and recognize entities and concepts in text. However, DBpedia suffers from the quality problems described above: incomplete coverage of Wikipedia content, lack of a proper ontological schema, and factual errors. The objective of this program is to develop tools and methods to correct, axiomatize and expand the DBpedia knowledge base. We will also demonstrate the value of the learned knowledge base for better semantic annotation and query answering in the context of the Semantic Web.