Grants and Contributions:

Title:

Performance management tools for big data applications

Agreement Number:

CRDPJ

Agreement Value:

$68,000.00

Agreement Date:

Oct. 18, 2017 -

Organization:

Natural Sciences and Engineering Research Council of Canada

Location:

Alberta, Other, CA

Reference Number:

GC-2017-Q3-00370

Agreement Type:

Grant

Report Type:

Grants and Contributions

Additional Information:

Grant or award applying to more than one fiscal year (2017-2018 to 2018-2019).

Recipient Legal Name:

Krishnamurthy, Diwakar (University of Calgary)

Program:

Collaborative Research and Development Grants - Project

Program Purpose:

The proposed project will develop tools that allow users and operators of big data systems to better analyze and optimize the performance of their applications. Enterprises increasingly rely on big data cluster systems to support their analytics requirements. Newer-generation big data platforms such as Apache Spark provide a rich set of application programming interfaces (APIs) that allow data scientists to quickly develop and deploy analytics applications. However, despite recent advances in these platforms, analyzing and optimizing the performance of big data applications remains a challenge. Users and operators of big data clusters often struggle to understand the reasons for poor application performance and to optimize applications and clusters to meet performance requirements.
To address these challenges, the proposed project will focus on two long-term objectives. First, the project will devise an intuitive and intelligent interface that supports rapid performance analysis and optimization of big data applications. Using the interface, a user will be able to quickly identify bottlenecks that limit an application's performance. The interface can also be used to visualize and understand an application's performance scalability. Such features can in turn lead to increased productivity for users and more efficient usage of clusters for operators. Second, the project will develop automated service level management techniques for big data systems. Typically, big data systems use schedulers to control the allotment of resources to applications. Although production scheduling systems are sophisticated, they currently lack the ability to drive scheduling decisions so as to automatically meet user service level requirements, e.g., a specified deadline for completing an application, while simultaneously achieving efficient resource usage. Consequently, users and cluster operators often resort to tedious trial-and-error approaches, which can cause both missed service levels and inefficient resource usage. The proposed project will devise new scheduling techniques to address this limitation.