Grants and Contributions:

Title:
Co-Designing Distributed Applications with Datacenter Networks
Agreement Number:
RGPIN
Agreement Value:
$210,000.00
Agreement Date:
May 10, 2017 -
Organization:
Natural Sciences and Engineering Research Council of Canada
Location:
Ontario, Other, CA
Reference Number:
GC-2017-Q1-01867
Agreement Type:
grant
Report Type:
Grants and Contributions
Additional Information:

Grant or award applying to more than one fiscal year. (2017-2018 to 2022-2023)

Recipient's Legal Name:
Wong, Bernard (University of Waterloo)
Program:
Discovery Grants Program - Individual
Program Purpose:

Scientific breakthroughs increasingly rely on processing vast volumes of information to discover important patterns and relationships in the data. This requires far more processing power than is available on a single machine. Instead, the work is partitioned and distributed to a cluster of machines connected by a high-speed network. Unfortunately, the communication traffic between machines in a cluster can overwhelm parts of the network, resulting in network congestion that limits the scalability and performance of the processing application. The problem is further compounded by traffic from other applications when the processing application is deployed in a shared environment, such as in the public cloud. As a result, scientific progress may be delayed or even halted because of inefficient network resource utilization.

The proposed research program aims to address this problem by designing applications together with the network to make more efficient use of network resources. This approach enables applications to use up-to-date network information to make application-related decisions. For example, an application may decide to communicate with a different destination in response to network congestion. The application can also control the network to ensure that its traffic is spread evenly across the network. Similarly, the network can use application information to reconfigure its topology dynamically, reducing congestion by increasing capacity between high-traffic endpoints. As part of this research program, we will perform measurement studies to understand the causes of network congestion in different classes of applications. We will develop an application framework to simplify the development of applications and networks and promote adoption of this design approach. We will also develop a network operating system to provide resource management, conflict resolution, and policy enforcement when running multiple applications on the same network.

The results of this work can benefit the scientific community by addressing network-related scalability limitations. This enables the deployment of larger processing clusters that can process more data than is currently possible, which can help Canadian scientists make the next scientific breakthrough. This work can also benefit the Canadian commercial sector by investigating technologies that can significantly reduce the network infrastructure cost of clusters and datacenters. Cloud providers can use our network operating system to safely provide their tenants control over segments of their networks. Finally, by open-sourcing our systems and working closely with Canadian industrial partners, we can rapidly transfer our technology to Canadian companies, which can give them a significant competitive advantage in the global market.