Scalable Switching Architectures for Next-Generation Data Center Networks

Final Report Summary - SCALE (Scalable Switching Architectures for Next-Generation Data Center Networks)

The exponential growth in networking, together with the emerging trend towards cloud computing, is driving the need for massive data centres capable of hosting tens to hundreds of thousands of servers to sustain various online services (e.g. web search, video content hosting and distribution, social networking and large-scale computations). This growth is stressing the need for Data Centre Network (DCN) infrastructure (routers and switches) with high bit rates, very large port counts, fault tolerance and load-balancing capabilities. Despite numerous existing studies, a switching architecture that addresses all these requirements has yet to be found.

The objective of the SCALE project is to design scalable switching architectures capable of handling the projected growth of data centres in applications and user traffic. In particular, the project aims at: i) Designing a DCN fabric architecture that is scalable, cost-effective, fault-tolerant, agile and reliable. The envisioned design will incorporate efficient transport capabilities for the data centre. ii) Developing efficient DCN scheduling, resource allocation and routing algorithms that can provide controllable latencies and guaranteed bandwidth. iii) Proposing efficient flow-control mechanisms to coordinate the overall state of the DCN and prevent congestion and performance degradation. iv) Developing techniques for the validation and evaluation of the proposed topologies, algorithms and methodologies for various data centre applications.

The project has achieved all the expected results, in line with the time plan. In particular, a number of tasks were performed to achieve the project objectives. During the first phase of the project, the following results were achieved. We carried out a detailed background study of DCN architectures at both the whole-DCN level and the node (switch) level. At the DCN level, we studied and analysed DCN traffic flows. This led to a number of research outputs targeting DCN traffic congestion, with the aim of adequately dimensioning and architecting the switch fabric infrastructure. We studied DCN congestion control using both implicit and explicit rate-control techniques. This has shed light on the DCN switch fabric design choices, in terms of flow scheduling as well as buffer requirements. The next major result relates to the switch fabric architecture. A number of architectural alternatives for the switch fabric have been proposed and tested. These architectures combine two concepts: multi-stage interconnection networks and the Network-on-Chip (NoC) paradigm. In particular, a multi-stage DCN switch fabric architecture based on a unidirectional NoC topology has been proposed. The proposed design has been evaluated against existing proposals and has been shown to deliver superior performance at lower cost.

The results of the second phase of the project relate to the node-level design of DCNs. In particular, various switching architectures have been analysed, designed and tested. This includes: i) Deriving the necessary and sufficient conditions for a NoC-based multi-stage switching architecture to deliver performance guarantees, ii) Proposing novel multi-stage Clos network architectures based on NoCs, with optimal buffering as well as switch resource allocation algorithms, iii) Designing DCN switch fabrics with load-balancing capabilities as well as intra-switch congestion avoidance mechanisms. In order to evaluate the proposed architectures and validate all the above results, an appropriate simulation tool was developed, along with specific traffic traces and scenarios, and the proposed architectures were tested against known benchmarks from the DCN research community.
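
As a point of reference for the kind of condition mentioned in item i), the classical bufferless three-stage Clos network with n inputs per ingress switch, m middle-stage switches and r ingress (and egress) switches is strict-sense nonblocking when m >= 2n - 1 and rearrangeably nonblocking when m >= n. These are textbook results for circuit-switched Clos networks, not the conditions derived in this project for NoC-based fabrics, which this summary does not state. The sketch below simply encodes the textbook conditions; the function names are hypothetical and chosen for illustration only.

```python
def clos_blocking_class(n: int, m: int) -> str:
    """Classify a classical three-stage Clos network.

    n: inputs per ingress switch, m: number of middle-stage switches.
    Textbook conditions only; not the NoC-based conditions of the project.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive integers")
    if m >= 2 * n - 1:
        return "strict-sense nonblocking"
    if m >= n:
        return "rearrangeably nonblocking"
    return "blocking"


def clos_crosspoints(n: int, m: int, r: int) -> int:
    """Crosspoint count of a symmetric three-stage Clos network.

    r ingress switches of size n x m, m middle switches of size r x r,
    and r egress switches of size m x n.
    """
    return 2 * r * n * m + m * r * r


if __name__ == "__main__":
    # Example: 16 ports built from 4 ingress switches with 4 inputs each.
    for m in (3, 4, 7):
        print(m, clos_blocking_class(4, m), clos_crosspoints(4, m, 4))
```

The crosspoint count gives a rough feel for the cost side of the trade-off: adding middle-stage switches strengthens the nonblocking guarantee but increases hardware cost.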

The results obtained during the course of this project have helped us gain insight into how to design next-generation DCNs and have given us a deeper understanding of how optimal DCN switch fabric architectures should be designed. The success of this research will have a two-fold wider benefit. First, a scalable DCN architecture translates into higher-throughput networks, allowing cloud-computing companies to provide more services and generate higher revenues. Second, the design of efficient DCNs would result in a better experience for end users (the general public), who would benefit from faster and more reliable online services.

Contact:
Dr. Lotfi Mhamdi
e-mail: L.mhamdi@leeds.ac.uk