Community Research and Development Information Service - CORDIS

H2020

NEPHELE Report Summary

Project ID: 645212
Funded under: H2020-EU.2.1.1.3.

Periodic Reporting for period 1 - NEPHELE (eNd to End scalable and dynamically reconfigurable oPtical arcHitecture for application-awarE SDN cLoud datacentErs)

Reporting period: 2015-02-01 to 2016-07-31

Summary of the context and overall objectives of the project

The cloud is revolutionizing the internet with a whole new user experience. Cloud services are being rapidly deployed, causing traffic in datacenters to explode. To keep pace with this soaring demand, datacenters are growing in size, hosting tens of thousands of servers and consuming as much electricity as a small town. Scaling out the datacenter is generating enormous connectivity requirements, while the emerging concept of resource disaggregation is further raising the bar in network capacity and latency. The cost of traditional datacenter network architectures scales super-linearly with the number of servers, imposing a ceiling on the maximum economically viable datacenter dimensions. Content providers face the challenge of scaling their infrastructure in a cost-effective manner in order to improve their services to the end-user. New networking solutions are urgently needed to sustain the booming growth in the cloud ecosystem.

NEPHELE is a European research project on network technologies, developing a dynamic optical network infrastructure for future scale-out, disaggregated datacenters. NEPHELE builds on the enormous capacity of optical links and leverages hybrid optical-electronic switching to combine high bandwidth with lower cost and power than current datacenter networks. In order to effectively integrate the new paradigm of optical switching into the datacenter networking ecosystem, NEPHELE follows an end-to-end approach extending from the datacenter architecture to the overlaying control plane and up to the interfaces with the application, in order to deliver a fully functional networking solution. Within the project’s workplan, a range of recent developments and disciplines is leveraged in order to unleash the potential of optical switching in the datacenter:

• novel network architectures for an optically switched data plane leveraging mature, off-the-shelf photonic technologies,
• software-defined networking for network configuration and interaction with the data plane,
• application-defined networking for interaction with the datacenter cloud management platform.

To blend these concepts into a fully functional end-to-end solution, NEPHELE aligns its interdisciplinary approach with end-user needs, as a means of bridging innovative research with near-market exploitation.

NEPHELE’s hybrid electronic-optical network architecture scales linearly with the number of datacenter hosts and consolidates compute and storage networks over a single, Ethernet optical TDMA network. Low latency, hardware-level dynamic re-configurability and quasi-deterministic Quality-of-Service (QoS) are supported in view of disaggregated datacenter deployment scenarios. A fully functional control plane overlay is being developed, comprising a Software-Defined Networking (SDN) controller along with its interfaces. The southbound interface abstracts physical layer infrastructure and allows dynamic hardware-level network reconfigurability. The northbound interface links the SDN controller with the application requirements through an Application Programming Interface. NEPHELE’s innovative control plane enables Application Defined Networking and merges hardware and software virtualization over the hybrid optical infrastructure. It also integrates SDN modules and functions for inter-datacenter connectivity, enabling dynamic bandwidth allocation based on the needs of migrating virtual machines (VMs), as well as on existing Service Level Agreements for transparent networking among telecom and datacenter operators’ domains.
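The division of roles between the northbound and southbound interfaces can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from the project: the class names `Controller` and `FakeSouthbound`, the method names `request_bandwidth` and `configure_slot`, and the rack labels are illustrative assumptions. It also simplifies the TDMA frame so that each slot carries a single source-destination pair, whereas a real slotted network could reuse a slot for disjoint pairs.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FakeSouthbound:
    """Hypothetical stand-in for the southbound interface: it only records
    the slot assignments that a real driver would push to the hardware."""
    applied: List[Tuple[int, str, str]] = field(default_factory=list)

    def configure_slot(self, slot: int, src: str, dst: str) -> None:
        self.applied.append((slot, src, dst))


class Controller:
    """Toy SDN controller that owns one TDMA frame of `slots_per_frame` slots."""

    def __init__(self, southbound: FakeSouthbound, slots_per_frame: int) -> None:
        self.southbound = southbound
        self.slots_per_frame = slots_per_frame
        self.frame: Dict[int, Tuple[str, str]] = {}  # slot index -> (src, dst)

    def request_bandwidth(self, src: str, dst: str, slots: int) -> List[int]:
        """Hypothetical northbound call: reserve `slots` free slots for src -> dst.

        Returns the granted slot indices, or an empty list if the request
        does not fit in the remaining free slots (nothing is reserved then).
        """
        free = [s for s in range(self.slots_per_frame) if s not in self.frame]
        if slots <= 0 or len(free) < slots:
            return []
        granted = free[:slots]
        for s in granted:
            self.frame[s] = (src, dst)
            # Push the decision down to the (fake) hardware abstraction.
            self.southbound.configure_slot(s, src, dst)
        return granted


sb = FakeSouthbound()
ctl = Controller(sb, slots_per_frame=4)
first = ctl.request_bandwidth("rack-1", "rack-7", 3)   # fits: three slots granted
second = ctl.request_bandwidth("rack-2", "rack-5", 2)  # rejected: one slot left
print(first, second)
```

The application only ever talks to `request_bandwidth`, while hardware details stay behind `configure_slot`; this separation is the point of the sketch, not the first-fit allocation policy, which is chosen purely for brevity.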

NEPHELE is developing an end-to-end solution extending from the datacenter architecture and optical subsystem design, to the overlaying control plane and application interfaces. Driven by user needs, the project aims to bridge innovative research in datacenter networking with near-market exploitation, achieving transformational impact in energy consumption and cost that will allow datacenter

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

The main achievements during the first project period are summarized below:

WP1: Project Management [M01-M36] (leader: ICCS/NTUA)
All necessary documentation for consortium management was finalized and signed (Grant Agreement, Declarations of Honour, Consortium Agreement and Commission-initiated amendment). The first payment was distributed to the partners. Five plenary meetings, two review meetings and one smaller project meeting were organized and mechanisms were established for consortium communication. Project management instruments performed technical planning, risk management and innovation management. Quality assurance mechanisms were set up for project reporting and fourteen reports were generated and submitted to the Commission.

WP2: Optical cloud DC interconnect architecture [M01-M24] (leader: IRT)
Use cases were motivated by the industrial partners and were defined according to a common methodology. The requirements for management and control were collected and the main functions that need to be provided by the NEPHELE system were identified. The data plane specifications were also defined, as were the functionalities of the associated building blocks. Data- and control-plane requirements were linked with the planned implementation and demonstration, for each use case in NEPHELE. Dimensioning and performance analysis were carried out for different traffic patterns. At the control plane, the functional architecture of the NEPHELE system was defined and the information models used to abstract the different network nodes were specified. Interfaces and interactions among the different architectural components were consolidated both for the intra- and the inter-DC cases.

WP3: Hybrid ToR switch and network interfaces [M01-M28] (leader: MLNX)
The tunable transmitter for NEPHELE’s ToR switch was developed and fast wavelength tuning was demonstrated, with a tuning time well within NEPHELE’s specifications. The WSS subsystem was designed and its SPI control interface was developed; however, evaluation was hampered by the defective SPI interface of the commercial module that was used. Debugging in liaison with the supplier (NISTICA) was ineffective and contingency actions were put into effect. First, an I2C control interface was set up and used to verify the optical routing functionality of the WSS with the switching time targeted in NEPHELE. However, the module’s reconfiguration rate was limited by the slow communication speed of the I2C interface and was therefore not suitable for the slotted NEPHELE data plane. To overcome this obstacle, a WSS module was set up according to the “demultiplex, switch and multiplex” technique. The contingency WSS module was evaluated successfully and was used in WP6 experiments. In the meantime, debugging of the NISTICA WSS is still in progress. The 1×2 fast switch subsystem was developed and tested. Several approaches to the optical power combiner were investigated. Following simulations and consideration of potential implications for the overall system, the final approach was consolidated in liaison with WP2 and WP4. Four FPGA boards were specified for the NEPHELE data plane with their exact functionality, I/O requirements and compute/memory requirements. Design and evaluation have been completed for the majority of FPGA software components and interfaces.
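The “demultiplex, switch and multiplex” idea can be modelled in a few lines. This is only a logical sketch under assumptions, not the project’s implementation: wavelengths are represented as integer channel indices, the function name `wss_route` and the `switch_state` mapping are hypothetical, and optical effects such as loss or crosstalk are ignored.

```python
from typing import Dict, List


def wss_route(input_channels: List[int],
              switch_state: Dict[int, int],
              num_outputs: int) -> List[List[int]]:
    """Model a 1xN WSS built as demultiplex -> per-wavelength switch -> multiplex.

    input_channels: wavelength indices present on the common input port.
    switch_state:   wavelength index -> output port chosen by that wavelength's
                    switch; wavelengths without an entry are blocked.
    Returns, for each output port, the list of wavelengths multiplexed onto it.
    """
    outputs: List[List[int]] = [[] for _ in range(num_outputs)]
    for wavelength in input_channels:            # demultiplex: one channel at a time
        port = switch_state.get(wavelength)      # state of this wavelength's switch
        if port is None:
            continue                             # no setting: the channel is dropped
        if not 0 <= port < num_outputs:
            raise ValueError(f"invalid output port {port} for wavelength {wavelength}")
        outputs[port].append(wavelength)         # multiplex onto the chosen output
    return outputs


# Four wavelengths enter; wavelength 3 has no switch setting and is blocked.
routed = wss_route([0, 1, 2, 3], {0: 1, 1: 0, 2: 1}, num_outputs=2)
print(routed)  # [[1], [0, 2]]
```

Because each wavelength has its own switch in this model, reconfiguration speed is set by those switches rather than by the control interface of an integrated module, which is consistent with the motivation given above for the contingency design.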

WP4: Algorithms and protocol adaptations for efficient QoS provisioning inside and across DCs [M05-M28] (leader: UPAT)
The dynamic bandwidth allocation problem has been defined and formulated according to the WP2 architecture specifications. Two classes of algorithms were developed as a full suite: offline algorithms that schedule a static traffic matrix, and incremental algorithms that are better suited to dynamic traffic scenarios. For each class, a set of algorithms was developed and evaluated in simulations, achieving different trade-offs between performance and running time. The effect of control plane overhead w

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

The proliferation of the cloud application-, platform- and infrastructure-as-a-service models is motivating the construction of new and more powerful datacenters [1]. This is raising the bar in communication requirements not only among the cloud datacenters, but also within them. Today’s datacenters are typically designed with a fat-tree or oversubscribed fat-tree interconnection topology, which is plagued by scalability limitations, rigid allocation of resources and difficulty adapting to the east-west traffic profiles of modern datacenters. Optical switching has been investigated for transferring aggregated traffic between racks or collections of racks, partly or entirely replacing the higher levels of the electronic tree networks [2],[3],[4]. Several optical switching technologies have been considered, such as MEMS, wavelength switching, optical add-drop multiplexers and optical packet switching.

MEMS switches have long reconfiguration times, typically on the order of tens to hundreds of milliseconds. Therefore, they are typically used in tandem with an electrical packet-switched network. MEMS-based hybrid electronic-optical networks have been reported in Helios [2], Calient [5], and REACToR [6]. One limitation of this approach is the delay introduced by the control plane that classifies traffic and handles the network reconfiguration [7], which can reach the timescale of seconds. Also, since the radix of MEMS switches is quite limited (switches with up to 320 ports are commercially available) and building higher-port switches out of smaller ones is complex (due to losses and synchronization issues), the hybrid solution based on MEMS exhibits scalability problems.

Wavelength-switching concepts have been investigated by virtue of the fast reconfiguration time of tunable lasers (in the ns regime), which can be configured to implement a non-blocking switch when interconnected with an Arrayed Waveguide Grating Router (AWGR). A number of initiatives have investigated this concept in different realizations, such as DOS [8], LIONS [9], Petabit [10] and IRIS [11],[12],[13]. On the downside, wavelength-switched concepts face significant scalability issues due to the limited number of wavelengths available in the optical communication C-band.
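The way tunable lasers and an AWGR together form a switch can be sketched with the commonly used idealized cyclic routing model, in which a signal entering input port i on wavelength index w leaves on output port (i + w) mod N. This model, the port-numbering convention and the function names `awgr_output` and `wavelength_for` are assumptions made for illustration; real devices differ in numbering and physical detail.

```python
from typing import List


def awgr_output(input_port: int, wavelength: int, n: int) -> int:
    """Idealized N x N AWGR: the output port depends only on (input, wavelength)."""
    return (input_port + wavelength) % n


def wavelength_for(input_port: int, output_port: int, n: int) -> int:
    """Wavelength index a tunable laser at `input_port` must select to reach `output_port`."""
    return (output_port - input_port) % n


n = 4
destinations: List[int] = [2, 0, 3, 1]   # input i wants to reach destinations[i]

# Each transmitter tunes independently; no central switch state is changed.
tuning = [wavelength_for(i, destinations[i], n) for i in range(n)]
reached = [awgr_output(i, tuning[i], n) for i in range(n)]

print(tuning)   # [2, 3, 1, 2]
print(reached)  # [2, 0, 3, 1]
```

In this model, reaching all N outputs from a given input requires N distinct wavelengths, so the port count is bounded by the number of usable wavelengths. This is the scalability limit noted above for the C-band.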

Optical add-drop multiplexing nodes based on Wavelength Selective Switches (WSSs) offer more flexibility as they combine space- and wavelength-switching. A prominent example is Mordia [14], which uses WSSs similar to NEPHELE’s, achieving switching times on the order of 10 μs. However, the Mordia architecture is a flat ring that interconnects racks, which scales poorly. In addition to the architecture and physical layer implementation, Mordia has also researched algorithms to ensure fast reconfiguration of the underlying network infrastructure. Although the proposed algorithms are 2-3 orders of magnitude faster than traditional approaches, they still have superlinear complexity. Such algorithms cannot scale to huge datacenters of thousands of racks under moderately to highly dynamic traffic, which is what measurements in real datacenters show [15]. The main limitations of Mordia remain its scalability and cost.

Optical packet switching is also being investigated as an alternative to the hybrid electro-optical approach, in an attempt to eliminate electrical switches from the datacenter completely. The Lightness project [15] is developing a hybrid datacenter network that combines optical packet switching (OPS) and optical circuit switching (OCS). The OPS switch provides WDM operation and is based on WSSs implemented with AWGs followed by large SOA arrays in order to perform the OPS between different ToR switches. This SOA-based switching concept has been well investigated in the past in the context of OPS telecom networks [16]; however, it still involves significant hurdles, such as the relatively high power consumption and power dissipation of the switches, as well as the lack o
