CORDIS - EU research results

Deep Packet Inspection to Next Generation Network Devices

Final Report Summary - DPI (Deep Packet Inspection to Next Generation Network Devices)

Deep packet inspection (DPI), in which the content of a packet is inspected and not only its header, is one of the main tasks performed by middleboxes in contemporary networks. DPI meets the need of network operators to gain finer-grained insight into the behavior of their users, in order to allocate network resources, to enhance the quality of service or quality of experience of specific users, and to better monitor the network for security threats. A basic building block of contemporary DPI engines is matching each packet’s payload against a set of patterns (a.k.a. signatures) which, for example, indicate malicious activity.
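To make the signature-matching building block concrete, the following is a minimal sketch of multi-pattern matching using the classic Aho-Corasick automaton. This summary does not name the algorithm used by the project's engines, so the choice of Aho-Corasick and the function names build_automaton and scan are assumptions made for illustration only.

```python
from collections import deque

# Illustrative sketch only: Aho-Corasick is a classic multi-pattern matcher,
# chosen here as an assumption; the report does not name a specific algorithm.

def build_automaton(patterns):
    """Build trie edges, failure links and output lists for byte patterns."""
    goto = [{}]   # goto[state][byte] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # patterns recognised on entering each state
    for pat in patterns:
        state = 0
        for b in pat:
            if b not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][b] = len(goto) - 1
            state = goto[state][b]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for b, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and b not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(b, 0)
            # Inherit patterns that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def scan(payload, automaton):
    """Return (start_offset, pattern) for every match, in one pass over payload."""
    goto, fail, out = automaton
    state = 0
    matches = []
    for i, b in enumerate(payload):
        while state and b not in goto[state]:
            state = fail[state]
        state = goto[state].get(b, 0)
        for pat in out[state]:
            matches.append((i - len(pat) + 1, pat))
    return matches
```

The payload is read once regardless of how many signatures are loaded, which is why automaton-based matchers are a common starting point for DPI; the automaton's memory footprint is one of the scalability concerns this report goes on to discuss.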

While the general problem of pattern matching is fundamental in computer science and has been researched thoroughly over the last decades, traditional algorithms fail to meet the challenges facing current networking devices. Specifically, we have identified the following problems in traditional pattern matching algorithms: (i) they have limited scalability; (ii) they are geared to work only with clear-text input; (iii) they are vulnerable to cyber attacks, such as algorithmic complexity attacks; and (iv) they are oblivious to the setting in which they run, and thus miss many optimization opportunities. In this project, we have addressed these challenges one by one.

First, the rapid increase in Internet traffic rates calls for significantly more scalable designs in terms of speed and memory usage. We have designed a proof-of-concept system showing that DPI engines can exploit repetitions in the traffic to increase their speed. In addition, we have shown how DPI engines can take advantage of advances in other networking devices (e.g. IP address lookup chips, TCAMs) to boost their performance.

A second challenge arises from the large share of compressed web traffic (e.g. due to the use of mobile devices). Our research group was the first to deal algorithmically with such compressed traffic, presenting unique solutions for a variety of compressed traffic types, such as GZIP and Google’s SDCH. Our solutions show that, contrary to common belief, the compression itself can facilitate data inspection since, intuitively speaking, the traffic has already been scanned and its repetitions eliminated.

Third, unlike traditional pattern matching algorithms, DPI solutions must be resilient to cyber-attacks that aim to knock down the DPI engine itself. This is especially crucial in the common case where DPI engines are part of a Network Intrusion Detection System. We have demonstrated the vulnerability of existing systems and then designed a novel framework for mitigating such attacks in multi-core and NFV settings.

Finally, we have considered the interaction with the middleboxes in which DPI engines reside. Nowadays, this interaction is significantly influenced by two complementary initiatives: Network Function Virtualization (NFV), which enables relatively lightweight software implementations of middleboxes, and Software-Defined Networking (SDN), which enables efficient forwarding of packets between the virtualized network functions. Within these frameworks, we have developed a system that treats DPI as a service to the middleboxes, implying that traffic should be scanned only once (and not separately, from scratch, for every middlebox). Having DPI as a service has significant advantages in performance, scalability, and robustness, and acts as a catalyst for innovation in the middlebox domain. Based on the insights gained from this approach, we have also developed OpenBox, a more general framework that provides other common building blocks as a service.
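The DPI-as-a-service idea can be sketched as follows: each middlebox registers its own pattern set with a shared service, the service inspects a payload against the merged set, and each middlebox receives only the matches for patterns it registered. The class name DpiService, its methods, and the middlebox names are hypothetical and are not taken from the project's implementation. The inner search uses bytes.find purely as a stand-in for a real single-pass multi-pattern matcher.

```python
from collections import defaultdict

# Hypothetical sketch: DpiService and its methods are illustrative names,
# not the API of the system described in the report.

class DpiService:
    def __init__(self):
        # pattern -> names of the middleboxes that registered it
        self.subscribers = defaultdict(set)

    def register(self, middlebox, patterns):
        """Add a middlebox's patterns to the merged pattern set."""
        for pat in patterns:
            self.subscribers[pat].add(middlebox)

    def inspect(self, payload):
        """Inspect payload against the merged set; return matches per middlebox."""
        results = defaultdict(list)
        # Each distinct pattern is searched once, even if several middleboxes
        # registered it. bytes.find is a stand-in for a real matching engine.
        for pat, boxes in self.subscribers.items():
            pos = payload.find(pat)
            while pos != -1:
                for box in boxes:
                    results[box].append((pos, pat))
                pos = payload.find(pat, pos + 1)
        return {box: sorted(hits) for box, hits in results.items()}
```

In this sketch a pattern shared by several middleboxes is stored and searched only once, which illustrates where the saving over per-middlebox scanning comes from.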

Complementing our work on DPI engines, where the signatures are given as input, we have also presented a pioneering system for extracting zero-day attack signatures for the DPI engine. Given two large sets of messages, one captured in the network during peacetime and one captured during an attack, we present a tool for extracting a set of strings that are frequently found during the attack and not during peacetime. Using our system, a yet unknown attack can be detected and stopped within minutes of its start.
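The extraction step can be illustrated with a deliberately simplified sketch: count, for every fixed-length substring, the fraction of messages that contain it in each set, and keep the substrings that are common in the attack set and rare in the peacetime set. The fixed substring length k, the two thresholds, and the function names are assumptions made for illustration; the report does not describe the actual algorithm, which must also cope with variable-length strings and very large inputs.

```python
from collections import Counter

# Simplified illustration: fixed-length k-grams and the thresholds below are
# assumptions, not the method used by the system described in the report.

def kgram_doc_freq(messages, k):
    """Map each k-byte substring to the fraction of messages containing it."""
    counts = Counter()
    for msg in messages:
        # A set ensures each message contributes at most once per substring.
        grams = {msg[i:i + k] for i in range(len(msg) - k + 1)}
        counts.update(grams)
    n = len(messages)
    return {g: c / n for g, c in counts.items()}

def extract_signatures(peace, attack, k=6, min_attack=0.5, max_peace=0.05):
    """Return k-grams frequent in attack messages and rare in peacetime ones."""
    peace_freq = kgram_doc_freq(peace, k)
    attack_freq = kgram_doc_freq(attack, k)
    return sorted(
        g for g, f in attack_freq.items()
        if f >= min_attack and peace_freq.get(g, 0.0) <= max_peace
    )
```

Substrings that are frequent in both sets, such as a common request prefix, are filtered out by the peacetime threshold, so only strings characteristic of the attack remain as candidate signatures.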