Community Research and Development Information Service - CORDIS

Final Report Summary - GRAPHRULES (GraphRules: Rule Discovery, Exploration and Visualization of Collaborative Graph Structures)

The GraphRules project developed techniques that facilitate data mining and visualization on large data graphs. We built two working prototypes within this framework. The first was an industrial recommender system. The second focused on techniques for storing and indexing streaming graph data. Several publications and patents emerged from this work. In addition, the Marie Curie grant helped Dr Vlachos continue his previous research projects. Some of these contributions can be found in the Publications section below.

As part of this effort we built a complete recommendation system for Sales and Marketing teams within IBM. This work received two awards from the Research Division of IBM. The first was for a pilot project in Switzerland. The second was for the extension and global deployment of our platform within IBM.
Our first pilot system was called “SmartRep” and was deployed within IBM Switzerland. It gave sales and marketing teams a multifaceted overview of their customers. This required aggregating different databases, both internal and external. Internally we merged firmographic and financial information. We also fused external intelligence databases and RSS feeds extracted in real time from the web.
On top of this information we built advanced analytics that allowed the marketing teams to identify the co-clusters in each industry. By co-clusters we mean clusters of customers that buy a set of similar products. This allows for easy visual inspection of an industry and an understanding of its trends. In addition, we built various data visualizers that allowed easy exploration of the extended customer base of IBM Switzerland.
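To make the notion of a co-cluster concrete, the following is a minimal Python sketch, not the analytics used in SmartRep. It assumes purchases are available as a mapping from customer to a set of products, and it swaps in a simple greedy grouping by Jaccard similarity in place of a real co-clustering algorithm. The function names, the 0.5 threshold and the sample data are all hypothetical.

```python
from collections import Counter


def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def greedy_co_clusters(purchases, threshold=0.5):
    """Group customers with similar product sets (illustrative stand-in only).

    purchases: dict mapping customer -> set of products (hypothetical data).
    Returns a list of (customers, shared_products) pairs.
    """
    clusters = []  # each entry is a list of customer names
    for customer in sorted(purchases):
        for members in clusters:
            # The first member acts as the cluster representative.
            if jaccard(purchases[customer], purchases[members[0]]) >= threshold:
                members.append(customer)
                break
        else:
            clusters.append([customer])

    result = []
    for members in clusters:
        counts = Counter(p for c in members for p in purchases[c])
        # Keep products bought by more than half of the cluster's customers.
        shared = {p for p, n in counts.items() if n * 2 > len(members)}
        result.append((members, shared))
    return result


# Hypothetical sample data, for illustration only.
purchases = {
    "bank_a": {"storage", "mainframe", "security"},
    "bank_b": {"storage", "mainframe"},
    "retail_a": {"cloud", "analytics"},
    "retail_b": {"cloud", "analytics", "storage"},
}
co_clusters = greedy_co_clusters(purchases)
```

On this sample the two banks end up in one group sharing storage and mainframe products, and the two retailers in another sharing cloud and analytics products. Each pair of a customer group and its product group is one co-cluster in the sense described above.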
An extension of this work, called “Crystal+”, was deployed across several IBM geographies (Europe, Japan, Australia). Additional technical accomplishments included a novel entity resolution framework, which allowed the consolidation of various internal and external client databases. This enabled us to build richer analytic and recommendation models based on a complete 360-degree view of a client account.
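The report does not describe how the entity resolution framework works. As a rough illustration of the general idea only, the Python sketch below consolidates records that refer to the same client by matching on a normalized company name. The normalization rule, the suffix list, the field names and the sample records are assumptions made for this example and are not taken from Crystal+.

```python
import re
from collections import defaultdict

# Hypothetical list of legal-form suffixes to ignore when matching names.
LEGAL_SUFFIXES = {"ag", "gmbh", "inc", "ltd", "sa", "corp", "co"}


def normalize(name):
    """Lower-case a company name, strip punctuation and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def consolidate(*sources):
    """Merge records from several sources that share a normalized name.

    When two sources provide the same field, the value seen first is kept.
    """
    merged = defaultdict(dict)
    for records in sources:
        for record in records:
            key = normalize(record["name"])
            for field, value in record.items():
                merged[key].setdefault(field, value)
    return dict(merged)


# Hypothetical records from an internal and an external database.
internal = [{"name": "Acme AG", "revenue_musd": 120}]
external = [{"name": "ACME, AG", "industry": "manufacturing"}]
clients = consolidate(internal, external)
```

Here both records normalize to the key `acme`, so the consolidated entry carries the revenue from the internal source and the industry from the external one. A production framework would need fuzzy matching and conflict handling, which this sketch deliberately leaves out.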

As part of GraphRules, we wanted to examine algorithms for processing high-bandwidth streaming graphs. We therefore picked the scenario with the most demanding specifications: network monitoring applications. As an example, consider monitoring the changing connectivity graph between computer nodes, an important topic in applications such as intrusion detection and forensic analysis of network worm epidemics. A visual illustration of such a constructed graph is shown in Figure 3. However, supporting fast mining and visualization operations over this network graph requires real-time archival, indexing and querying of the streaming data.
NET-Fli is a high-performance solution for streaming graph monitoring. It can sustain data rates that exceed 1 million records per second. To put this number into perspective, existing solutions for network data currently process 20k-60k flows per second. More importantly, the system offers interactive query response times, so graph queries can be performed on the fly.
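The combination of archival, indexing and querying over a stream of flow records can be illustrated with a toy in-memory index. This Python sketch is not NET-Fli's design and makes no claim about its data structures or performance. It assumes each record is a (timestamp, source, destination) triple arriving in non-decreasing timestamp order, and the class and method names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


class StreamingGraphIndex:
    """Toy index over a stream of timestamped (src, dst) flow records."""

    def __init__(self):
        # node -> parallel lists of timestamps and contacted peers,
        # both kept in arrival order
        self._times = defaultdict(list)
        self._peers = defaultdict(list)

    def add_flow(self, timestamp, src, dst):
        """Archive one record (assumes non-decreasing timestamps)."""
        self._times[src].append(timestamp)
        self._peers[src].append(dst)

    def neighbours(self, node, start, end):
        """Return the set of nodes that `node` contacted in [start, end]."""
        times = self._times.get(node)
        if not times:
            return set()
        # Binary search is valid because timestamps are sorted per node.
        lo = bisect_left(times, start)
        hi = bisect_right(times, end)
        return set(self._peers[node][lo:hi])


# Hypothetical flow records, for illustration only.
index = StreamingGraphIndex()
index.add_flow(1, "10.0.0.1", "10.0.0.2")
index.add_flow(2, "10.0.0.1", "10.0.0.3")
index.add_flow(5, "10.0.0.1", "10.0.0.4")
index.add_flow(6, "10.0.0.2", "10.0.0.1")
window = index.neighbours("10.0.0.1", 2, 5)
```

Appending keeps ingestion cheap, and the per-node sorted timestamps let a time-window query on the connectivity graph run as two binary searches plus a slice. In this example `window` contains the two peers contacted between timestamps 2 and 5.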

Related information

Documents and Publications

Reported by

IBM RESEARCH GMBH
Switzerland