CORDIS - EU research results
Content archived on 2024-05-28


Final Report Summary - NI2S3 (Net information integration services for security systems)

Executive summary:

Complex interactions between the elements of a critical infrastructure (CI) indicate that there is a need to deploy a corresponding infrastructure protection system, which is capable of extending security control to all elements of the protected system, and, at the same time, of maintaining a global view of the infrastructure.

The key objective of the NI2S3 project is to research and implement a reference methodology for developing security systems based on network enabled capabilities (NEC) information and integration services (I2S) for CIs. The security systems must be capable of collecting and processing information from many heterogeneous sources in order to build up or improve the situation awareness of CIs and to enable decision making. More specifically, the NI2S3 project aims:

(a) to provide a definition and a design of an NI2S3 CI protection system regarding the security, resiliency and availability of the subject infrastructure;
(b) to define performance indicators and tools for system validation;
(c) to develop a technology for the evaluation of the performance, robustness and reliability of such a protection system;
(d) to develop a NI2S3 application demo as a proof of concept.

The NI2S3 project is focused on the research and development of a reference methodology to guide the design and the implementation of security systems for CI protection, based on the philosophy and the concepts of network enabled capabilities (NEC)-based systems approached with service-oriented architecture (SOA) techniques.

The refining and validation of this methodology is performed by an application demonstrator, realised in accordance with NEC and SOA concepts. Therefore, the practical implementation and commercialisation of a real NI2S3 will require a 'step ahead' in this direction, not addressed by the project itself.

Project context and objectives:

The activities within the project are articulated into the following work packages:

- Analysis of the state of the art,
- Definition of scenarios, analysis and extraction of the system specifications,
- Development of a reference methodology for design, and realisation of a NI2S3,
- Definition of a set of metrics and validation capabilities for the components and the protocols involved in NI2S3,
- Project and design of prototype,
- Dissemination and exploitation.

The NI2S3 project completed its activities by 30 September and achieved the following main results:

- Proposal of a reference architecture for critical information infrastructure (CII): A new comprehensive architectural framework named critical architecture framework (CRAF) was proposed, starting from a base reference (The Open Group Architecture Framework (TOGAF) Architecture Development Method (ADM), specialised for CII) and extending it with contributions from other selected architecture frameworks, namely:
(1) Department of Defense Architecture Framework (DODAF) to present viewpoints of the enterprise architecture;
(a) COBIT to manage the IT process life cycle;
(b) SABSA concepts to analyse security aspects of the architecture;
(c) TOGAF ADM phases for the CRAF sub-methodologies:
(i) data acquisition;
(ii) data fusion and correlation;
(iii) vulnerability assessment (VA).

The resulting methodology was applied to a demonstrator case with the objective of monitoring and controlling a highway, analysing and protecting the information infrastructure which underpins such an application environment.

- Improved techniques for CII situational awareness: To improve the security and the overall control of the whole system, a situation awareness mechanism relying on event correlation in a general SOA-based architecture was suggested and included in the demonstrator. This has been used for correlating sensitive application events with infrastructure data and events, thus enhancing the standard capabilities of simple infrastructure monitoring by using temporal and logical correlation templates, enabling proactive security mechanisms mainly in distributed systems sensitive to cyber attacks.
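The correlation templates themselves are not reproduced in this summary; as a minimal sketch of the temporal-plus-logical template idea described above, a sliding-window rule could look like the following (the event fields `type`, `source`, `ts` and the thresholds are hypothetical, not project values):

```python
from collections import defaultdict, deque

class TemporalCorrelationRule:
    """Logical + temporal template: raise an alert when `count` events of
    `event_type` from the same source arrive within `window` seconds."""
    def __init__(self, event_type, count, window, alert):
        self.event_type = event_type
        self.count = count
        self.window = window
        self.alert = alert
        self._history = defaultdict(deque)  # source -> recent timestamps

    def feed(self, event):
        """event: dict with 'type', 'source', 'ts' (seconds).
        Returns an alert dict when the template matches, else None."""
        if event["type"] != self.event_type:
            return None
        q = self._history[event["source"]]
        q.append(event["ts"])
        # drop timestamps that fell out of the sliding temporal window
        while q and event["ts"] - q[0] > self.window:
            q.popleft()
        if len(q) >= self.count:
            q.clear()
            return {"alert": self.alert, "source": event["source"], "ts": event["ts"]}
        return None

rule = TemporalCorrelationRule("auth_failure", count=3, window=60,
                               alert="possible_bruteforce")
events = [
    {"type": "auth_failure", "source": "plc-7", "ts": 0},
    {"type": "auth_failure", "source": "plc-7", "ts": 20},
    {"type": "heartbeat",    "source": "plc-7", "ts": 30},
    {"type": "auth_failure", "source": "plc-7", "ts": 45},
]
alerts = [a for a in (rule.feed(e) for e in events) if a]
```

Feeding the hypothetical event stream above triggers exactly one proactive alert, on the third authentication failure inside the 60-second window.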

- Exploitation of SOA architecture for implementing the NEC concept of an information-centric system: The project, both through the demonstrator and in the analysis, exploited SOA as a means of communication and integration, adopting OASIS standards. In particular, a gateway between supervisory control and data acquisition (SCADA) systems and a standard enterprise service bus (ESB) was developed, demonstrating the suitable adaptation of the SOA architecture to distributed data acquisition and control environments.

- VA methodology and tools: VA analyses and tools were applied and investigated, suggesting that in the case of CII this subject deserves specific attention through dedicated design and testing activities, supported by validation tools and some guidance on how to measure the robustness and reliability of a designed infrastructure. A demonstration tool for the general application of stress and validation tests was also implemented.
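The demonstration tool itself is not described in detail in this summary; a minimal sketch of the stress-and-validation idea (mutate valid protocol messages and check that the element under test rejects them gracefully instead of crashing) might look like the following, where the `CMD|<len>|<payload>` message format and the mutation operators are purely illustrative:

```python
import random

def parse_message(data: bytes):
    """Toy parser under test: expects b'CMD|<len>|<payload>'. A robust
    implementation must reject malformed input with a clean error."""
    parts = data.split(b"|", 2)
    if len(parts) != 3 or not parts[1].isdigit():
        raise ValueError("malformed message")
    if int(parts[1].decode()) != len(parts[2]):
        raise ValueError("length mismatch")
    return parts[0], parts[2]

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Flip, insert or delete one random byte of a valid seed message."""
    data = bytearray(seed)
    op = rng.choice(("flip", "insert", "delete"))
    pos = rng.randrange(len(data))
    if op == "flip":
        data[pos] ^= rng.randrange(1, 256)
    elif op == "insert":
        data.insert(pos, rng.randrange(256))
    else:
        del data[pos]
    return bytes(data)

def stress_test(seed: bytes, iterations: int = 500, seed_val: int = 42) -> int:
    """Feed mutated inputs to the parser; a graceful ValueError is the
    accepted failure mode, any other exception is a robustness defect."""
    rng = random.Random(seed_val)
    defects = 0
    for _ in range(iterations):
        try:
            parse_message(mutate(seed, rng))
        except ValueError:
            pass            # graceful rejection: fine
        except Exception:
            defects += 1    # crash: a vulnerability candidate
    return defects

defects = stress_test(b"CMD|5|hello")
```

The same loop, pointed at a real protocol stack instead of the toy parser, is the essence of a programmable VA stress platform.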

- As a proof of concept, a demonstrator was finally developed in which all the outcomes of the research phase and of the methodology construction were put into an actual implementation. The demonstrator has a special focus on the relationship with SCADA systems, which, due to the evolution of embedded systems and the ability to integrate and interconnect low-level controlling systems in distributed architectures, will require special care in the protection of the infrastructures that should enable this type of scenario.

Critical transportation systems have an intrinsic transnational value, so that the most suitable instrument to achieve advances in the protection of such infrastructures is transnational cooperation.

NI2S3 was confirmed to be a cross-border project, which gives the chance to develop technology and a reference methodology for CI security systems that can be better accepted by the potential stakeholders, being designed under the guidelines of each participant country's needs.

NI2S3 outcomes address scenarios which are emerging in all distributed system architectures able to collect data from remote sites, analyse the resulting information and (this is the new and important stage in the near future) correlate it in the temporal and spatial dimensions. These systems are typically net-centric and are becoming a significant part of service delivery, thus becoming more and more critical exactly in the sense that this project intends to address with the proposed solutions.

Project results:

The research carried out in the NI2S3 project, as a starting point, defined and described the state of the art of methodologies and technologies for the protection of CIIs. Then, as a first outcome, it identified the main gaps that currently exist. The gaps identified can be transposed into concerns expressed by the following questions:
- Which, among all the existing architectural frameworks, is the most suitable for the design of a CII protection?
- Which is the most suitable architectural solution and what technologies can be adopted to meet the guidelines of an NEC-ISTAR?

The NI2S3 research activity provides effective responses to these crucial questions; in the following chapters the reader can find the most important aspects of the results achieved.

CRAF: A new approach to modelling CI / CIIP

CRAF methodology was used in the NI2S3 project as a reference tool for the description of CI. CRAF aims to drive the effort in architecture design for CI by providing a clear and shareable description of the enterprise architecture and by addressing all of the essential concerns pertinent to CI design and operation.

The central idea of creating a domain-specific AF for the CI is the preparation of a common space in which all stakeholders including top-level managers, executives and architecture designers could express all major concerns pertaining to embedding CIIP in CI in a secure and transparent way.

The main concept for CRAF is based on TOGAF and the ADM. The TOGAF methodology provides a comprehensive method for developing enterprise architecture. The section on the diagrams to be developed while creating the architecture is derived from the DODAF methodology. DODAF focuses on concrete viewpoints and provides guidelines for representing DODAF architecture products. A good framework to manage issues that occur in TOGAF is COBIT, which is why some of its processes have been incorporated into CRAF. The additional part of CRAF is composed of sub-methodologies. These are guidelines coming from experience, showing aspects such as data fusion, aggregation and correlation, VA and data acquisition in a useful way in the context of the ADM phases.

The main points of that research related to creating CRAF are:

- methodology,
- meta-model,
- viewpoints,
- techniques,
- guidelines,
- best practices,
- design patterns.

A CRAF meta-model is a repository of terms used to represent the domain of interest. It is required during the AF development to name and to refer to all the important elements of architecture. A meta-model is a common, generic vocabulary and therefore requires completeness, consistency and minimality and can be based on a generalised meta-model, such as for example DM2 from DODAF2 or Content Framework from TOGAF.

The CRAF framework inherits viewpoints from the DODAF framework. The purpose is to enable human engineers to understand and manage very complex systems. The CRAF viewpoints are collections of views that represent architecture data in a human-readable way and that are concrete instances of the architecture data captured in a model. On the basis of the defined viewpoints, consisting of multiple selective models and selective views of a CI enterprise, it becomes possible to incorporate the domain-specific knowledge into the framework with the use of the domain-specific concepts from the meta-model. During that activity the set of models with predefined processes, the enterprise design patterns and the structures library are created.

CRAF inherits the baseline methodology from the Meta Architecture Framework. The most important part of this methodology is the ADM - a step-by-step instruction guide for an enterprise architect. It is presented in a series of phases that guide a team through the process of architecting systems, with particular focus on selected fields especially important in the CI and CII domains.

VA for CII

The main issue in building complex systems, and in particular in CI environments, is the integration of devices, either hardware or software, from different vendors and manufacturers, and the communication among them and with central systems.

Even in the case of CIs created anew, as time goes on the problem will arise. This is due to multiple causes, e.g. the obsolescence of some equipment, the impossibility of replacing devices with the original elements and so on.

Moreover, for civil CIs, another source of this kind of issue is the continuous need to integrate data from different sources. Most of the data sources that could be integrated cannot be foreseen at the time of the CI creation, and it is often impossible to guarantee specified software development standards in the majority of cases.

This means that, in order to protect the CII functionalities, every element should be tested and validated not only for its functionality but also considering possible vulnerabilities. As will become clear in the following, vulnerabilities might be responsible for either attacks from malicious users or malfunctions in the system. Both cases may lead to the collapse of the information infrastructure, or in general to instability and distrust of the CII itself, thus basically making it useless.

In order to allow the needed flexibility and, at the same time, maintain the security goals of the CII, it is thus imperative to have a methodology to identify the risks connected to the CII, and a set of reliable tools to inspect and test either single elements or a whole subsystem, so as to avoid, or at least know, what vulnerabilities are in place. Note that both cases are fine, as knowing that a vulnerability exists allows the preparation of countermeasures, so as to reach the target security goal.

The research has mainly been concerned with the networked entities and the vulnerabilities arising from the communication channels. That is, we mainly consider the effects of receiving wrongly formatted messages (syntax) or messages not following the standard protocol sequence (semantics). We do not consider, for example, the problems arising from wrong application design that could lead to vulnerabilities in the absence of external (networked) influences, nor those arising from physical issues. This choice is due, again, to the consideration that in a CII the main problems are related to the interconnection of systems that cannot be judged as reliable in the integrated environment: systems that might be 'secure' as isolated subsystems are very likely to exhibit unexpected behaviours when interconnected. As a simple example, consider two VoIP systems from two different vendors: they might work perfectly fine alone but not when connected together.

Vulnerability analysis and taxonomy

As said above, from an ICT perspective a CII is a multilayer and interconnected system, and the threats of an n-tier application can be analysed at different levels; in the following, a 'multi-level' vulnerability taxonomy is proposed:

(1) Host-level vulnerabilities - at this level the vulnerability categorisation does not address the assets of an information system (IS) as a whole but considers only one node (such as a simple host or a server) at a time. The node threats can be categorised as follows:

(a) As caused by transient code - intentionally or accidentally downloading (or executing) a piece of untrusted code can represent a severe security risk. This is what happens when a virus, malware or spyware is executed on a node; although these aspects are partially out of the scope of this deliverable, all the virus / malware / spyware-related issues can be considered, broadly speaking, as relative to a configuration issue (Sec. 2.2). In a trusted environment no software should be installed or run unless it is classified as secure. Hence, if a user can be attacked by a virus / malware / spyware, it is a configuration issue. In a properly configured system a non-administrator user should not be able to run untrusted software on either a host or a server. In general this last example was the first and most famous non-functional vulnerability highlighted in systems, one that caused a specific information and communication technology (ICT) sector to be developed around it and a specific line of validation to be consolidated in systems design. Nevertheless this is only an example, and with the continuous growth of distributed systems many other similar categories shall be discovered and managed in a standard approach.

(b) As caused by resident code - this class encompasses all the vulnerability sources related to trusted software. For a standard host, as described in Sec. 2.10, one of the most relevant issues is related to the exposure of critical data, such as the unencrypted caching of sensitive information, recording a password key ring in a plain text file, etc. For a server, one of the main vulnerability sources is related to buffer overflow issues; this kind of issue can be caused by incomplete (or inconsistent) input validation (Sec. 2.1) and it belongs to the bigger set of memory management issues (Sec. 2.4).

Both cases can and should be prevented by using proper policies about software testing and system configurations, at least on the critical systems.

(2) Infrastructure-level vulnerabilities - in a CI the information exchanged through the communication network plays a crucial role. For this reason the transmitted data must be encrypted and the sender / receiver authenticated. The NI2S3 deliverable D4.1 has shown how adopting broken cryptographic algorithms, using unreliable pseudorandom generators or relying on improperly configured cryptographic frameworks can represent a severe vulnerability source. Other vulnerability sources related to the infrastructure level do exist but do not directly address a distributed application (e.g. the security problems relative to wireless communication links or to the security of routing protocols); they are out of the scope of this deliverable and will not be addressed.

(3) Application / service-level vulnerabilities - this kind of vulnerability directly addresses a distributed application, considering it as a whole and not as composed of several processes executing in a server / host pool. Also at this level, inconsistent input validation issues such as structured query language (SQL), lightweight directory access protocol (LDAP) and XML path language (XPATH) injections or cross-site scripting (XSS) represent a severe risk for the CII, because they can lead to buffer overflow issues or can reveal sensitive data to a malicious user. Then, as discussed in D4.1, a distributed application should never return to the end user a too detailed error message (including, e.g. the application version, the stack trace, etc.); for this reason the exception handling issues represent another remarkable source of risk. One of the main access control vulnerability issues is relative to session management and how it is handled on the server / client side. Directly connected to the session management issues are the violable prohibition / limit issues because, e.g. an application where sessions never expire after an inactivity period is exposed to session fixation issues. Finally, strictly related to IS subsystems relying on the Web Service paradigm are the vulnerability sources described in Sec. 2.2.2: in this kind of system the service description / discovery issues, jointly with the messaging issues, are remarkable vulnerability sources.
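As an illustration (not taken from the deliverable) of the inconsistent input validation class above and its standard countermeasure, compare a concatenated SQL query with a parameterized one; the table and the injection payload are hypothetical:

```python
import sqlite3

# in-memory toy database standing in for a CII operator registry
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE operators (name TEXT, clearance TEXT)")
conn.execute("INSERT INTO operators VALUES ('alice', 'admin'), ('bob', 'viewer')")

def find_operator_unsafe(name):
    # vulnerable: attacker-controlled input is concatenated into the query
    return conn.execute(
        f"SELECT clearance FROM operators WHERE name = '{name}'").fetchall()

def find_operator_safe(name):
    # parameterized: the driver treats the input strictly as data
    return conn.execute(
        "SELECT clearance FROM operators WHERE name = ?", (name,)).fetchall()

payload = "x' OR '1'='1"
leaked = find_operator_unsafe(payload)   # the classic tautology leaks every row
safe = find_operator_safe(payload)       # the same payload matches nothing
```

The same discipline (treating external input strictly as data, never as code) is what closes the LDAP and XPATH injection variants listed above.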

It is worth pointing out that in the above list the cases of vulnerabilities due to incompatibilities (hidden or explicit) are spread among various classes.

The class of errors originated from incompatibilities can (sometimes) be classified in the 'conformance testing' area; however, we believe that there is a gray area where those issues belong more to vulnerability than to conformance. The difference is whether or not the incompatibility will trigger an unknown and potentially bad behaviour in one system. Assuming that every communication protocol specifies a set of mandatory features and a set of optional features, every networked element should implement all the mandatory ones, and react gracefully to all the optional features that are not implemented. This of course assumes that the protocols are defined in a correct and non-ambiguous way, which is not always true.

The scope of conformance testing is to check that the mandatory features are correctly implemented, but by definition it will not check the optional ones.

The scope of vulnerability testing is to check that the optional features are not harming the system stability (amongst other things).

This point is particularly important for civil CI, as it is almost impossible to ensure that every single component has been verified with conformance testing, and even in this case the multiplicity of vendors and sources makes it impossible to assume that the systems and sub-systems will not exhibit hidden incompatibilities.

As an additional consideration, it is also true that, moving towards the 'internet of things', we are opening to external communications (in the effort to integrate and fuse data at the highest level possible in the monitoring and control of CI) systems originally designed to work in a closed environment (SCADA systems are a relevant example of this). Consider a generic CI (the highway management CI is perfectly fine). The CI will most probably have a set of data centres on an intranet, an extranet and a connection to the normal Internet for communication with users and other institutions. Over the three kinds of networks, some services will be deployed, some using proprietary standards and some using 'open' standards. Nowadays many services are implemented through web services, and among them the SOA paradigm is one of the most common choices. It is not worth discussing here the pros and cons of the simple object access protocol (SOAP), but one of its uses is to build an easier and faster time-to-market solution to integrate different data sources and consumers across an information infrastructure.

When we look at SOAP, however, it is clear that the system reliability can be hindered by a number of vulnerabilities; the most common ones listed in the literature are:

- Payload / content: SQL injection, XPATH injection, XML as carrier of malicious code and content, abnormal size of the content, deep level of XML element nesting.
- Schema poisoning attack: Attempt to compromise the schema in its stored location and replace it with a similar one.
- Infrastructure attack: denial-of-service (DOS) attack, domain name system (DNS) poisoning of the CA used by the SOA infrastructure to validate signatures, internet protocol (IP) attacks.
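A minimal sketch of a screening step against the payload / content class above might look like the following; the limits `MAX_DEPTH` and `MAX_SIZE` are illustrative values, not project figures:

```python
import xml.etree.ElementTree as ET

MAX_DEPTH = 20        # reject deeply nested element trees (illustrative)
MAX_SIZE = 64 * 1024  # reject abnormally large payloads (illustrative)

def depth(elem, level=1):
    """Maximum nesting depth of an element tree."""
    return max([depth(child, level + 1) for child in elem] + [level])

def validate_soap_payload(raw: str):
    """Screen an incoming XML payload before it reaches the service.
    Returns (accepted, reason)."""
    if len(raw.encode()) > MAX_SIZE:
        return False, "abnormal payload size"
    try:
        root = ET.fromstring(raw)
    except ET.ParseError:
        return False, "malformed XML"
    if depth(root) > MAX_DEPTH:
        return False, "excessive element nesting"
    return True, "ok"

# a classic nesting bomb is rejected, a normal envelope passes
bomb = "<a>" * 50 + "x" + "</a>" * 50
bomb_result = validate_soap_payload(bomb)
ok_result = validate_soap_payload("<env><body>ok</body></env>")
```

In a real deployment this kind of check sits on the XML gateway, in front of the service endpoint, together with schema validation against a trusted copy of the schema.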

The first kind of attack is a clear example of the issues arising from flawed implementation and / or faulty code design. VA techniques can and should be used to find and mitigate the vulnerabilities in implementations. This can be done either via software patching (or rejecting the software vendor), or by building a validator protecting the faulty implementation. The second kind of attack is a typical example of an indirect attack. In this case the attacked entity is the storage of the schema, but the final destination of the attack is not the storage itself. Again, VA should have been used to assess the security of the schema repository. The third kind of attack is more a system attack, and it has to be prevented or mitigated during early enterprise architecture framework (EAF) analysis, by building a resilient network or by deploying countermeasure systems able to block such kinds of attacks.

In all three cases, the risk analysis should have been carried out as early as possible in order to determine the importance of those events based on their outcomes. A proper analysis should also have the goal of pointing out whether there are parts of the infrastructure that are more sensitive than others, in order to justify a possible duplication or segregation of some services, so that a fault in one of them would not cause a major breakdown in the whole infrastructure.

About the segregation topic, it is worth mentioning that a common mistake in CI, and in any system in general, is to consider the same level of security to be applied on the whole system. This is certainly not a mistake per se, however it implies that every single part of the system should have and maintain the highest security possible, and this is a rather optimistic assumption.

On the other hand, one of the very first assumptions in any security model is that the endpoints of the communication are not compromised, and the attacks from the local link are always the most dangerous.

Hence, it is of paramount importance not to underestimate the necessity of segregating the services and the networks, either logically or, when needed, even physically. In this case, when one network part has to communicate with another, all the communications will have to pass through well-defined and controllable gateways, so that it will be possible to enforce the needed security at a few points.

Last but not least, it is worth pointing out a small issue that, at the time of writing, is mostly underestimated. IPv4 addresses are no longer available. The next years will see either a massive use of carrier-grade NATs, or the use of IP version 6 (IPv6). In either case the amount and kind of attacks on that part of the infrastructure are not yet well known, as the deployment of such systems is not widespread enough to provide sufficient data. Early reports for IPv6 have shown, however, that the kinds of attacks on the IP infrastructure are not fewer than the ones in IPv4; they are only different. Hence, most probably, the chain's weakest link for this part is not the technical infrastructure itself, but rather the lack of preparation of the network administrators and their inability to fully grasp the consequences of the changes. Even in this case, or better especially in this case, a correct EAF application (CRAF in the NI2S3 case) should be the guideline to fully understand the risks and the need for technical solutions in order to mitigate the risks.

VA methodology breakthroughs

The analysis done so far has pointed out that risk analysis, and in particular VA, are critical points for any CI.

While some parts of the analysis can be done with common or known approaches, we see a huge gap in the VA procedures, which are, in the best cases, fragmented and partial. The major issue is represented by the availability of tools and platforms, their costs and the narrow scope of the available tests. Even the best ones are seldom programmable, requiring huge investments to be adapted to custom protocols. Moreover, most of the known platforms are commercial and closed-source. Hence, it is not possible for the user to add functionalities or customise them.

The proposed NI2S3 methodology for VA reverses the idea that VA tests should be done only on selected and extremely sensitive elements. Rather, VA should be applied to any networked element, in order to reach full knowledge of its vulnerabilities.

The point, again, is different and opposite to the common one.

Usually enterprises define levels of security among their networks, and according to those levels a specific set of tests has to be carried out. Only at the highest levels is VA an integral part of the tests to be passed, while at lower clearance levels only a conformance test might be required. In some cases just an integration test is necessary (e.g. checking whether two elements are able to communicate).

This, in our opinion, is not helping security and risk management at all. It is basically like running a network without knowing what could happen if something slightly unexpected occurred. Even applying failsafe techniques like single points of I/O, SCADA, or web services does not help in a system of increasing complexity. Moreover, the lack of knowledge is usually compensated by restricting usage, or by hindering the system's capabilities. On the contrary, we believe that VA, applied to every system, can enlarge the knowledge of the system's behaviour and help slim down the checks, allowing a faster and more fruitful system design and, globally, simplifying system management.

Toward this end, we do need two main elements: a methodology and a set of tools able to enforce the methodology.

The methodology emphasises the best practice rule: do it early, do it often and do it for all networked systems and elements. The VA platform designed in the NI2S3 project supports the engineers in doing that.

Architectures and technologies in NI2S3


Before presenting the most important aspects of the architectural solution proposed in NI2S3 and the technologies adopted, it is appropriate to mention the challenge that it is intended to address.

Despite the clear NEC decision to adopt SOA as the general middleware for such information systems, there is still a lot of work to do in consolidating this approach; considering this, the project assumes that the control systems of CIs are themselves critical information systems. The concern for the protection of CII is therefore crucial in the solution provided.

NI2S3 basic architecture proposal

The first topic addressed, during the research, was how to respect the following mandatory constraints highlighted in the study of CI (WP2):

(1) integrability;
(2) interoperability;
(3) security and reliability of information;
(4) high availability of the information infrastructure on which CI availability depends;
(5) processing of information in a timely manner (near real time) in order to improve early warning management.

Constraints 1 to 4 have to be addressed taking into account the NEC principle of the 'shared information space', where information should be accessible by participants using services. The term 'participant' means 'any algorithmic actor', regardless of software or hardware platform. As a consequence of that principle, the SOA solution should be software neutral. This goal can be reached by adopting standard and open protocols to distribute information towards participants. Focusing, as a major example, on the SCADA environment and its opening to share information at higher levels, the NI2S3 solution is based on:

(a) OASIS WS notification to distribute data with an improved level of awareness (cognitive domain);
(b) the object linking and embedding for process control - unified architecture (OPC-UA) protocol to gather data coming from sensors (physical domain).

WS notification, based on SOAP protocols, implements the publish / subscribe communication paradigm. The publish / subscribe communication paradigm fulfils the real-time requirement and is widely used in the simulation domain, where real-time communication has to be effective.
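WS notification itself is a SOAP-based OASIS specification; as a language-neutral illustration of the publish / subscribe paradigm it implements, a minimal in-process broker might look like the following (the topic names and message fields are hypothetical):

```python
from collections import defaultdict

class NotificationBroker:
    """Minimal WS-Notification-style broker: producers publish to a topic,
    consumers subscribe with a callback and receive every later message."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # deliver the message to every consumer subscribed to the topic
        for callback in self._subscribers[topic]:
            callback(message)

broker = NotificationBroker()
received = []
broker.subscribe("highway/sensors/temperature", received.append)
broker.publish("highway/sensors/temperature", {"sensor": "T-12", "value": 41.5})
broker.publish("highway/sensors/flow", {"sensor": "F-3", "value": 820})  # no subscriber
```

The decoupling shown here (producers never address consumers directly) is what makes the paradigm software neutral: any 'participant' can join by subscribing to a topic, regardless of platform.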

The choice of the OPC-UA protocol has been proposed in NI2S3 because it fulfils the requirements listed before in the world of SCADA. The current OPC specifications define standard objects, methods and properties for servers of real-time information, like distributed process systems, programmable logic controllers, smart field devices and analysers, in order to communicate the information that such servers contain to devices enabled with standard OLE / COM compliant technologies. The unified architecture (UA) is the next-generation OPC standard that provides a cohesive, secure and reliable cross-platform framework for access to real-time and historical data and events.

Demonstrator architecture elements

The concepts developed in the NI2S3 research have been proven by implementing a prototype of a control and supervision system for a highway.

The SOA architectural pattern has been realised by adopting an open source ESB (ServiceMix 3.5) and implementing the OASIS WS notification protocol. This solution guarantees the compliance with the constraints of interoperability, integration, and platform neutrality with respect to the SW / HW.

The presentation layer is built on the open source framework Liferay for web portal solutions.

The layer of collection and acquisition of data from the sensor network was developed to be fully compliant with the specifications of the open protocol OPC UA.

The control network of sensors (SCADA level) is achieved by supporting the MODBUS protocol.

The entire highway network of sensors has been simulated by referring to market products currently used by the Italian highway 'A22 del Brennero'. The physical behaviour of each sensor has been simulated down to the MODBUS communication level. The simulation platform also offers a programming capability that allows designing the behaviour of each individual sensor, enabling the simulation of critical scenarios.
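As an illustration of the simulated wire level (a sketch, not the project's simulation code), a MODBUS RTU 'read holding registers' request with its standard CRC-16 can be built as follows:

```python
import struct

def modbus_crc(frame: bytes) -> int:
    """Standard MODBUS CRC-16 (polynomial 0xA001, initial value 0xFFFF)."""
    crc = 0xFFFF
    for byte in frame:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xA001
            else:
                crc >>= 1
    return crc

def read_holding_registers(unit: int, address: int, count: int) -> bytes:
    """Build a MODBUS RTU 'read holding registers' (function 0x03) request,
    as a simulated sensor would receive it on the wire."""
    pdu = struct.pack(">BBHH", unit, 0x03, address, count)
    # the CRC is appended least-significant byte first
    return pdu + struct.pack("<H", modbus_crc(pdu))

# read 2 registers starting at address 0 from unit 1
frame = read_holding_registers(unit=1, address=0x0000, count=2)
```

A useful property of this CRC is that recomputing it over the full frame, CRC included, yields zero, which is how a simulated (or real) slave validates an incoming request.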

To secure communications and to fulfil the security requirements identified in D2.2, the NI2S3 demonstrator integrates a couple of XML gateway devices. These devices allow providing several kinds of security policies and support the most important communication standards for network monitoring and network security.

In order to protect the network of the CI information system, the NI2S3 SOA solution implements a mechanism able to notify an alarm when a cyber attack is detected. To improve security in data acquisition, the NI2S3 demonstrator provides a specialised firewall for the widely used SCADA protocol (MODBUS), able to detect cyber attacks and to forward alarms to the network monitoring management station via SNMP traps.
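The project's firewall implementation is not reproduced here; a minimal sketch of the inspection logic such a MODBUS firewall could apply (allow-listing the legitimate master and the read-only function codes, and emitting an SNMP-trap-style alarm otherwise) might look like the following, where the addresses and the allowed codes are hypothetical:

```python
ALLOWED_FUNCTIONS = {0x03, 0x04}   # read-only MODBUS function codes (illustrative)
ALLOWED_MASTERS = {"10.0.0.5"}     # the legitimate SCADA master (illustrative)

def inspect_modbus(src_ip: str, frame: bytes):
    """Return (forward, alarm): whether to let the frame through, and an
    SNMP-trap-style alarm message when it looks like an attack."""
    if src_ip not in ALLOWED_MASTERS:
        return False, f"modbusFirewallAlarm: unknown master {src_ip}"
    if len(frame) < 2:
        return False, "modbusFirewallAlarm: truncated frame"
    function_code = frame[1]
    if function_code not in ALLOWED_FUNCTIONS:
        return False, (f"modbusFirewallAlarm: forbidden function "
                       f"0x{function_code:02x} from {src_ip}")
    return True, None

# a legitimate read request passes, a write attempt is blocked and flagged
ok, alarm = inspect_modbus("10.0.0.5", bytes([0x01, 0x03, 0x00, 0x00, 0x00, 0x02]))
blocked, alarm2 = inspect_modbus("10.0.0.5", bytes([0x01, 0x06, 0x00, 0x10, 0x12, 0x34]))
```

In the demonstrator architecture, the alarm string would be sent as an SNMP trap to the network monitoring management station rather than returned to the caller.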

In this way the demonstrator protects the critical communication boundaries among components running in a distributed environment, including technologies born in a closed environment (such as MODBUS / SCADA), and enforces protection towards the public networks used to transfer critical data. The study does not address the security of the public network itself, which is assumed to be trusted and reliable, possibly subject to a similar approach to its security and availability as a separate system.

Data fusion

The primary objective of 'data fusion' in NI2S3 is to increase awareness of the current state of the CI. It is practically impossible for a human operator to analyse simultaneously all the events captured by the system and to correlate them so as to assess possible alert states in advance.

Data fusion is generally defined as the use of techniques that combine data from multiple sources and gather that information in order to achieve inferences that will be more efficient and potentially more accurate than if they were achieved by means of a single source. An important task of data fusion is also efficient and accurate threat finding. The fusion process can be defined as putting together information obtained from many heterogeneous sensors, on many platforms, into a single composite picture of the environment.

Data fusion is not a goal in itself, but a means to obtain certain behaviour by an agent. In order to plan and execute actions, an intelligent agent must reason about its environment. For this, the agent must have a description of the environment, provided by fusing 'perceptions' from different sensing organs obtained at different times.

From the technological point of view, the ability to anticipate alert states is provided by a correlation engine (CE) capable of processing dozens of events simultaneously and in real time. The research activities carried out in the NI2S3 project led to the selection of new techniques based not on traditional databases, but on new data persistence strategies that can meet the requirement of real-time response. A new technique was also successfully tested that allows multidimensional correlation, evaluating each individual event from the temporal perspective, the spatial perspective and the perspective of its source of origin. In other words, the assessment of the situation involves the temporal dimension (past and present), the geographical dimension (the location of the sensor that captures the information) and the type of measurement acquired.
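The multidimensional criterion described above can be sketched as a predicate over event pairs, scored along the time, space and source dimensions. This is an illustrative sketch under our own assumptions (field names, thresholds and the "complementary sources" criterion are hypothetical, not the project's CE logic):

```python
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: float        # seconds since some epoch
    location: tuple         # (x, y) position of the sensor, arbitrary units
    source: str             # sensor type, e.g. "camera", "rain_gauge"

def correlated(a: Event, b: Event,
               max_dt: float = 60.0, max_dist: float = 500.0) -> bool:
    """Two events are considered correlated when they are close in time
    and space and come from complementary source types."""
    dt = abs(a.timestamp - b.timestamp)
    dist = ((a.location[0] - b.location[0]) ** 2 +
            (a.location[1] - b.location[1]) ** 2) ** 0.5
    return dt <= max_dt and dist <= max_dist and a.source != b.source
```

A real engine would evaluate such predicates continuously over sliding event streams rather than over static pairs.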

In our case we are dealing with data fusion of data coming from the sensors. Sensor fusion is the combining of sensory data, or data derived from sensory data, from disparate sources such that the resulting information is in some sense better than would be possible if these sources were used individually. The term 'better' in this case can mean more accurate, more complete, or more dependable, or refer to the result of an emerging view, such as stereoscopic vision.

The data sources for a fusion process are not required to originate from identical sensors. One can distinguish 'direct fusion', 'indirect fusion' and fusion of the outputs of the former two. Direct fusion is the fusion of sensor data from a set of heterogeneous or homogeneous sensors, soft sensors, and history values of sensor data, while indirect fusion uses information sources such as a priori knowledge about the environment and human input. Neither data nor sensor fusion should be considered a universal method. Data fusion is a good idea, provided the data are of reasonably good quality.

Fusion methods can be based on:

(a) probabilistic and statistical models such as Bayesian reasoning, evidence theory, robust statistics;
(b) least-square (LS) and mean-square (MS) methods such as the Kalman filter (KF), optimisation, or regularisation;
(c) other heuristic methods such as artificial neural networks (ANNs), fuzzy logic, approximate reasoning and computer vision.
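As an illustration of category (b), a scalar Kalman filter fusing repeated noisy measurements of a slowly varying quantity can be sketched as follows. This is a textbook sketch, not project code; the parameter values are arbitrary:

```python
def kalman_1d(measurements, q=1e-3, r=0.25, x0=0.0, p0=1.0):
    """Scalar Kalman filter under a random-walk state model.

    q: process noise variance, r: measurement noise variance,
    x0/p0: initial state estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p += q                       # predict: variance grows by process noise
        k = p / (p + r)              # Kalman gain
        x += k * (z - x)             # update: move towards the measurement
        p *= (1.0 - k)               # updated variance
        estimates.append(x)
    return estimates
```

With repeated measurements of the same value, the estimate converges towards it while the gain settles to a steady state determined by the q/r ratio.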

The algorithms and applications used in the data fusion process have high complexity and rich functionality. The innovative architecture utilised makes it possible to achieve high-performance computing and provides highly efficient correlation. The key achievements of the CE are:

- multi-spectral cooperative segmentation:
(a) images;
(b) algorithmic approach (compressed sensing, qualitative probabilistic networks, Bayesian logic, multi-spectral cooperative segmentation);
(c) mobile agent (network traffic, AI approach);
- compressed sensing (algebraic approach);
- complex event processing (universal approach):
(a) event-pattern detection;
(b) event abstraction;
(c) modelling event hierarchies;
(d) detecting relationships between events (causality, membership, timing);
(e) abstracting event-driven processes;
(f) business activity monitoring.

Data fusion engine concept

The entire logical cycle of data fusion, whose architecture has been shown above, can be reduced to five essential parts: pre-processing, classification, normalisation, aggregation and correlation.

Let us now look at how the data fusion phases are practically implemented in the structure of our CE. Two auxiliary phases are involved during the process: data enrichment and notification. The architecture of the whole data fusion process is shown in Figure 6, and its stages are briefly described below.

(1) Pre-processing:
Event decomposition tree:
(a) exposes distinct paths between different event details within the same event type;
(b) minimal tree walk path to process the whole information;
(c) reduces processing time from O(n) to O(log(n)).
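The decomposition-tree idea can be sketched as a nested lookup: classifying an event walks a single path of attribute decisions instead of scanning every rule. Attribute names and classes below are hypothetical, chosen only to illustrate the mechanism:

```python
# Hypothetical decomposition tree: level 1 keyed on the event's domain,
# level 2 on the sensor type. Walking the tree touches one node per
# level, rather than testing every (domain, sensor) rule in turn.
TREE = {
    "weather": {"rain_gauge": "precipitation", "thermometer": "temperature"},
    "traffic": {"loop": "vehicle_count", "camera": "plate_read"},
}

def classify(event: dict) -> str:
    """Follow one root-to-leaf path and return the event class."""
    return TREE[event["domain"]][event["sensor"]]
```

With balanced branching, the cost of a lookup grows with the tree depth rather than with the total number of rules, which is the intuition behind the O(log(n)) claim above.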

(2) Classification:
(a) trivial (source type (rule) -> event class);
(b) non-trivial (multiple fields (rule) -> event class).

(3) Data enrichment:
(a) depends on the event source type;
(b) extends the event map with new fields carrying human-readable explanations from external knowledge sources;
(c) does not modify existing fields;
(d) enrichment mechanisms:
(i) dynamic data dictionaries (data gathered in real time from external sources, e.g. DNS resolution);
(ii) static data dictionaries (e.g. cached data from external LDAP directories or databases).

(4) Normalisation:
(a) event class;
(b) a generalised structure holding data about distinct event types;
(c) each class has a different but fixed field structure;
(d) classes are independent of event sources (e.g. two authentication events from different sources with different fields will have the same representation in the authentication class);
(e) events from different sources may belong to the same class;
(f) events from one source may belong to many classes;
(g) each event is unified into one or more event classes;
(h) normalisation is based on three attribute types:
(i) data source type;
(ii) pre-processing identifier;
(iii) destination event class.
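The normalisation property in point (d) above can be sketched as a per-source field mapping onto a shared event class. Source names, field names and the "auth" class are hypothetical, chosen only to illustrate the mechanism:

```python
# Hypothetical normalisation rules: (source type, event class) -> field map.
# Two sources report authentication events with different field names,
# but both normalise to the same "auth" class representation.
NORMALISATION_RULES = {
    ("ssh_log", "auth"): {"user": "login", "src_ip": "host"},
    ("web_portal", "auth"): {"username": "login", "client": "host"},
}

def normalise(source_type: str, event_class: str, event: dict) -> dict:
    """Map source-specific fields onto the class's fixed field structure."""
    mapping = NORMALISATION_RULES[(source_type, event_class)]
    out = {"class": event_class}
    for src_field, dst_field in mapping.items():
        out[dst_field] = event.get(src_field)
    return out
```

After this step, downstream aggregation and correlation rules can be written once per class instead of once per source.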

(5) Aggregation:
(a) aggregation types:
(i) match (event triggered by a rule condition);
(ii) threshold (event triggered when the count of events meets a threshold within a sliding window);
(iii) collector (event triggered at every time period);
(b) aggregation rules should contain:
(i) the aggregation sliding-window period;
(ii) aggregation attributes;
(iii) grouping attributes;
(iv) the aggregated-event creation mode;
(v) the aggregation functions which might be used: first(), last(), concatenate(), distinct(), count(), min(), max(), avg().
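The threshold aggregation type can be sketched as follows. The class and field names are ours, and the reset-on-trigger behaviour is one plausible creation mode, not necessarily the one used by the CE:

```python
from collections import deque

class ThresholdAggregator:
    """Emit an aggregated event when `threshold` events of the same group
    arrive within a sliding window of `window` seconds (sketch)."""

    def __init__(self, window: float, threshold: int):
        self.window, self.threshold = window, threshold
        self.buffers = {}            # group key -> timestamps in the window

    def feed(self, group_key, timestamp: float):
        buf = self.buffers.setdefault(group_key, deque())
        buf.append(timestamp)
        while buf and timestamp - buf[0] > self.window:
            buf.popleft()            # slide the window forward
        if len(buf) >= self.threshold:
            count = len(buf)
            buf.clear()              # triggered: restart the window
            return {"group": group_key, "count": count, "at": timestamp}
        return None
```

The grouping attribute here is a single key; a full rule would also name the aggregation attributes and functions listed above.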

(6) Correlation:
(a) aggregated events;
(b) aggregated streams;
(c) normalised events;
(d) normalised streams.

(7) Notifications:
(a) notification is the process of passing internal events from the event processing engine to the outside;
(b) each notification is defined by a notification rule that contains:
(i) filters defined on event fields;
(ii) the delivery notification (notification service) and its configuration;
(iii) the defined message content: message text, topic, priority and receivers;
(iv) notification services:
- simple mail transfer protocol (SMTP) service,
- database service: stores messages in a database,
- administrator console user: contains receiver, topic and message text,
- ESB: contains the topic where the notification should be published,
- JMS topic publisher: essentially the same as ESB.
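A notification rule combining a field filter, a delivery-service name and a message template, as described above, can be sketched like this (all names, fields and the `fire_alarm` example are hypothetical):

```python
def make_rule(filters: dict, service: str, topic: str, template: str):
    """Build a notification rule: a filter on event fields plus the
    delivery service and message content to use when it matches."""
    def rule(event: dict):
        if all(event.get(k) == v for k, v in filters.items()):
            return {"service": service, "topic": topic,
                    "message": template.format(**event)}
        return None                  # event filtered out, no notification
    return rule

# Example rule: high-severity fire alarms go out via the SMTP service.
fire_rule = make_rule({"class": "fire_alarm", "severity": "high"},
                      service="smtp", topic="CI alerts",
                      template="Fire alarm at sensor {sensor}")
```

The returned dictionary is what a dispatcher would hand to the chosen service (SMTP, database, console, ESB or JMS publisher).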

Rule-driven approach

The Rete algorithm is used to implement effective data fusion. Rete is an efficient pattern-matching algorithm for implementing production rule systems. The system builds a network of nodes. The structure of the algorithm is composed of several key elements: the root node (system event input), object type nodes (the first processing nodes; events can propagate from here to alpha nodes, a left input node adapter or beta nodes), alpha nodes (used to evaluate literal conditions, without internal memory), beta nodes (comparators: JOIN and NOT operations on whole events, with internal memory) and input adapters (which can be external event inputs). The figure below shows the operating principle of the Rete-based algorithm.

The algorithm has the following features:
- Each node corresponds to a pattern occurring in the left-hand side (the condition part) of a rule.
- The path from the root node to a leaf node defines a complete rule left-hand side.
- Each node has a memory of facts which satisfy its pattern.
- Asserted or modified facts propagate along the network, causing nodes to be annotated when a fact matches their pattern.
- When a fact or combination of facts causes all of the patterns of a given rule to be satisfied, a leaf node is reached and the corresponding rule is triggered.

The Rete algorithm is designed to sacrifice memory for increased speed. An important feature of the algorithm is the fact that its performance is essentially independent of the number of rules in the system.
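The memory-for-speed trade-off can be seen in a deliberately minimal Rete-style sketch: alpha nodes filter single facts and retain matches, and a beta join combines two alpha memories, so facts asserted later reuse earlier partial matches instead of re-testing everything. The fact fields and rule conditions below are hypothetical examples in the highway domain:

```python
class AlphaNode:
    """Evaluate a literal condition on single facts, remembering matches."""
    def __init__(self, test):
        self.test, self.memory = test, []

    def assert_fact(self, fact):
        if self.test(fact):
            self.memory.append(fact)

def beta_join(left: AlphaNode, right: AlphaNode, key: str):
    """JOIN two alpha memories on a shared attribute (beta-node behaviour)."""
    return [(l, r) for l in left.memory for r in right.memory
            if l.get(key) == r.get(key)]

# Rule LHS: heavy rain AND heavy traffic on the same highway segment.
rain = AlphaNode(lambda f: f.get("type") == "rain" and f.get("mm") > 10)
jam = AlphaNode(lambda f: f.get("type") == "traffic" and f.get("cars") > 100)

for fact in ({"type": "rain", "mm": 15, "segment": "A22-km40"},
             {"type": "traffic", "cars": 180, "segment": "A22-km40"}):
    rain.assert_fact(fact)
    jam.assert_fact(fact)

matches = beta_join(rain, jam, "segment")   # the rule fires for km 40
```

The per-node memories are exactly where the algorithm spends space to avoid re-evaluating conditions, which is why performance stays largely independent of the rule count.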

In our case, correlation relates to a CI in the field of transport, so we are dealing with road traffic. Correlation rules thus combine, for example, the forecast amount of rainfall and the number of cars passing per unit time (a priori knowledge and present observations). A diagram can represent the dependencies between the various phases of the data fusion process and the relevant correlation rules.

Potential impact:

The scope of the research mainly focuses on a general support of ICT to new models of command and control scenarios, which extend the area of interest from pure physical and information control to a cognitive level of application.

This is in line with the evolution of such systems and with their application to new security and protection scenarios, which on one side will extend their capabilities to a wider field of applications (at present these technologies are mostly advanced in the military environment) and on the other will fully exploit the functionalities and capabilities of distributed systems integrated by a wide network.

Today, command and control technologies are mainly adopted in closed and separate environments. For example, energy stations (in an enterprise, or in a service environment such as an airport) are controlled in their functionality and able to switch power sources in response to external failures or internal problems, but they are not able to communicate data and collaborate with different layers or applications so as to contribute to real-time planning of energy distribution, or even to perform adaptive energy pricing.

Another example, from homeland security, is that today it is possible to deploy surveillance systems to monitor sensitive spaces, but this is still mainly done by human monitoring, without correlating events coming from different sources (e.g. an acoustic or chemical sensor combined with image-based detection).

Every system works well in its own environment, but the related scenarios cannot be extended simply by directly controlling each separate system and relying on human operators to aggregate information and set corrective or proactive actions.

The technological evolution of these environments is ready to switch from a passive rendering of data in closed environments, each dealing with its own protocols, data and monitoring features, to a more effective collaboration and integration at a higher level, where data can be translated into objects and facts, generating events and even suggestions for action. The 'Internet of things' paradigm is made possible by the wide communication capabilities now available (wireless or wired, over several media), by increasing distributed elaboration capabilities and by the availability of new protocols (e.g. IPv6). This makes it possible to connect disparate elements in the wide network with a unique language, and to extract from them data ready to be categorised, analysed, aggregated and automatically elaborated in space-temporal congruence.

The effort to enable the new scenario focuses on one side on protocols and communications, and on the other side on aggregation and analysis capabilities to be performed centrally.

The overall outcome of this effort could be applied to several environments, all of them commonly understood as critical and in need of improvement: first of all homeland security, energy and the environment, which are the main directions in which social scenarios are presently focusing in order to obtain a more efficient (and therefore cheaper) and sustainable set of services.

The present research focuses on identifying the best methodologies to approach and design these types of systems (in a wide sense), and on what should be considered the attention points and the base technologies in deploying such systems in a real ICT implementation.

The outcomes of this project represent a contribution to the above scenarios and have already been disseminated.

Exploitation is strictly linked to the evolution of the big picture in service availability mentioned above. The proposed CRAF model can be a reference for architecting and designing such systems, and can be immediately applied to feasibility studies in innovative sectors such as advanced security in cities and the environment, and to participation in the overall energy distribution model derived from smart grid technology and its hierarchical implementation down to the micro-grid level.

This also drives the interest of the participants in the consortium in the outcomes of the project, to consolidate their experience and be ready to play a significant role in the mentioned sectors.

The socio-economic background has been widely described and analysed for the named sectors, which are also recognised as priorities in European Union (EU) programmes (FP7).

The aim of the project is not to provide a specific solution for a sector or to build an application, but to model a framework, in terms of methodologies and technologies, to be reused in commercial projects or in other research programmes thematically defined on specific subjects.

List of websites:

Contact details: Project Coordinator: Walter Matta
Phone: +39-068-8202567
Mobile: +39-335-7716488
Fax: +39-068-8202288