CORDIS - EU research results

Sharing and Automation for Privacy Preserving Attack Neutralization

Periodic Reporting for period 2 - SAPPAN (Sharing and Automation for Privacy Preserving Attack Neutralization)

Reporting period: 2020-11-01 to 2022-04-30

Many European organisations struggle to handle modern cyberattacks effectively because they lack solutions for accurate detection and timely response. The overall objective of SAPPAN is to enable organisations to protect their ICT infrastructures against cyberattacks by improving:
1. Local and federated threat detection
2. Privacy-preserving threat intelligence sharing and response automation
3. Interpretability of threat intelligence with advanced visualisation

Different end-users, such as human analysts in Security Operation Centres (SOCs), will benefit from SAPPAN innovations. The SAPPAN project has developed and demonstrated several high-impact results, some of which have been integrated into commercial partners' products. In addition, SAPPAN's collaborative threat intelligence sharing approach and its open-source results can be adapted for further exploitation in the SOC solutions of European public institutions, SMEs, and companies to improve organisation-specific incident management capabilities.
In the early stages of the project, the consortium identified high-level response and recovery use cases, from which requirements for data sharing, privacy and visualisation were derived. These insights were used to define the SAPPAN architecture and an evaluation methodology for the project.

Based on this groundwork, the following work has been performed:

1. Local attack detection:
- Research on a fast and scalable data processing pipeline
- Proof of concept implementation for syslog and netflow processing
- Creation of two publicly available data sets
- Research on high-precision deep learning classifiers for DGA (domain generation algorithm) and phishing detection
- Development of an approach for large-scale endpoint behaviour profiling, analysis and anomaly detection based on network, endpoint and application behaviour
- Proposal of approaches to combine network and endpoint data
- Graph-based improvement of network forensic approaches
- Development of a privacy-preserving approach to URL abstraction
- Development of visualisation tools for data analysis and for the development of attack detection algorithms
 
2. Management and automation of threat intelligence: 
- Creation of a formal methodology and vocabulary for modelling cybersecurity playbooks
- Development of a tool to capture, edit and visualise these playbooks
- Development of incident similarity and clustering methods for security incidents
- Development of a contextual attack chain modelling method to predict future adversarial actions
- Development of a mechanism for automated recommendation for DDoS mitigation rules
- Proof of concept implementation of automated response to DGA incidents
- Development of a malware analysis platform
- Integration of analytical provenance tracking and analysis into the SAPPAN dashboard

3. Federated threat detection and response:
- Comprehensive study on collaboratively trained models
- Implementation of collaborative learning over MISP
- Demonstration of its feasibility and impact on false positive reduction
- Collaboration with the technical committee of CACAO on a standardised playbook format in MISP
- Development of a visual analysis system for federated learning

A reconfigurable card-based dashboard using Elasticsearch has been designed and implemented to support SOC interaction with the SAPPAN tools. The different tools have further been integrated into the SAPPAN demonstrator to enable evaluation and assessment by end-users. The SAPPAN results have been evaluated in relevant environments by measuring both technical and user-centric KPIs.

The project resulted in over 20 peer-reviewed scientific papers with presentations at various events. The consortium also co-hosted the ARES NG-SOC workshops and hosted the SAPPAN final event. A total of 34 exploitable results have been generated during the project.
Local anomaly and intrusion detection:
Research has been performed within four use cases: DGA detection, phishing detection, host and application profiling, and anomaly detection. For DGA and phishing detection, training deep neural networks (DNNs) on domains and full URLs has been evaluated and yielded higher precision than contemporary approaches. In the case of DGA detection, this even made it possible to identify the malware family that produced a given algorithmically generated domain. This effort has been extended by a visualisation tool that assists in the development and training of DNNs by providing explanations for processes within the DNN. Further, a tool for large-scale behavioural host profiling has been developed, including a tool for visual exploration of these profiles. In the context of endpoint security, a novel detection model for the identification of malicious macros has been developed. Last but not least, a novel approach for network forensics has been explored.
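As an illustration of how a domain name can be turned into input for a character-level DNN, the following minimal sketch maps each character to an integer index and pads the result to a fixed length. The alphabet, the index layout and the function name `encode_domain` are assumptions made for this example; the report does not describe the encoding or the network architecture used in SAPPAN.

```python
import string
from typing import List

# Assumed alphabet for domain names; index 0 is padding, index 1 is "unknown".
ALPHABET = string.ascii_lowercase + string.digits + "-._"
PAD, UNKNOWN = 0, 1
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(ALPHABET)}


def encode_domain(domain: str, max_len: int = 16) -> List[int]:
    """Encode a domain as a fixed-length list of character indices.

    Hypothetical helper: lowercases the domain, truncates it to max_len
    characters, maps unseen characters to UNKNOWN, and right-pads with PAD.
    """
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN) for ch in domain.lower()[:max_len]]
    return indices + [PAD] * (max_len - len(indices))
```

A classifier would then consume these sequences, typically through an embedding layer; that part is left out here because it depends on the chosen deep learning framework.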

Privacy preserving anomaly and intrusion detection:
An approach for URL sanitisation has been developed and evaluated with regard to its impact on the predictions of DGA and phishing detection classifiers. In the context of federated threat detection, the privacy threats implied by various collaborative machine learning approaches have been analysed. Further, the risk of exposing locally private data sets or models has been investigated using recent inference attacks.
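The idea of URL sanitisation can be sketched as follows: potentially sensitive path segments and query values are replaced by coarse type placeholders, while the host and the overall URL structure are kept for the classifier. This is only an illustrative sketch; the placeholder scheme and the names `sanitise_url` and `_abstract_token` are assumptions, and the report does not specify which parts of a URL the SAPPAN approach abstracts.

```python
import re
from urllib.parse import parse_qsl, urlsplit


def _abstract_token(token: str) -> str:
    """Replace a token with a coarse type placeholder (hypothetical scheme)."""
    if not token:
        return ""
    if token.isdigit():
        return "<num>"
    if re.fullmatch(r"[0-9a-fA-F]{16,}", token):
        return "<hex>"
    if "@" in token:
        return "<email>"
    return "<str>"


def sanitise_url(url: str) -> str:
    """Keep scheme, host and query keys; abstract path segments and query values."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    segments = [_abstract_token(s) for s in parts.path.split("/") if s]
    path = "/" + "/".join(segments) if segments else "/"
    params = parse_qsl(parts.query, keep_blank_values=True)
    query = "&".join(f"{key}={_abstract_token(value)}" for key, value in params)
    result = f"{parts.scheme}://{host}{path}"
    return f"{result}?{query}" if query else result


print(sanitise_url("https://example.com/users/12345/profile?email=a@b.org"))
```

Whether such an abstraction is acceptable depends on how much it changes classifier predictions, which is the kind of impact the evaluation described above measures.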

Federated threat detection:
A comprehensive study on collaboratively trained models based on different types of sharing has been conducted, and the related privacy threats have been analysed. Collaborative machine learning over MISP has been implemented and its feasibility has been demonstrated: false positive rates were reduced by up to 51.7% and detection performance improved.
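One common way to combine locally trained models without exchanging raw data is federated averaging, where each participant contributes its model weights and the number of training samples behind them. The sketch below shows that aggregation step only. It is an assumption that this scheme resembles what was used; the report does not state which aggregation method or which sharing types were applied over MISP, and the names `federated_average` and `Weights` are hypothetical.

```python
from typing import Dict, List, Tuple

# A model is represented here as named lists of floats (hypothetical layout).
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average model weights, weighting each participant by its sample count."""
    if not updates:
        raise ValueError("at least one update is required")
    total = sum(count for _, count in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    reference = updates[0][0]
    averaged: Weights = {}
    for name, values in reference.items():
        averaged[name] = [
            sum(weights[name][i] * count for weights, count in updates) / total
            for i in range(len(values))
        ]
    return averaged
```

For example, two participants with 100 and 300 samples and weights `[1.0, 2.0]` and `[3.0, 6.0]` for the same layer would yield the weighted mean of both weight vectors.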

Automating and sharing incident response handling information:
Based on incident similarity functions defined in the project, a method for clustering has been developed for security incidents detected at endpoints. This method is already in use by FSC's SOC and appreciated by security analysts. It is used for guiding the analysts in selecting incident response actions, for improving and fine-tuning attack detection engines, and for examining the evolution of incident clusters over time. Further, its use for automating incident response handling has been considered. The approach is expected to generalise to various systems dealing with security incidents.
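To make the idea of similarity-based incident clustering concrete, the following sketch represents each incident as a set of observed attributes, uses Jaccard similarity as the similarity function, and groups incidents with a simple greedy threshold rule. Both the similarity function and the clustering rule are stand-ins chosen for brevity; the report does not describe the similarity functions or the clustering algorithm defined in SAPPAN, and all names and attribute values here are hypothetical.

```python
from typing import Dict, FrozenSet, List


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two attribute sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_incidents(
    incidents: Dict[str, FrozenSet[str]], threshold: float = 0.5
) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose first member is similar enough."""
    clusters: List[List[str]] = []
    for incident_id, features in incidents.items():
        for cluster in clusters:
            leader = incidents[cluster[0]]
            if jaccard(features, leader) >= threshold:
                cluster.append(incident_id)
                break
        else:
            # No existing cluster matched, so this incident starts a new one.
            clusters.append([incident_id])
    return clusters


incidents = {
    "inc-1": frozenset({"powershell.exe", "encoded_command", "outbound_http"}),
    "inc-2": frozenset({"powershell.exe", "encoded_command", "registry_write"}),
    "inc-3": frozenset({"winword.exe", "macro_exec"}),
}
print(cluster_incidents(incidents))
```

An analyst could then look at the response actions taken for earlier incidents in the same cluster when deciding how to handle a new one, which is the guiding use described above.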
For sharing incident handling information, a formal representation of the actions involved is crucial. SAPPAN cooperated with the Technical Committee of the OASIS standardisation effort CACAO and introduced a MISP data model for cybersecurity playbooks following the CACAO specification. The model is publicly available in the official MISP repository to enable sharing of incident response handling in the cybersecurity community.

Visualisation:
Two visual analytics tools for better interpretability of deep learning models have been implemented, one of which is intended for application in federated learning scenarios. Further, an online study that included the collection of interaction data has been conducted to assess differences between users of a complex visual analytics system for the evaluation of deep learning models. Interpretability through visualisation tools, and the way users interact with them, is a crucial research topic in a world looking to use artificial intelligence models across different application domains.