Local anomaly and intrusion detection:
Research has been performed within four use cases: DGA detection, phishing detection, host and application profiling, and anomaly detection. For DGA and phishing detection, training deep neural networks (DNNs) on domains and full URLs has been evaluated, leading to higher precision than contemporary approaches. In the case of DGA detection, this even made it possible to identify the malware family that generated a given algorithmically generated domain. This effort has been extended by a visualisation tool that assists in the development and training of DNNs by providing explanations for processes within the DNN. Further, a tool for large-scale behavioural host profiling has been developed, including a tool for visual exploration of these profiles. In the context of endpoint security, a novel detection model for the identification of malicious macros has been developed. Last but not least, a novel approach for network forensics has been explored.
Privacy preserving anomaly and intrusion detection:
An approach for URL sanitisation has been developed and evaluated with regard to its impact on the predictions of DGA and phishing detection classifiers. In the context of federated threat detection, the privacy threats implied by various collaborative machine learning approaches have been analysed. Further, the risk of exposing locally held private data sets or models has been investigated using recent inference attacks.
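To illustrate what URL sanitisation can look like in practice, the following is a minimal sketch and not the approach developed in the project, whose details are not given here. It assumes one possible design: user credentials and fragments are dropped, while path segments and query values are replaced by salted, truncated hashes so that the structure of the URL is preserved for a classifier. The function names `pseudonymise` and `sanitise_url`, the salt value, and the digest length are hypothetical.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def pseudonymise(token: str, salt: str, length: int = 8) -> str:
    """Replace a token with a salted, truncated SHA-256 hex digest."""
    digest = hashlib.sha256((salt + token).encode("utf-8")).hexdigest()
    return digest[:length]


def sanitise_url(url: str, salt: str = "local-secret") -> str:
    """Hypothetical sanitiser: keeps scheme, host and port, hides the rest."""
    parts = urlsplit(url)

    # Drop any user:password@ prefix, keep only host and optional port.
    host = parts.hostname or ""
    netloc = host if parts.port is None else f"{host}:{parts.port}"

    # Pseudonymise each non-empty path segment, keeping the slash structure.
    segments = [pseudonymise(s, salt) if s else "" for s in parts.path.split("/")]
    path = "/".join(segments)

    # Keep query parameter names, pseudonymise their values.
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    query = urlencode([(key, pseudonymise(value, salt)) for key, value in pairs])

    # The fragment is dropped entirely.
    return urlunsplit((parts.scheme, netloc, path, query, ""))


if __name__ == "__main__":
    print(sanitise_url("https://alice:pw@example.com:8443/users/alice/profile?id=42#top"))
```

Because the hash is deterministic for a given salt, identical tokens map to identical placeholders, which keeps repeated patterns visible to a model while hiding the original values. Whether such a transformation degrades classifier predictions is exactly the kind of question the evaluation above addresses.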
Federated threat detection:
A comprehensive study of collaboratively trained models based on different types of sharing has been conducted, and the related privacy threats have been analysed. Collaborative machine learning over MISP has been implemented and its feasibility demonstrated, reducing false positive rates by up to 51.7% and improving detection performance.
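As an illustration of one common type of sharing in collaborative learning, the sketch below averages model parameters from several participants, weighted by their local data set sizes (often called federated averaging). This is an assumption chosen for illustration: the text above does not state which sharing schemes were studied or how models were combined over MISP. The function name `federated_average` and the flat list representation of model parameters are hypothetical.

```python
from typing import List, Sequence


def federated_average(
    weights: Sequence[Sequence[float]], sizes: Sequence[int]
) -> List[float]:
    """Average parameter vectors, weighting each by its local sample count."""
    if not weights or len(weights) != len(sizes):
        raise ValueError("need one sample count per parameter vector")
    total = sum(sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(weights[0])
    if any(len(w) != dim for w in weights):
        raise ValueError("all parameter vectors must have the same length")

    # Weighted mean per parameter position.
    return [
        sum(w[i] * n for w, n in zip(weights, sizes)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Two participants: the second holds three times as much local data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In this scheme only parameters leave each participant, not raw data, which is why the privacy analysis of what shared models reveal is relevant.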
Automating and sharing incident response handling information:
Based on incident similarity functions defined in the project, a method for clustering security incidents detected at endpoints has been developed. This method is already in use by FSC's SOC and is appreciated by security analysts. It is used for guiding analysts in selecting incident response actions, for improving and fine-tuning attack detection engines, and for examining the evolution of incident clusters over time. Further, its use for automating incident response handling has been considered. The approach is expected to generalise to various systems dealing with security incidents.
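The general idea of similarity-based incident clustering can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used in the project: incidents are represented as sets of observed features, Jaccard similarity stands in for the project's incident similarity functions, and incidents are grouped whenever their similarity reaches a threshold (single-linkage grouping via union-find). All names, feature labels, and the threshold value are hypothetical.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two feature sets (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_incidents(
    incidents: Dict[str, Set[str]], threshold: float = 0.5
) -> List[List[str]]:
    """Group incidents whose pairwise similarity reaches the threshold."""
    ids = sorted(incidents)
    parent = {i: i for i in ids}

    def find(x: str) -> str:
        # Follow parent links to the cluster representative, compressing paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every sufficiently similar pair.
    for idx, a in enumerate(ids):
        for b in ids[idx + 1:]:
            if jaccard(incidents[a], incidents[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    example = {
        "inc-1": {"powershell.exe", "encoded_command", "outbound_http"},
        "inc-2": {"powershell.exe", "encoded_command", "registry_write"},
        "inc-3": {"winword.exe", "macro_exec"},
    }
    print(cluster_incidents(example))
```

In a setting like this, an analyst handling a new incident could look up the response actions previously taken for other incidents in the same cluster.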
For sharing incident handling information, a formal representation of the respective actions is crucial. SAPPAN cooperated with the Technical Committee of the OASIS standardisation effort CACAO and introduced a MISP data model for cybersecurity playbooks that follows the CACAO specification. The model is publicly available in the official MISP repository to enable sharing of incident response handling information in the cybersecurity community.
Visualisation:
Two visual analytics tools for better interpretability of deep learning models have been implemented, one of which is intended for application in federated learning scenarios. Further, an online study has been conducted, including the collection of interaction data, to assess differences between users of a complex visual analytics system for the evaluation of deep learning models. Interpretability via visualisation tools, and user interaction with them, is a crucial research topic in a world looking to utilise artificial intelligence models across different application domains.