
Scalable privacy preserving intelligence analysis for resolving identities

Periodic Reporting for period 1 - SPIRIT (Scalable privacy preserving intelligence analysis for resolving identities)

Reporting period: 2018-08-01 to 2019-07-31

• The problem
Determining identity is one of the key principles in any law enforcement agency (LEA) investigation concerning crime, disorder or terrorism. A range of intelligence sources can be used to answer identity-based investigative questions. In SPIRIT we seek to establish an identity from focussed investigative questions that address:
1. Identity from a name
2. Identity from a physical attribute
3. Confirming an identity overtly and covertly
4. Alias and ‘true’ identity
5. Evidence, contributing to the wider intelligence framework
LEAs often start their investigation with a name, which seems all well and good, but depending on how the name is spelt, there can be numerous permutations. For example, Caitlyn Taylor could appear in 64 ways, and this assumes that there are no spelling errors. With less familiar foreign names, the risk of error increases significantly.
Offenders generally leave a trace, whether that is CCTV imagery, voice recordings or something as simple as a shoe print. The difficulty is using that trace to compare against others to identify a common link and, from that, hopefully an identity. Most of this work is carried out manually.
Once LEAs establish a potential identity, the next hurdle is confirming it. In overt investigations, LEAs can take fingerprints and DNA that positively link a subject to an identified profile. The difficulty increases in covert investigations, where activity often serves only to increase confidence in an identity rather than to confirm it. However, what is someone's true identity: the one we know, the one they were born with, the one they identify with, or the last one they gave us? This question is a fundamental issue behind SPIRIT.
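The way spelling variants multiply can be illustrated with a minimal sketch. The variant lists and the function name `name_permutations` are hypothetical, chosen for illustration only; the source does not state how the figure of 64 was derived, and the sketch does not attempt to reproduce it.

```python
from itertools import product

# Hypothetical spelling variants (illustrative assumptions,
# not data taken from the SPIRIT project).
FIRST_NAME_VARIANTS = [
    "Caitlyn", "Caitlin", "Kaitlyn", "Kaitlin",
    "Katelyn", "Katelin", "Catelyn", "Catelin",
]
SURNAME_VARIANTS = ["Taylor", "Tayler", "Tailor", "Tailer"]


def name_permutations(first_names, surnames):
    """Return every first-name/surname pairing as a full-name string."""
    return [f"{first} {last}" for first, last in product(first_names, surnames)]


variants = name_permutations(FIRST_NAME_VARIANTS, SURNAME_VARIANTS)
print(len(variants))  # 8 first names x 4 surnames = 32 candidate spellings
```

Because the count is the product of the variants of each name part, adding a middle name or a transliterated form enlarges the search space quickly.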
• Society
Local community safety is affected by regional, national and international factors, and as such it is right and appropriate that we utilise relevant information sharing agreements and protocols. The question is: are we sharing the right identity information?
To understand the need for the SPIRIT tools, it is important to appreciate the complexity of the current law enforcement mission.
Law Enforcement complexity:
• Globalisation
• Advancing Technology
• Altered Identities
• Increasing Demand
• Limited Resources

• Objectives
Through our end user cases we established a set of criteria that the development team needed to deliver to ensure that the SPIRIT tools added value to LEAs' work.
The tools had to be simple and intuitive to use, as this reduces training abstraction costs.
The tools have to comply with the highest ethical standards.
LEAs didn’t want to duplicate effort, so the tool had to be able to draw on existing intelligence search tools.
Reliable & Timely: The system needs to be robust and reliable, as there may not be any post-delivery support. It also needs to return results quickly if it is to be of value to operational teams.
Auditable: As part of the ethics and integrity build, the system must have an audit trail to ensure that it is not being misused.
Secure: Data protection, integrity and security are imperative to law enforcement; the system therefore has to comply with high security standards.
At milestone MS1: SPIRIT has delivered analytic end user requirements and use case specifications, extracted from thorough analysis tasks carried out as part of WP2 activities. This effort is described in the ‘Requirements Analysis’ document (D7). Based on this work, SPIRIT further delivered the ‘System Design’ document (D10), containing the detailed set of functional and non-functional requirement specifications for the first SPIRIT prototype demonstrator. This work has advanced a universal use case scenario, endorsed by SPIRIT LEAs, which made it possible to set an evaluation metric for the first year SPIRIT prototype. The consortium thus successfully met MS1.
At milestone MS2: SPIRIT reports the successful implementation of the first year prototype, tested both in the lab and at a specially designed Training and Evaluation session during the annual project plenary meeting. At M12 this prototype has been deployed on the defined and supplied SPIRIT infrastructure equipment and can be exhibited during the first project review.
The definition of the SPIRIT investigation job produced a system architecture that involves the concept of ‘multi-purpose semantic crawling’. During Y1 a baseline set of crawling features has been designed and implemented in line with the Y1 Actions plan.
SPIRIT has defined the baseline for developing a graph-based evidential structuring process. During Y1, graph-based storage and storage mediation services have been defined and prototyped, both for Policing Data visual analytics (SPIRIT graph visualization rapid prototype) and for visual analytics of dynamic surface web investigation jobs.
SPIRIT has defined a set of fundamental job investigation services that will allow graphs to be combined, during the next project steps, in a knowledge-rich, scalable, privacy-preserving and effective manner in line with end user expectations.

A preliminary identity model has been designed, together with the relevant attributes to be considered further during the identity resolution process. A first version has been discussed and finalised.

A conceptual model has been created and presented at project meetings, representing physical, virtual and formal identity attributes.
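
A conceptual model of this kind could be represented along the following lines. This is a minimal sketch only: the class name `Identity`, the example attribute keys and the `shared_attributes` helper are assumptions made for illustration, and the source specifies only the three attribute categories (physical, virtual and formal).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Identity:
    """Hypothetical container for the three attribute categories."""
    physical: Dict[str, str] = field(default_factory=dict)  # e.g. height, eye colour
    virtual: Dict[str, str] = field(default_factory=dict)   # e.g. online handle
    formal: Dict[str, str] = field(default_factory=dict)    # e.g. registered name

    def shared_attributes(self, other: "Identity") -> List[Tuple[str, str]]:
        """List (category, key) pairs whose values are equal in both records."""
        matches = []
        for category in ("physical", "virtual", "formal"):
            mine = getattr(self, category)
            theirs = getattr(other, category)
            for key, value in mine.items():
                if theirs.get(key) == value:
                    matches.append((category, key))
        return matches


# Usage with made-up records: only the formal name agrees.
record_a = Identity(formal={"name": "Caitlyn Taylor"}, virtual={"handle": "ct_01"})
record_b = Identity(formal={"name": "Caitlyn Taylor"}, virtual={"handle": "kt_99"})
print(record_a.shared_attributes(record_b))  # [('formal', 'name')]
```

Keeping the categories separate makes it possible to weigh a formal match differently from a virtual one when two records are compared.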

Identity attributes have been chosen for crawling and entity extraction in the universal use case scenario and for the presentation of the first prototype.

The entities to be extracted by the NLP, image processing, video processing and voice processing services have been discussed and finalised during technical meetings.

The format for recording the extracted entities has been discussed.

Last but not least for Y1, technical partners have discussed a strategy for how SPIRIT’s identity resolution service could access the extracted entities provided by peer services.

Nonetheless, a first draft of the identity resolution service has been developed and dockerized. In this version, an active learning method has been used to create a neural network for resolving an identity in a dataset.

Training on the SPIRIT prototype has been designed, and a formal Training Evaluation Session has already taken place at the annual project plenary.
SPIRIT has invested considerable effort in security, ethics and privacy preserving tasks. During Y1 a large number of Deliverables were developed, and the following clarifications were provided in response to the inquiries of the CEC ethics panel: terms of engagement with Dark Web tasks; updates of the Data Protection Impact Assessment; the legal framework for the Call for test and demonstration data; an update of the Incidental Findings and Risks Policy (SPIRIT organised a special Workshop in Oxford in June 2019); LEAs' compliance; and the establishment of a supervision process on behalf of an external Ethics Advisors Board.
As far as the involvement of the SPIRIT end users is concerned, a meticulous process has been organised during the activities of WP8 as well as the integration and demonstration tasks (WP7). In accordance with the agreed set of Y1 prototype functionality, end users were asked to validate the developed ecosystem. The metrics, questionnaires and setting configuration were agreed in synergy with the technical partners.
First Year Prototype SPIRIT Demonstrator