The first 30 months of STARLIGHT were devoted to collecting needs, analysing current LEA practices, and determining the gaps that hinder the adoption of AI-based solutions, in order to refine the challenges, limitations, needs and priorities within the crime areas covered by the project.
Six use cases, broken down into 12 scenarios, are addressed: Counter-terrorism, Child Sexual Exploitation, Border & External Security, Cybersecurity & Cyber threat intelligence, Addressing information overload in Serious Organised Crime (SOC) and Protection of public spaces. They are the result of a joint analysis of the operational scenarios, needs and desired functionalities of the stakeholder community concerned.
Particular attention was paid to identifying the main legal and ethical frameworks applicable to STARLIGHT, as well as the societal concerns that may arise from its implementation. A comparative study was conducted of various national legal frameworks on data handling, and of the access challenges that may affect data sharing among the STARLIGHT partners for R&D purposes.
A first action was to identify the available data-oriented tools (collection, annotation, anonymisation and generation) and datasets, and to assess their maturity. At M30, this amounts to 18 new tools and more than 130 datasets, of which 23 are new. More tools are now ready to be shared with LEAs. Different data modalities (image, audio, text, video) are processed. A privacy-friendly data processing strategy is at the heart of the development. In addition, quality assessment is carried out to identify possible biases in the distribution of the data and under-represented classes. The use of these tools was then prioritised according to their maturity and the functionality required for the selected use cases and scenarios.
To speed up the process, it was decided to run parallel co-creation and co-development (CODEV) cycles to tackle specific sub-scenarios. These 6-month co-development cycles are each jointly led by a LEA and a technical partner. A third set of CODEV cycles is currently running and a fourth should start in September 2024. Four hackathons have already taken place for a first evaluation of these new tools, along with a first pilot demo in an operational environment. “Lessons learned” from these events will feed into the following cycles.
Understanding AI vulnerabilities is paramount. Preliminary work has been done to study the latest threat intelligence strategies and to establish a basic methodology for the self-assessment of risk, in order to anticipate, predict and investigate cyber-attacks.
Finally, a repository was designed, implemented and deployed to enable the sharing of datasets and tools developed by technical partners. An orchestration framework has been developed to facilitate the integration of STARLIGHT tools into operational pipelines through a unified approach. At this stage, the repository is ready for use by LEAs and a first integrated version of the STARLIGHT Framework is available.