Next Generation Safer Internet: Technologies to identify digital Child Sexual Abuse Material (CSAM) (RIA)
One of the main challenges in the fight against online CSAM is the vast amount of potential new CSAM that Hotlines and Law Enforcement Agencies (LEAs) have to assess and classify as illegal prior to takedown. In 2018, for example, national LEAs in the EU received more than 500,000 referrals stemming from US internet providers, while INHOPE Hotlines are seeing increasing numbers of reports of CSAM hosted in the EU resulting from proactive searches for CSAM. Relying on human analysts alone to assess such vast quantities of material slows down both law enforcement investigations and notice and takedown actions. There is therefore an urgent need to further develop and test AI tools that support the classification of CSAM. Such AI classifier tools will help LEAs, INHOPE Hotlines and industry to analyse the vast amounts of digital CSAM more efficiently through automated identification and prioritisation, thus leading to swifter takedown of illegal material by Hotlines and industry, and more effective investigations by LEAs.
The proposals aim to develop mature tools that support the analytical work of LEAs and Hotlines, based on relevant classifiers that correspond to typical elements or characteristics of CSAM. The tools should allow identification, categorisation and prioritisation of digital CSAM from large data sets. The solutions should be sufficiently robust and provide enough information to help Hotline analysts and law enforcement officers in their assessments.
To ensure that the proposed solutions are fit for purpose and effective, INHOPE Hotlines and LEAs should be involved in each project. Working in close cooperation with them, the proposals should build on existing infrastructures and processes already available to LEAs and INHOPE Hotlines. The proposals should ensure European added value through cross-border interoperability.
The proposals should define the characteristics and granularity of the classifiers required, develop the classifiers, compose and annotate representative CSAM data sets, and train and test the tools in cooperation with LEAs and INHOPE Hotlines. As CSAM is illegal, these data sets need to be provided by, or composed mainly in cooperation with, LEAs. To reduce the development and training time on this sensitive data, the proposed tools should be able to incorporate user feedback dynamically, preferably without the need to retrain the model. The proposed tools should also allow pre-training on data available for other general tasks, such as image classification, object detection and instance segmentation, in order to increase accuracy and to reduce exposure to sensitive data during training. The tools to be developed can also include other relevant features, such as text-based data analysis, audio analysis of videos, automated keyword extraction from audio, or age detection.
All tools developed throughout the projects should be made freely available as Open Source Software, so that industry can also use them on a voluntary basis to detect and remove illegal material.
The topic, with its focus on more effective and efficient AI-based tools for processing online CSAM by a wide range of actors (NGOs, industry, law enforcement), complements the objectives of Horizon Europe Cluster 3 Civil Security for Society (HORIZON-CL3-2021-FCT-01-11: Prevention of child sexual exploitation), which advances research on perpetrators and on tools for law enforcement intelligence. Moreover, it will build on relevant work performed in previous EU-funded projects and national initiatives.