In the first year, all participants worked to establish a unified vision and agree on shared methodologies. This was crucial not only for developing technical tools but also for identifying, collecting, and producing ground truth data tailored to training the project's AI tools.
This work also addressed one of the project's major challenges: accessing social media data. Changes in the policy for research access to the Twitter/X API and the gradual discontinuation of CrowdTangle by Facebook/Meta made it necessary to diversify data sources beyond the original plan. Consequently, alongside continued access to YouTube, a novel methodology for collecting Telegram data was devised and customized to the topics outlined. Progress on the Social Media Data Streams will be covered in the upcoming phase.
This exploration of new data sources opens a fresh scientific perspective for further study: we are currently evaluating novel applications of social network analysis that use perspectives distinct from those developed for more widely studied platforms like Twitter. Ensuring privacy compliance has been paramount in our work. All teams collaborated closely to uphold privacy-by-design principles in our data management and scientific activities. This was reinforced through continuous alignment aimed at striking the right balance between the requirements for scientific insight and explainability and the minimization of personal information collected.
From a scientific perspective, AI-driven methods were developed during the reporting period for the analysis of textual, speech, audio, visual, and multimodal data. To facilitate integration with the Disinformation Warning System and the AI4TRUST platform, technology-providing partners exposed their tools through accessible APIs. Work on interpreting the output of the data analysis technologies led to the design of a system architecture for producing an overall estimate of the trustworthiness of a given claim or media item. Furthermore, socio-behavioral and human-centered research has been conducted to inform the development of the AI4TRUST models and the overall platform design and aims. To this end, focus groups, interviews, and ethnographic fieldwork were carried out to establish the requirements for human-AI interaction, focusing on aspects such as trust and explainability, and to map the specific requirements of the platform's different future users.
From a technical perspective, we worked toward the design and development of the AI4TRUST platform, and suitable solutions are being provisioned to ensure ethical and privacy compliance. The technical design and implementation of the platform adopt a microservice-based approach, which facilitates the integration of the different modules implemented throughout the project and improves support for concurrent data processing and analysis models. Ethical, privacy, and data protection requirements were taken into account throughout this process. Following the consolidation of the platform specifications, development started alongside the integration of partner tools. A preliminary version of the backend and frontend, encompassing a subset of features, is slated for the pilot phase.
To facilitate the alignment and experimentation of the pilots, we compiled a comprehensive collection of state-of-the-art fact-checking methodologies and best practices sourced directly from media professionals within the Consortium. These insights informed tailored high-level requirements for each pilot, ensuring alignment with its particular needs. Additionally, we established a structured work plan and guidelines for the piloting sessions, delineating activities across the preparatory, execution, and post-pilot phases. To evaluate the pilots, we defined evaluation objectives, KPIs, and feedback mechanisms. To streamline the upcoming pilot phases, we have initiated measures such as assigning a contact point for each pilot, which is crucial for maintaining regular updates. Moreover, we have developed a preliminary set of instructional materials that outline the actions to be taken during training sessions and set clear expectations for platform performance.