During the period covered by this report, the project consortium made substantial progress and obtained significant results. From a technical perspective, the ambition of the FANDANGO solution is to create a structured framework that supports the ingestion of multiple modalities. Efforts in the dedicated Work Package (WP2) were therefore devoted to defining an adequate data model and conventions for relating incoming data according to its nature (articles containing text, metadata, associated media, claims, claim reviews and data descriptors). As a result, FANDANGO has provided a data model that is interoperable with state-of-the-art standards for information integrity. Details of the data model structure are provided in Deliverable D2.2 and are employed in the FANDANGO implementation. In parallel, the technical partners carried out an extensive survey of the most appropriate tools for data ingestion, including big data frameworks such as Hortonworks and Cloudera (now merged), to guarantee a proper workflow and to overcome potential issues associated with the ingestion of huge amounts of data, such as data storms. An iterative process was then carried out with the support of the end users (CIVIO, ANSA and VRT) to structure the multiple data inputs, preprocess the ingested information, filter out non-relevant information and prepare the data for subsequent analysis.
FANDANGO is working to devise and improve tools intended to provide higher-level automatic decision making for fake news detection. Different machine learning algorithms have been developed, tested and improved. All of them share a Big Data approach, and each evaluates a specific clue (a fakeness indicator) that helps to assess the trustworthiness of a given news item. In parallel, important work has been done to investigate the traceability of image manipulation. Using Deep Learning approaches to assess the trustworthiness of an image and, based on its metadata, to track the path back to its source provides a way to trace a news item through its associated media files. Another approach relies on matching the titles or content of publications. In both scenarios, users will be able to visualize the connections through FANDANGO's advanced data investigation tool, developed by Siren, which presents them in an intuitive manner by using sophisticated graph techniques to associate related information and to create a timeline of events based on the collected publications. This approach should enable users to navigate the temporal propagation of the information directly through the UI, whenever connections between the content can be traced either through media distribution or through social network analysis.
Specific attention has been given in this period to testing and validating the system through pilots, with important achievements and dedicated work performed. The use cases for the pilots have been defined in a specific deliverable, together with the metrics to be measured during the pilots. The next step foreseen is the collection of these metrics during manual fact checking as well as during fact checking by means of FANDANGO 0.4. These metrics will then be compared in order to assess the benefits of FANDANGO and the usability of its results. During pilot 1, a small group of journalists from ANSA, Civio and VRT will test the first functional iteration of FANDANGO. During pilot 2, the focus will be on a larger group of users and on a revised version of FANDANGO that takes into account the validation results of the first pilot. In addition, the consortium will look for journalists from media companies outside the consortium to test FANDANGO as well (a field test).