CORDIS - EU research results

FAke News discovery and propagation from big Data ANalysis and artificial intelliGence Operations

Periodic Reporting for period 2 - FANDANGO (FAke News discovery and propagation from big Data ANalysis and artificial intelliGence Operations)

Reporting period: 2019-07-01 to 2021-03-31

FANDANGO's overarching objective is to fight misinformation and to increase the trust that professional users in the first place, and consequently citizens and society as a whole, can place in the news.
Since truth is no longer dictated by authorities but is networked by peers, every fact has one or more counterfacts. Facts and counterfacts may look identical online, which confuses most people.
It is a bewildering maze of claims and counterclaims, where hoaxes spread at great speed, particularly on social media, and spark angry backlashes from people who take what they read at face value. All of this shows that a new way to decide what is trustworthy is needed. FANDANGO seeks to answer this need by using Big Data and Artificial Intelligence (AI) techniques to define our era's new shape of truth. Having a large number of misinformed people in a society is devastating and extremely difficult to cope with. Some warn that misinformation threatens the democratic process itself. Page one of any political science textbook will argue that democracy relies on people being informed about the issues so that they can debate them and make decisions.
Increasing access to data for journalists and citizens can support efforts to tackle this problem and help people make better use of new digital information channels. The European tradition in democracy, journalism and transparency should set a worldwide example in a fast-changing society, where citizens appear overwhelmed by new technologies and new social challenges.
The FANDANGO project aims to break these barriers by providing unified techniques and an integrated big data platform that supports traditional media industries in facing the new “data” economy with greater transparency towards citizens, under a Responsible Research and Innovation (RRI) prism.
During the period covered by this report, the project consortium made important progress and obtained significant results. From a technical perspective, the ambition of the FANDANGO solution is to create a structured framework that supports the ingestion of multiple modalities. Efforts in the dedicated Work Package (WP2) were therefore devoted to defining an adequate data model and conventions for relating incoming data according to its nature (articles containing text, metadata, associated media, claims, claim reviews and data descriptors). As a result, FANDANGO has provided a data model that is interoperable with state-of-the-art standards for information integrity. Details of the data model structure are provided in Deliverable D2.2 and are used in the FANDANGO implementation. In parallel, the technical partners carried out an extensive survey of the most appropriate tools for data ingestion, including big data frameworks such as Hortonworks and Cloudera (now merged), to guarantee a proper workflow and to overcome potential issues associated with ingesting huge amounts of data, such as data storms. An iterative process was then carried out with the support of end users (CIVIO, ANSA and VRT) to structure the multiple data inputs, preprocess the ingested information, filter out non-relevant information and prepare the data for subsequent analysis.
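To make the kinds of incoming data listed above more concrete, the following Python sketch shows one way such records could be related to each other. It is only an illustration: every class and field name here (Article, MediaItem, Claim, ClaimReview, reviews_for_article) is a hypothetical assumption, not the structure defined in Deliverable D2.2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MediaItem:
    """An image or video attached to an article (hypothetical structure)."""
    url: str
    kind: str  # e.g. "image" or "video"


@dataclass
class Article:
    """A news article with its text, metadata and associated media."""
    identifier: str
    headline: str
    body: str
    language: str
    publisher: Optional[str] = None
    media: List[MediaItem] = field(default_factory=list)


@dataclass
class Claim:
    """A statement that can be checked, linked to the articles it appears in."""
    identifier: str
    text: str
    appears_in: List[str] = field(default_factory=list)  # Article identifiers


@dataclass
class ClaimReview:
    """A fact-checker's verdict on a claim."""
    claim_id: str
    reviewer: str
    rating: str  # free-text verdict; the rating scale is an assumption


def reviews_for_article(article: Article,
                        claims: List[Claim],
                        reviews: List[ClaimReview]) -> List[ClaimReview]:
    """Return the reviews of every claim that appears in the given article."""
    claim_ids = {c.identifier for c in claims
                 if article.identifier in c.appears_in}
    return [r for r in reviews if r.claim_id in claim_ids]
```

Linking records by identifier rather than by nesting keeps each type independently ingestible, which is one plausible way to relate data arriving from different sources at different times.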

FANDANGO is working to devise and improve tools intended to provide higher-level automatic decision making for fake news detection. Different machine learning algorithms have been developed, tested and improved. All of them share a Big Data approach, and each evaluates a specific clue (a fakeness indicator) that helps assess the trustworthiness of a given news item. In parallel, important work has been done to investigate the traceability of image manipulation. Using Deep Learning approaches to assess the trustworthiness of an image and, based on its metadata, to track the path back to its source provides a way to trace a news item through its associated media files. Another approach matches the titles or content of publications. In both scenarios, users can visualize the results through the FANDANGO advanced data investigation tool, developed by Siren, which displays the connections intuitively by using graph techniques to associate related information and create a timeline of events based on the collected publications. This approach should enable users to navigate the temporal propagation of information directly through the UI, whenever connections between content can be traced either through media distribution or through social network analysis.
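As a rough illustration of how publications might be related by matching their titles, the sketch below scores title pairs with Jaccard similarity over lowercased word sets. This is a generic technique chosen for the example, not the matching method used in FANDANGO; the function names and the 0.5 threshold are assumptions.

```python
import re
from typing import List, Set, Tuple


def tokens(title: str) -> Set[str]:
    """Split a title into a set of lowercased word tokens."""
    return set(re.findall(r"\w+", title.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two titles: shared tokens / all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def related_titles(titles: List[str],
                   threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (i, j, score) for every pair of titles scoring at or above threshold."""
    pairs = []
    for i in range(len(titles)):
        for j in range(i + 1, len(titles)):
            score = jaccard(titles[i], titles[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs
```

The resulting index pairs can serve as edges in a graph of related publications. The pairwise loop is quadratic in the number of titles, so a large collection would need an indexing step first.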

Specific attention was given in this period to testing and validating the system through pilots, with important achievements and dedicated work performed. The use cases for the pilots have been defined in a specific deliverable, as have the metrics to be measured during the pilots. The next step foreseen is to collect these metrics during manual fact checking as well as during fact checking with FANDANGO 0.4. The metrics will then be compared in order to assess the benefits of FANDANGO and the usability of its results. During pilot 1, a small group of journalists from ANSA, Civio and VRT will test the first functional iteration of FANDANGO. Pilot 2 will focus on a larger group of users and on a revised version of FANDANGO based on the validation results of the first pilot. In addition, the consortium will look for journalists from media companies outside the consortium to test FANDANGO as well (a field test).
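The comparison between manual and tool-assisted fact checking described above could be computed along the following lines. This is a minimal sketch, assuming each metric is recorded as a list of numeric measurements per condition; the function names and the choice of relative change of the mean are assumptions, since the actual metrics are defined in the project deliverable.

```python
from statistics import mean
from typing import Dict, List


def relative_change(manual: List[float], assisted: List[float]) -> float:
    """Return (mean(assisted) - mean(manual)) / mean(manual)."""
    if not manual or not assisted:
        raise ValueError("both samples need at least one measurement")
    baseline = mean(manual)
    if baseline == 0:
        raise ValueError("manual baseline mean is zero")
    return (mean(assisted) - baseline) / baseline


def compare_metrics(manual: Dict[str, List[float]],
                    assisted: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute the relative change for every metric measured in both conditions."""
    shared = manual.keys() & assisted.keys()
    return {name: relative_change(manual[name], assisted[name])
            for name in sorted(shared)}
```

A negative value means the assisted measurements are lower on average than the manual baseline, which is an improvement for a metric such as time per check and a regression for a metric such as accuracy.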
Fake news is now a hot issue in Europe and worldwide, particularly with regard to political and social challenges that are reflected in business and industry. Europe lacks systematic knowledge and data transfer across organizations to address the aggressive emergence of the well-known problem of fake news and the post-truth effect. The use of cross-sector Big Data management and analytics, along with an effective interoperability scheme for all data sources, will tackle this urgent problem and generate new business and societal impacts involving several stakeholders: a) media companies (news agencies, broadcasters, newspapers, etc.), b) governmental institutions and organisations, c) the overall industrial ecosystem, d) society as a whole.
Progress beyond the state of the art will consist in aggregating and verifying different types of news data, media sources, social media and open data, so as to detect fake news and provide more efficient and verified communication for all European citizens.
Specifically, FANDANGO will advance the scientific state of the art in Machine Learning and collaborative algorithms for detecting fake news, in content-based analysis for detecting misinformation, in text analysis, including Natural Language Processing in a multilingual environment, and, finally, in algorithms based on image and video analysis.
FANDANGO Streaming process
FANDANGO Real time process