FARE_AUDIT: Fake News Recommendations - an Auditing System of Differential Tracking and Search Engine Results | HORIZON | CORDIS

Project information

FARE_AUDIT

Grant agreement ID: 101100653

DOI

10.3030/101100653

Closed project

EC signature date 27 November 2022

Start date 1 December 2022

End date 31 May 2024

Funded under

European Research Council (ERC)

Total cost

No data

EU contribution

€ 150 000,00

Coordinated by

LABORATORIO DE INSTRUMENTACAO E FISICA EXPERIMENTAL DE PARTICULAS LIP
Portugal

Periodic Reporting for period 1 - FARE_AUDIT (FARE_AUDIT: Fake News Recommendations - an Auditing System of Differential Tracking and Search Engine Results)

Reporting period: 2022-12-01 to 2024-05-31

The spread of disinformation is a serious problem that impacts social structure and threatens democracies worldwide. Citizens increasingly encounter (dis)information online, either somewhat passively, through social media feeds, or actively, by following search engines’ recommendations and/or visiting specific websites. In both scenarios, algorithms filter and select the displayed information, often according to the users’ past choices. Therefore, if users have a history of consuming even a little misinformation, there is a real risk that algorithms might reinforce their preferences by offering less divergent views, or even help create (mis)information bubbles by systematically directing them to low-credibility content. For these reasons, serious efforts have been made to identify and remove “fake-news” websites and minimize the spread of disinformation on social media. However, we have not witnessed equivalent attempts to understand and curtail the potential role(s) of search engines in promoting low-credibility information or in increasing polarization. As the recommendation algorithms are typically proprietary, it is not possible to evaluate them directly, and we can only infer their decisions from the search results.

Thus, the main aim of FARE_AUDIT was to address this imbalance through the development of an unbiased tool to audit search engines, particularly around situations of conflict (political or military), when stakes are high and disinformation rampant. The rationale was to create a system of bots (web crawlers) and incrementally change their features, controlling for factors known to impact search engine results. The bots, made to resemble users from different countries and speaking different languages, visited different websites (including those known to share disinformation) to mimic human online behavior. Through their websurfing, they collected cookies and other “fingerprints”, becoming “profiled”. These profiled bots were then directed to different search engines and instructed to perform the exact same search. By comparing the search engine recommendations, it should be possible to “reverse engineer” the recommendation systems and better understand how browsing history influences those results, particularly the likelihood of being directed to disinformation.
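The comparison step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the metric (Jaccard overlap of result domains), the function names, the example URLs, and the flagged-domain list are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def domains(urls):
    """Return the set of host names found in a list of result URLs."""
    return {urlparse(u).netloc.lower() for u in urls}


def jaccard(results_a, results_b):
    """Jaccard overlap of the domains returned to two bots (1.0 = identical sets)."""
    a, b = domains(results_a), domains(results_b)
    if not a and not b:
        return 1.0  # two empty result pages are treated as identical
    return len(a & b) / len(a | b)


def low_credibility_share(results, flagged_domains):
    """Fraction of results whose domain appears on a (hypothetical) flagged list."""
    if not results:
        return 0.0
    hits = sum(1 for u in results if urlparse(u).netloc.lower() in flagged_domains)
    return hits / len(results)


# Hypothetical results for the same query from two differently profiled bots.
bot_a = [
    "https://news.example.org/a",
    "https://gov.example.eu/b",
    "https://blog.example.net/c",
]
bot_b = [
    "https://news.example.org/a",
    "https://rumours.example.com/x",
    "https://blog.example.net/d",
]
flagged = {"rumours.example.com"}  # placeholder, not a real credibility list

overlap = jaccard(bot_a, bot_b)
share_b = low_credibility_share(bot_b, flagged)
```

A low overlap between two bots that differ in a single feature would suggest that this feature influences the recommendations, while the flagged share gives a rough indication of how often a given profile is directed to low-credibility content.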

More specifically, FARE_AUDIT’s main goals were to:

1. Develop and implement an unbiased bot-based audit tool;
2. Systematically identify how browsing history influences search-engine results using this system of “web crawlers” that mimics different user profiles;
3. Create and test an online interface that allows NGOs, journalists, and interested users to scrutinize search-engine platforms and understand how different profiles access information differently;
4. Extend this concept to novel tracking or search methodologies.

Overall, we expected this tool to have a meaningful social impact on at least three different levels: by increasing our knowledge of search-engine personalization, by raising public awareness of the role(s) of search engines in polarization and the spread of disinformation, and by better equipping civil society organizations with a tool to detect and monitor different ongoing narratives in close to real time. Moreover, by relying on web crawlers, our tool is privacy-protecting and does not require any real user data, paving the way for other unbiased audits. In fact, our tool is currently being adapted to include Large Language Model (LLM)-based chatbots (ChatGPT, Gemini, Llama), particularly when integrated with traditional search engines.

The web crawler-based audit system was developed from scratch by our research team and was tested with NGOs and journalists (2) through three pilots: one around the Brazilian Presidential elections (2022), another regarding the ongoing Israel/Palestine conflict (started in 2023), and another around the European Parliamentary elections (2024).

The bots can be increasingly personalized from Step 0 (no browsing history, no cookies, English language, location set to a specific country), to Step 1 (no browsing history, no cookies, language matching location, and location set to a specific country), to a Step “N”, with bots set to specific languages and locations and having a “long history” of visiting specific content, including known disinformation websites (Figure 1). To audit the search engines, these bots, associated with different user features, were deployed to simultaneously query a search engine, inputting identical queries and collecting the resulting page listings. This process was then repeated across several different queries for 4 search engines. As Figure 2 shows, not only did the system work, it also revealed consistent differences in search-engine recommendations even at the lowest levels of personalization (in the depicted case, asking questions related to the EU Parliamentary Elections of 2024 from different locations but using the default English language).
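The personalization steps and the repeated querying described above can be sketched as a simple data structure and audit plan. This is a hedged illustration only: the class `BotProfile`, its field names, the helper functions, and the placeholder queries, sites, and engine names are assumptions, since the text names neither the search engines nor the internal design of the bots.

```python
from dataclasses import dataclass
from itertools import product
from typing import Tuple


@dataclass(frozen=True)
class BotProfile:
    """Hypothetical description of one audit bot (field names are assumptions)."""
    step: str                       # "0", "1" or "N", as in the text
    location: str                   # country the bot appears to browse from
    language: str                   # browser language
    history: Tuple[str, ...] = ()   # sites visited before searching; empty = no history


def make_profiles(location, local_language, history_sites):
    """Build the three personalization levels for one location."""
    return [
        BotProfile("0", location, "en"),                                  # English, no history
        BotProfile("1", location, local_language),                        # language matches location
        BotProfile("N", location, local_language, tuple(history_sites)),  # long browsing history
    ]


def audit_plan(profiles, queries, engines):
    """Every (profile, query, engine) combination, so all bots issue identical queries."""
    return list(product(profiles, queries, engines))


profiles = make_profiles("PT", "pt", ["site-a.example", "site-b.example"])
queries = ["placeholder query 1", "placeholder query 2"]
engines = ["engine_1", "engine_2", "engine_3", "engine_4"]  # placeholders; the text names none

plan = audit_plan(profiles, queries, engines)
```

Enumerating the full cross product up front keeps the queries identical across bots, so any difference in the collected listings can be attributed to the profile rather than to the query wording.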

Regarding the online interface (goal 3), we intended to offer a tool that could be used independently by citizens and NGOs, according to their distinct needs and interests, allowing them to audit the systems and a) help bring awareness to personalization in search results, and b) track misinformation in real time. However, our work is showing that the observed differences in profiling are very dependent on the searcher’s location, which cannot be easily implemented in a public tool. Therefore, we are now splitting this interface in two: one with pre-trained “user-bots” and search patterns that can be used by the general public (still helping to raise awareness and “break the information bubble” by showing how different people can be offered very different search results), and a second for use specifically by NGOs and journalism- and democracy-related associations, which will be able to audit misinformation regarding select, ongoing situations. These interfaces were piloted during the European Researchers’ Night, in Lisbon, in 2023, with several participant pairs comparing search results.

Overall, this proof-of-concept project has produced an unbiased and powerful system to audit trackers and search engines, applied it to multiple situations and countries, and is being scaled up to include LLM-based chatbots. This last step is crucial, as online searches are increasingly being done through these interfaces and there is virtually no public knowledge of how they might fuel tailored disinformation. This project has helped advance our knowledge of the role of profiling and search engines in situations of potential or real conflict, with a strong emphasis on misinformation or low-credibility content.

The proposed web interface is being redesigned, but we expect to fully implement its new versions. Showing citizens that the same queries can return such different results could help “burst the information bubbles”. In parallel, this tool might be useful to journalists and democracy watchdog associations for tracking disinformation narratives, and it was piloted in collaboration with two NGOs. Our expectation is to freely and openly share these tools with interested colleagues and other relevant actors, to increase their reach and potential impact.

Fig. 1: FARE_Audit Framework

Fig. 2: Differences in search engine results
