CORDIS - EU research results

Socio-Semantic Bubbles of Internet Communities

Periodic Reporting for period 6 - SOCSEMICS (Socio-Semantic Bubbles of Internet Communities)

Reporting period: 2023-09-01 to 2024-08-31

Socsemics aimed to examine the existence of echo chambers in digital public spaces. These spaces are often assumed to foster interaction dynamics where users predominantly engage with like-minded individuals, reinforcing their beliefs rather than encountering diverse viewpoints. This phenomenon is widely believed to have a significant impact on democratic deliberation online, potentially deepening societal polarization. Prior studies demonstrated that users on blogs, microblogs, and website networks tend to cluster based on shared perspectives, forming what Socsemics calls "socio-semantic clusters"—groups that are cohesive both in terms of network structure and semantic content. However, existing research also suggested that the formation of such clusters might depend on the type of interactions (e.g. retweets vs. comments) and the topic (e.g. political vs. non-political). To systematically investigate these factors, the project relied on three key pillars:
- Socio-Semantic Network Modeling: developing formal models to analyze social networks where nodes have semantic attributes, assessing the extent of socio-semantic clustering across different online communities.
- Computational Linguistics and Opinion Attribution: creating tools to extract and categorize opinions from textual content, focusing on the sentence level and enabling the classification of user beliefs in a more nuanced manner than approaches based on word distributions.
- Qualitative-Quantitative Integration: conducting interviews augmented with network analysis and visualizations to compare computational findings with real-world user perceptions. A major goal of this pillar was also to develop innovative visualization methods to represent socio-semantic networks within a single space, offering new ways to analyze and interpret socio-semantic coevolution.
The results for each of these pillars are summarized below.

1. Advances in Socio-Semantic Network Analysis
Focusing on online communities, the project contributed significantly to socio-semantic network analysis, a field that examines the interaction between social structures and semantic meaning, both:
- empirically: Socsemics analyzed how social groups align with semantic categories, particularly in relation to ideological divides (e.g. left vs. right-wing), stance on controversial issues (e.g. climate skeptics vs. non-skeptics), and geographical location (e.g. intra-EU vs. intercontinental). Contrary to common assumptions, the research found that not all online communities function as echo chambers. While affiliation-based networks (e.g. Twitter retweets) tend to resemble echo chambers, interaction-based networks (e.g. Twitter quotes) often display greater ideological diversity. This suggests that the nature of online engagement plays a crucial role in shaping discourse.
- methodologically: the project extended stochastic block modeling (SBM) to analyze social and semantic clustering beyond a single metadata category (e.g. political affiliation or topic preferences).
A habilitation manuscript and two book chapters further contributed to establishing the sociological and computational relevance of combining social and semantic network analysis.
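The contrast between affiliation-based and interaction-based networks can be illustrated with a very simple homophily measure: the share of edges that connect two users carrying the same label. The sketch below is only an illustration under stated assumptions. The function name `same_label_share`, the user names, the labels, and the edge lists are all invented, and the measure is far cruder than the stochastic block modeling used in the project (it does not correct for group sizes, for instance).

```python
def same_label_share(edges, label):
    """Return the fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("edge list is empty")
    same = sum(1 for u, v in edges if label[u] == label[v])
    return same / len(edges)


# Toy data, invented for illustration only (not project data).
label = {"a": "left", "b": "left", "c": "right", "d": "right"}

# Hypothetical retweet edges: mostly within the same camp.
retweets = [("a", "b"), ("b", "a"), ("c", "d"), ("d", "c"), ("a", "c")]

# Hypothetical quote edges: mostly across camps.
quotes = [("a", "c"), ("b", "d"), ("c", "a"), ("a", "b"), ("d", "b")]

retweet_share = same_label_share(retweets, label)
quote_share = same_label_share(quotes, label)

print(f"retweets: {retweet_share:.2f} of edges are within-camp")
print(f"quotes:   {quote_share:.2f} of edges are within-camp")
```

A high share in one network and a low share in the other, for the same users, is the kind of pattern the empirical finding describes: echo-chamber-like structure in one type of interaction and not in another.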

2. Advances in Computational Linguistics and Opinion Representation
To analyze user opinions in digital spaces, the project introduced Semantic Hypergraphs (SHs), a novel framework for representing sentence-level meaning through directed, recursive hyperedges. Originally developed to categorize user stances on social media, SHs proved more relevant for semi-supervised information retrieval, particularly in contexts requiring rigorous, transparent, and interpretable methods, where they offer an alternative to large language models (LLMs), which often lack transparency. The framework allows efficient pattern-based extraction of structured information, enabling human operators to refine queries with minimal effort. It also constitutes an alternative to traditional semantic graphs, with potential commercial applications, particularly for companies requiring cost-effective, customizable information retrieval solutions. Recognition of this potential has led to an early-stage €150k application grant (starting in 2025) and an ERC Proof-of-Concept resubmission after earning a Seal of Excellence.
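To make the idea of recursive hyperedges and pattern-based extraction more concrete, here is a minimal sketch that represents a sentence as nested Python tuples and matches a pattern with variables against it. This is not the graphbrain API or its notation: the tuple encoding, the convention that uppercase strings are variables, and the functions `match` and `find` are assumptions made purely for illustration.

```python
def match(pattern, edge, bindings=None):
    """Match a pattern against a hyperedge.

    Uppercase strings in the pattern are variables (an assumed convention).
    Returns a dict of variable bindings, or None if there is no match.
    """
    bindings = dict(bindings or {})
    if isinstance(pattern, str):
        if pattern.isupper():  # variable: bind it, or check an earlier binding
            if pattern in bindings and bindings[pattern] != edge:
                return None
            bindings[pattern] = edge
            return bindings
        return bindings if pattern == edge else None
    # The pattern is a hyperedge: the target must be one of the same size.
    if not isinstance(edge, tuple) or len(edge) != len(pattern):
        return None
    for sub_pattern, sub_edge in zip(pattern, edge):
        bindings = match(sub_pattern, sub_edge, bindings)
        if bindings is None:
            return None
    return bindings


def find(pattern, edge):
    """Yield bindings for the edge itself and every nested sub-edge that matches."""
    result = match(pattern, edge)
    if result is not None:
        yield result
    if isinstance(edge, tuple):
        for sub_edge in edge:
            yield from find(pattern, sub_edge)


# "Alice says that Bob denies climate change", encoded by hand as a
# recursive hyperedge: the inner claim is itself an argument of "says".
sentence = ("says", "alice", ("denies", "bob", ("+", "climate", "change")))

# Who denies what? The pattern can match at any depth of nesting.
pattern = ("denies", "WHO", "WHAT")
claims = list(find(pattern, sentence))
print(claims)
```

Because a hyperedge can contain other hyperedges, the claim attributed to Bob is retrieved even though it sits inside a statement attributed to Alice. A human operator can refine the pattern by hand, which is the kind of transparency the framework aims for.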

3. Advances in Socio-Semantic Network Visualization
The project developed novel ways to visualize social network structures alongside semantic properties. First, it represented social and semantic elements in a unified hybrid visualization that maps social clusters while considering semantic properties, revealing whether cohesive groups share similar semantics or not. Second, it developed an interactive platform to assess socio-semantic fragmentation, showing how connections (or their absence) reflect structural and semantic patterns. Additionally, a new user sampling method based on structural modeling, rather than demographic quotas, was implemented. Combined with an augmented interview protocol (where users engage with visualizations of their network position), this approach represents a novel method in socio-semantic research.

In addition, three key software developments can be noted:
- graphbrain (github.com/graphbrain): an open-source Python library for semantic hypergraph analysis, which was used primarily within the project to extract structured claims from online discussions.
- metablox (github.com/lenafm/metablox): a stochastic block modeling tool for analyzing how categorical metadata (e.g. political affiliations) shape social networks.
- chronoblox (github.com/lobbeque/chronoblox): a visualization tool integrating structural and semantic properties over time, introducing "network chronophotography" for tracking dynamic changes.
Several key advancements from Socsemics are expected to have long-term impacts in social science, computational linguistics, and AI-based information retrieval:
- Semantic Hypergraphs (SHs) challenge the ubiquitous use of semantic graphs in knowledge representation and extraction (in academia and industry) by proposing a framework that both permits arbitrary complexity (thanks to the recursiveness of SHs) and remains amenable to human interpretation and processing, making it a viable alternative to opaque AI models like LLMs.
- Metadata-Informed Stochastic Block Modeling, which, first, accommodates metadata and semantic features and, second, takes into account the non-exclusive contribution of various categories to the observed structure. This enables more nuanced analysis of how multiple semantic factors shape social structures, by promoting and empirically operationalizing the notion that many semantic dimensions may concurrently contribute to a network structure, a key development for network sociology.
- Joint Socio-Semantic Network Visualization, which proposes a unified approach for tracking structural and semantic changes in networks over time: a "network chronophotography" that opens new avenues in dynamic network visualization, based on the novel idea that nodes, across various periods of time, should be placed in the same two-dimensional space according to the similarity of their semantic features.
- Refining the Echo Chamber Debate, by demonstrating that echo chambers exist in some online interactions but not in others, which resolves prior contradictions in research, in particular by showing that phenomena akin and not akin to echo chambers co-exist around the same content and on the same platform.
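The core idea behind "network chronophotography" in the list above, placing nodes from different periods in one shared two-dimensional space according to their semantic features, can be sketched as follows. This is a rough illustration, not the chronoblox implementation: the use of PCA (via NumPy's SVD) as the projection, the function name `shared_layout`, and the toy feature vectors are all assumptions.

```python
import numpy as np


def shared_layout(features_by_period):
    """Project every (period, node) pair into one common 2D space.

    features_by_period maps period -> {node: semantic feature vector}.
    A single PCA is fitted on all periods together, so positions are
    comparable across time (assumed design; other projections would work).
    """
    keys = [(t, n) for t, nodes in features_by_period.items() for n in nodes]
    X = np.array([features_by_period[t][n] for t, n in keys], dtype=float)
    X -= X.mean(axis=0)  # centre the stacked features
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ vt[:2].T  # coordinates on the first two principal axes
    return {k: tuple(c) for k, c in zip(keys, coords)}


# Toy semantic features, invented for illustration only.
features = {
    "t1": {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]},
    "t2": {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0]},  # b drifts toward a
}

layout = shared_layout(features)


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))


gap_t1 = dist(layout[("t1", "a")], layout[("t1", "b")])
gap_t2 = dist(layout[("t2", "a")], layout[("t2", "b")])
print(f"distance a-b at t1: {gap_t1:.3f}, at t2: {gap_t2:.3f}")
```

Since the projection is shared, a node whose semantic features do not change stays at the same position across periods, while a node whose semantics drift moves visibly. This makes successive snapshots directly comparable.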