Periodic Reporting for period 1 - ARGOT (The Anarchist Translation Flows and World Literature Project (ARGOT))
Reporting period: 2023-05-02 to 2025-05-01
Anarchist periodicals worldwide historically offered readers an uncommon and often outlawed experience of world literature. Yet their countless literary translations, as well as the editors and translators involved in this circulation, have been largely ignored by anarchist, literary and translation studies. Data-driven approaches, along with the rise of digital archives, make it possible, for the first time, to carry out a comprehensive, large-scale study of these translations by considering periodicals in multiple languages, a wide geopolitical space (several port cities in the Americas and Southern Europe) and a critical period for both literary and anarchist history, 1890-1910. Thus, ARGOT will unveil the significant contribution of anarchist periodicals to world literature by taking into account the people who enacted this transfer. With a specific focus on women and their role in thriving translation flows, ARGOT pursues three main goals: 1. to identify the literary translations that appeared in anarchist periodicals in Montevideo, Buenos Aires, Rio de Janeiro, Havana, New York, Lisbon and Barcelona; 2. to rediscover the cultural mediators involved in these translations (particularly women); and 3. to address the multilingualism of anarchist communities through a spatial approach. These goals will be achieved using an innovative, interdisciplinary methodology that combines data mining and computational science, anarchist, literary and translation history, and women’s and gender studies. By making freely available two open-access reusable databases on translated texts, translators and publishers, ARGOT will shift our understanding of anarchist periodicals in the circulation of world literature and shed light on exciting new connections. From a personal perspective, the MSCA-IF will greatly contribute to my career as a strong and highly skilled researcher in translation, literary and anarchist studies, and digital humanities.
Phase I (WP 3: Identification of sources and preliminary cataloguing)
A corpus of 30 relevant periodicals published in Barcelona, Buenos Aires, Montevideo, New York and Rio de Janeiro was established. Too few periodicals were available to include Lisbon and Havana as originally planned. The selection reflects the linguistic diversity of the anarchist press during this period: the journals are written in French, Spanish, English, Portuguese, Catalan or Italian, or are multilingual. It also reflects the variety of anarchist periodicals in two important respects: their variable periodicity (daily, weekly, monthly and irregular publications) and the coexistence of a very few stable periodicals with a greater number of short-lived ones. It also represents different tendencies within anarchism (e.g. individualism, organisationism, trade unionism, anti-clericalism). In addition, where possible, at least one woman-led periodical was included in the selection for each city. Already-digitised, open-access journals were preferred over those not yet available in that form; however, in some cases the relevance of a journal within the corpus (e.g. being the only journal in a specific language for that city) justified ordering scans from the institutions that hold them. Agreements were concluded with these institutions to make these scans available in open access, namely AHL - Arquivo de História Social - Universidade de Lisboa, Portugal; HMM - Hemeroteca Municipal de Madrid, Spain; and IISG - International Institute for Social History, Netherlands.
Main achievement (Milestone 1): subdataset of a representative and digitised corpus of periodicals.
Phase II (WP 4: Document downloading and digitization)
The corpus was converted into uniform, machine-readable digital text. The transcription was carried out using Transkribus, a platform for the text and structure recognition of historical documents (University of Innsbruck). Several transcription approaches were tested first, including training a specific model (using 60 manually transcribed pages representative of the corpus) and reusing an existing model trained on a corpus of Filipino newspapers; these proved unsuccessful, and the following procedure was adopted:
Layout recognition was performed using Advanced Layout Analysis. The Universal Lines Model (Accuracy (mAP): 8.94%) was applied, with image quality upscaled and the baseline accuracy threshold set to “low.” This model and its adjustments proved effective in detecting nearly every line on the periodical pages, except for headings, whose complex typography often caused issues. However, the model does not allow for the separation of lines according to column layout; often, the detected line spans multiple columns. Testing of the beta "field recognition" feature to address this issue proved ineffective.
Text recognition was conducted using Transkribus Print M1, a multilingual print model with a Character Error Rate (CER) of 2.20%. It has been trained on most of the languages present in the corpus, with the exception of Catalan. The text recognition was affected by several issues inherent to the corpus:
a. Poor preservation of the original documents and/or the quality of the scans, leading to failures in character and word identification;
b. Use of historical language varieties (dating back 110 to 130 years) filled with jargon and vocabulary specific to the political and social movements of the time, with terms often absent from current dictionaries;
c. Low literacy levels of some contributors, resulting in grammatical and orthographic inconsistencies;
d. The presence of multiple languages within the same periodical, which could confuse the model.
As a result, the transcribed text is mostly accurate in terms of the words present on the page, but the word order may differ significantly from the original.
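For readers unfamiliar with the Character Error Rate quoted above, the following is a minimal sketch of how such a figure is conventionally computed: the edit distance between a reference transcription and the model output, divided by the length of the reference. The function names are illustrative and are not part of Transkribus; the sketch only illustrates the metric and does not reproduce how the platform evaluates its models.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    # previous[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of characters in the reference."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)
```

When computed line by line, this measure only compares characters within each line, so it would not capture lines that span several columns or are exported in the wrong reading order.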
Main achievement (Milestone 3): subdataset composed of txt files with the extracted text, one file per issue.
Phase IIIa (WP 5: Search and extraction)
The txt files were filtered, line by line, in order to identify names of authors, a step prior to the detection of translated literary texts. Python-based routines were used for the processing, and PostgreSQL was used as the database to store the information. The following pipeline was implemented to do this, after testing diverse combinations:
Line-by-line reading of the text file exported from Transkribus
Removal of whitespace and special characters from the text
Elimination of lines with two characters or fewer, considering them as noise generated during the process
Elimination of lines containing numbers, based on the observation that in these periodicals, names often appeared in connection with monetary amounts (e.g. subscriptions), in order to avoid registering names that were irrelevant to the aims of this study
Generation of a list with all lines of the processed text
Language detection of the text
Selection of 10 lines from the text, chosen by random sampling from the total lines
Language detection using the Python library langdetect and loading of the corresponding spaCy language model. Based on the possible languages present in the texts and the evaluated accuracy levels in the NER processes, the following models were used:
Spanish – es_core_news_lg
French – fr_core_news_lg
English – en_core_web_trf
Italian – it_core_news_lg
Portuguese – pt_core_news_lg
Line-by-line detection of personal names appearing in the text, using NER
Preservation of specific metadata (name of the periodical, year, issue number, name of the processed file, line number), to allow accurate location of the name within the corpus
Filtering of tokens to retain only those labeled by the model as "PER" or "PERSON". Then, further filtering keeps only those that begin with a capital letter, to eliminate what are mostly false positives
Addition of a column listing the names found per line
Insertion into a PostgreSQL database of all processed lines, allowing specific queries to be run for the project
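The steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names, the exact definition of "special characters" and the choice to run language detection once on the joined sample are all hypothetical. To keep the sketch runnable without external libraries, the language detector is passed in as a callable and the NER output is represented as plain (text, label) pairs.

```python
import random
import re
from typing import Callable, Iterable, List, Optional, Tuple

# Language code -> spaCy model name, as listed in the report.
SPACY_MODELS = {
    "es": "es_core_news_lg",
    "fr": "fr_core_news_lg",
    "en": "en_core_web_trf",
    "it": "it_core_news_lg",
    "pt": "pt_core_news_lg",
}


def clean_lines(raw_lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (original line number, cleaned text) for every line that is kept."""
    kept = []
    for line_no, raw in enumerate(raw_lines, start=1):
        # Assumption: "special characters" means anything other than word
        # characters, whitespace, full stops, apostrophes and hyphens.
        text = re.sub(r"[^\w\s.'-]", " ", raw)
        text = re.sub(r"\s+", " ", text).strip()
        if len(text) <= 2:
            continue  # treated as transcription noise
        if any(ch.isdigit() for ch in text):
            continue  # e.g. subscription lists with monetary amounts
        kept.append((line_no, text))
    return kept


def pick_model(lines: List[str], detect: Callable[[str], str],
               sample_size: int = 10, seed: Optional[int] = None) -> Optional[str]:
    """Detect the language of a random sample of lines and map it to a model name."""
    if not lines:
        return None
    rng = random.Random(seed)
    sample = rng.sample(lines, min(sample_size, len(lines)))
    # Assumption: the sampled lines are joined and detected in one call.
    language = detect(" ".join(sample))
    return SPACY_MODELS.get(language)  # None for languages without a listed model


def keep_person_names(entities: Iterable[Tuple[str, str]]) -> List[str]:
    """Keep entities labelled PER or PERSON whose text starts with a capital letter."""
    return [text for text, label in entities
            if label in {"PER", "PERSON"} and text[:1].isupper()]
```

With the libraries installed, `detect` could be `langdetect.detect`, and the entity pairs for a line could be built as `[(ent.text, ent.label_) for ent in nlp(text).ents]` after `nlp = spacy.load(model_name)`. Returning the original line number alongside the cleaned text is one way to preserve the location metadata mentioned above.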
A second round of name filtering is carried out as follows:
Select from the database those lines in which names were detected
Query the VIAF API to suggest possible matches for each name detected by NER, limited to three suggestions per case and restricted by the suggested authors' dates of birth, to exclude authors who may not yet have been born at the time of publication
Add, for each case, the VIAF ID (viaf_id) and the associated name to the existing data
Main achievement: PostgreSQL subdataset containing lines with authors, to allow for querying and refining for specific purposes
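The date-of-birth restriction in the second filtering round can be illustrated with a short sketch. It deliberately leaves out the VIAF request itself and works on candidates that have already been retrieved. The `Candidate` structure, the `keep_unknown_birth` switch and the choice to filter before applying the limit of three are assumptions made for illustration; the report does not specify these details.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Candidate:
    """Hypothetical shape of one suggestion returned for a detected name."""
    viaf_id: str
    name: str
    birth_year: Optional[int]  # None when no birth date is recorded


def plausible_candidates(candidates: Iterable[Candidate], publication_year: int,
                         limit: int = 3,
                         keep_unknown_birth: bool = False) -> List[Candidate]:
    """Keep at most `limit` candidates born before the issue was published."""
    kept = []
    for candidate in candidates:
        if candidate.birth_year is None:
            # Assumption: candidates without a birth date are dropped by default.
            if keep_unknown_birth:
                kept.append(candidate)
        elif candidate.birth_year < publication_year:
            kept.append(candidate)
    return kept[:limit]
```

Each retained candidate carries the identifier and name that are then attached to the stored line, leaving the final decision to the manual validation stage.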
Phase IIIb
The authors automatically identified in the previous stage undergo a manual review in three steps:
1. Every author suggested by VIAF is evaluated within the context of the line and validated or not as a “probable author” (validation involves considering their condition as a “world literature author” and the positive identification of the name as quoted in the original text);
2. Every “probable author” is looked up in the issue and line indicated;
3. If there is a translation signed by this author, the translation text is registered, adding relevant metadata, such as source and target language, translator, and source text.
Main achievement (Milestone 4): subdataset of translated texts (in progress)
Phase IV - V (WP 6: Analysis)
Tracing of circulation patterns of literary translations and identifying cultural mediators using data visualization and archival and biographical research. Macro-analysis of most translated authors, most and least frequent source languages, usual literary genres, accreditation or not of translators in the corpus (in progress). Interdisciplinary approach to the analysis of data, using close-reading and a sociological and hermeneutical approach.
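As an illustration of the macro-analysis described above, the following sketch counts authors, source languages and genres, and computes the share of translations with a credited translator. The `TranslationRecord` fields are a hypothetical simplification of the metadata registered in Phase IIIb, not the schema of the project's database.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TranslationRecord:
    """Hypothetical, simplified record of one translated text."""
    author: str
    source_language: str
    target_language: str
    genre: str
    translator: Optional[str] = None  # None when the translation is uncredited


def macro_summary(records: Iterable[TranslationRecord]) -> dict:
    """Frequency tables (most frequent first) and the share of credited translations."""
    records = list(records)
    credited = sum(1 for record in records if record.translator)
    return {
        "authors": Counter(r.author for r in records).most_common(),
        "source_languages": Counter(r.source_language for r in records).most_common(),
        "genres": Counter(r.genre for r in records).most_common(),
        "credited_share": credited / len(records) if records else 0.0,
    }
```

Reading each list from its first entry gives the most frequent values and from its last entry the least frequent ones, which corresponds to the "most and least frequent source languages" mentioned above.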
Main achievement: publications and papers in international conferences on the following topics:
- Distribution of translated authors in the anarchist press vs. the cultural press (Campanella, L.; Fólica, L. & Ikoff, V. (accepted). ‘From Authorship to Translation? Macroanalysis of Foreign Literature in Rio de la Plata Periodicals of the Early 20th Century’. In: D. Roig-Sanz & Ph. Hofeneder (Eds.) Digital Translation History. Processing Historical Data with New Methodologies. Amsterdam-Philadelphia: John Benjamins)
- Distribution of translated authors in the anarchist press vs. other political press (Campanella, L. ‘Políticas de la traducción literaria en algunas revistas politizadas hispanoamericanas del cambio de siglo’. XX AHILA (Asociación de Historiadores Latinoamericanistas Europeos) Conference, Università di Napoli L'Orientale, Italy 2-6.9.2024).
- Multilingualism: circulation of texts in translation in different linguistic contexts (Campanella, L. (in preparation). ‘Littérature et traduction littéraire dans la presse allophone et multilingue anarchiste à Buenos Aires (1890-1900)’. Revue des langues néo-latines).
- Women writers, translators and publications directed by women in a predominantly male publishing circuit (‘Translators and Literary Translations in Anarchist periodicals in Spanish-Speaking cities: building a gendered corpus’. International Conference Translation and the Periodical, Ghent University, Belgium, 13-15.9.2023)
- Creation of an ‘anarchist canon’ and author consecration in the anarchist framework (‘Comrades Ibsen and Mirbeau...’ in Fernández and Laura Galián (eds.), Translation and Anarchism: Resistance, Expansion, and Renewal, accepted for publication).
- Translation publishing practices, including ad hoc translations and the reuse and reprinting of uncredited translations (Campanella, L. (2024). ‘Zurrando a los pobres. Translations of Baudelaire's Spleen de Paris in El Pueblo, Montevideo, 1905’. Revista Chilena de Literatura, 109, 85-128; Campanella, L. (in preparation). ‘Subverting the Parisian Méridien: The Circulation of Translated Literary Texts in Rio de la Plata Anarchist Periodicals, 1890-1910’. In: E. Bournot; J.J. Locane & R. Almendros (Eds.). Beyond the World Capital of Letters: Negotiating the Literary across the Global South. Bloomsbury)
- Circulation of translations (Podcast ‘Textos viajeros / Travelling texts’, Episode 1 ‘Los tejedores, de Silesia a Buenos Aires’; Episode 2 ‘The rag-picker, from Paris to New York’)
- Characterisation of anarchist women translators and editors (Campanella, L. and Migueláñez, M. (Eds.) (in preparation). Anarchist Women Translating Ideas: Multilingualism and Intermediality. Amsterdam: De Gruyter; Migueláñez, M. and Campanella, L. (Eds.). Anarquistas editoras. Biografías políticas en femenino. Granada: Comares)