Periodic Reporting for period 1 - PHILOGLOSSA (Pre-Hellenic Loanwords in Greek: Lexicon of the Substrate Analysed)
Reporting period: 2023-07-01 to 2025-08-31
To judge from the amount of Ancient Greek vocabulary without an inherited etymology (i.e. one transmitted from reconstructed Proto-Indo-European), it is clear that prehistoric language contact played a significant role in the early evolution of the Ancient Greek language. These words are found in semantic areas where an incoming linguistic community could be expected to borrow, such as vocabulary for the local landscape, weather phenomena, plants and animals, but also in areas that indicate how incoming Greek speakers integrated with, and co-evolved with, pre-existing economic and societal structures and religious practices in the Aegean. The nature of the language contact scenario is, however, poorly understood, since the languages that early Greek was in contact with left behind either no linguistic remains of their own or only as yet undeciphered documents. This has made the identification and historical interpretation of these loanwords controversial in the literature.
The PHILOGLOSSA project aimed to bridge past approaches to prehistoric loanwords in Ancient Greek, to make the materials more easily accessible, and to advance new and more reliable methods for their identification and analysis. The first goal of the project was to create an open dataset of the materials alleged in the literature to be prehistoric loanwords from non-Indo-European sources, which could serve as the basis for further analysis using large-data approaches. Importantly, this dataset was designed to include contextual philological metadata to assist with the comparative analysis of the materials, as well as bibliographic references to key literature, so that it can also serve as a point of reference. The second major goal of the project was to use the dataset to devise methods for identifying linguistic features that could be used for classifying or diagnosing loanwords from common sources. Additionally, the project aimed to bridge gaps in the literature between historical linguistics and archaeology by explicitly incorporating archaeological perspectives into the methodological analysis.
1) A first pass of lexical data collection, drawing on the scientific literature that has identified alleged prehistoric loanwords, provided the primary basis for the dataset. Further lexical data were added throughout the review process as they were identified, resulting in 2976 unique lexemes in the final released version (1.0) of the dataset. This initial phase of data collection also assigned classifications to groups of variant vocabulary that likely share some common history. Among the 2976 lexemes in the dataset, 962 unique classes were ultimately identified. Each lexeme was also given a semantic classification, providing an additional way to manipulate and filter the data.
2) The next phase of the project was to add attestation data to the dataset and to verify the linguistic forms as they are attested in published text editions of ancient authors and documentary texts (inscriptions and papyri). This stage addressed a shortcoming of the literature, where this information is not always systematically recorded even though it is crucial for establishing a context for each lexeme, which in turn provides criteria by which the philological reliability of a given data point may be assessed.
3) The third phase of the project was to devise a system for philological reliability judgements based on a review of the philological data. These judgements were made on the basis of various criteria, including the date of the text’s composition (which for Ancient Greek may range from second millennium BCE Mycenaean Greek texts to texts from Late Antiquity in the latter half of the first millennium CE), modality of transmission (authors transmitted by the medieval manuscript tradition, epigraphic documents or papyri), and textual genre (literary, documentary, or grammatical literature). This provided an additional measure by which the relative reliability of data could be assessed across the dataset as a whole.
4) As the philological reliability judgements were implemented, we devised methods for identifying recent prehistoric borrowings more precisely using an archaeolinguistic approach. A pilot study was designed using lexical data from semantic fields pertaining to new technologies and construction terminology particular to the archaeological culture of the Late Bronze Age Aegean. From a survey of the high-reliability lexemes and their variants in this sample, we identified three phonological features and one morphological feature that could potentially be ascribed to a prehistoric language recently in contact with Ancient Greek.
5) The final phase of the project comprised a comprehensive bibliographic review of the most relevant literature (etymological lexica, specialised lexicographical studies, individual studies) for all lexical data included in the dataset, with annotations in the notes field wherever alternative analyses are possible or have been proposed.
6) In addition to the dataset which was the primary output of the project, the project produced 3 scientific papers and 4 conference presentations. The project also organised an international conference titled “New Perspectives in the Early History of Ancient Greek” at the host institution.
7) The project also contributed to the professional development of the researcher through training activities in Balto-Slavic historical linguistics and dialectology, a field in which better documented dispersals and a recorded history of language contact provide an important parallel case study to the main research activities of the project.
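As an illustration of how the dataset described in steps 1) to 3) might be handled programmatically, the following is a minimal sketch only. It assumes a flat record per lexeme carrying a variant-class identifier, a semantic classification, and attestation metadata. All field names, category labels and sample values are hypothetical placeholders and do not reflect the released dataset's actual schema or contents.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Lexeme:
    # All field names are hypothetical, chosen only for this illustration.
    form: str            # attested form (placeholder strings below, not real data)
    variant_class: int   # id of a group of variants presumed to share a history
    semantic_field: str  # semantic classification of the lexeme
    transmission: str    # e.g. "manuscript", "epigraphic", "papyrus"
    genre: str           # e.g. "literary", "documentary", "grammatical"


def group_by_class(lexemes):
    """Collect lexemes under their variant-class id, preserving input order."""
    groups = defaultdict(list)
    for lx in lexemes:
        groups[lx.variant_class].append(lx)
    return dict(groups)


def filter_by_field(lexemes, field):
    """Return only the lexemes assigned to the given semantic field."""
    return [lx for lx in lexemes if lx.semantic_field == field]


# Placeholder records, invented purely to exercise the functions.
sample = [
    Lexeme("form-a1", 1, "construction", "epigraphic", "documentary"),
    Lexeme("form-a2", 1, "construction", "manuscript", "literary"),
    Lexeme("form-b1", 2, "flora", "papyrus", "documentary"),
]

groups = group_by_class(sample)
construction_terms = filter_by_field(sample, "construction")
```

Grouping by class and filtering by semantic field correspond to the two kinds of classification mentioned in step 1). The transmission and genre fields stand in for the metadata that step 3) names as criteria for reliability judgements, although no scoring rule is assumed here.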
Potential future developments of the project include: (a) establishing collaborations with researchers in prehistoric archaeology (including archaeobotany and zooarchaeology) to integrate the linguistic data with concrete archaeological data, in order to better elucidate the historical contexts in which loanword borrowing events took place; and (b) building a web-based interface for the published dataset, in order to make it more accessible and to allow both researchers and the general public to exploit the information more systematically.