Work performed:
Developed an interface to link words in poetry in the corpus to the dictionary (lemmatisation), including variants from different manuscripts of the same poem
Performed lemmatisation of the published corpus (100%) and further editions in progress (in total 117,000 words)
Added and lemmatised variants from the corpus (approx. 25% completed)
Developed an interface for viewing and querying the resource, as well as for editing and updating it (http://lexiconpoeticum.org)
Connecting the wordlist of LP to the existing Dictionary of Old Norse Prose (ONP): a mixture of automated linking and manual linking and checking
Working with ONP to make it more accessible and available to other projects so that LP can continue to link to it
Developed a system for importing XML-based corpora and lemmatising them automatically, manually, or both
Incorporated Codex Regius project’s XML edition to supplement the Skaldic Project’s coverage of the poetic corpus (resulting in >99% of total target corpus available)
Started developing an ontology for the words in the lexicon, based on the native ontology found explicitly in native poetological works and implicitly in the extended diction system used by poets
Developed quantitative methods for understanding the lexicon, including a method for comparing lexicon size across corpora of different sizes
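The difficulty of comparing lexicon size across corpora of different sizes can be illustrated with a minimal sketch. It is not the project's own method: it simply counts distinct lemmas in the first n tokens of each corpus, so that both are measured at the same size. The function names and the toy lemma lists are hypothetical and only for illustration.

```python
from typing import Sequence, Tuple


def lexicon_size_at(lemmas: Sequence[str], n_tokens: int) -> int:
    """Number of distinct lemmas among the first n_tokens tokens."""
    return len(set(lemmas[:n_tokens]))


def compare_at_common_size(
    corpus_a: Sequence[str], corpus_b: Sequence[str]
) -> Tuple[int, int, int]:
    """Measure both corpora at the size of the smaller one.

    Returns (common_size, lexicon_size_a, lexicon_size_b).
    """
    n = min(len(corpus_a), len(corpus_b))
    return n, lexicon_size_at(corpus_a, n), lexicon_size_at(corpus_b, n)


# Invented toy data: each list stands for a lemmatised corpus in text order.
small = ["hestr", "konungr", "hestr", "sverð"]
large = ["konungr", "hestr", "konungr", "konungr",
         "skip", "sverð", "hestr", "maðr"]

print(compare_at_common_size(small, large))
```

Comparing raw totals would favour the larger corpus simply because it is larger. Truncating to a common size removes that effect, although the result then depends on text order, so a more careful version would average over several random samples.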
The project so far has resulted in a number of methodological advancements, the development of a very large public resource, and some initial results in the quantitative analysis of the lexicon in relation to the corpus:
A new method for efficient assisted manual linking of corpus words to dictionary headwords (presented at DHN 2017, Euralex 2018)
A new method for highly accurate automated linking of words to dictionary headwords (presented at Euralex 2018)
A new method, with results, for comparing corpora of different sizes, taking into account the non-linear relationship between lexicon size and corpus size (presented at Saga Conference 2018)
A native ontology implemented for semantic classification of words in the lexicon (presented at ICHLL9)
Modelling of lexicon size based on identifying constants for each corpus that can be used to predict lexicon size according to corpus size
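The idea of per-corpus constants that predict lexicon size from corpus size can be sketched with a power-law model of the form V = K * N^beta (often called Heaps' law), where N is the number of tokens and V the number of distinct lemmas. The report does not say which model the project uses, so this choice is an assumption, and the function names and data points below are hypothetical.

```python
import math
from typing import Iterable, Tuple


def fit_power_law(points: Iterable[Tuple[int, int]]) -> Tuple[float, float]:
    """Fit V = K * N**beta by least squares on log V = log K + beta * log N.

    points: (corpus_size_in_tokens, lexicon_size) pairs for one corpus.
    Returns the two constants (K, beta).
    """
    pts = list(points)
    xs = [math.log(n) for n, _ in pts]
    ys = [math.log(v) for _, v in pts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


def predict_lexicon_size(k: float, beta: float, n_tokens: int) -> float:
    """Predicted number of distinct lemmas in a corpus of n_tokens tokens."""
    return k * n_tokens ** beta


# Synthetic measurements generated from K = 10, beta = 0.5 (not project data).
growth = [(100, 100), (400, 200), (1600, 400), (10000, 1000)]
k, beta = fit_power_law(growth)
print(round(k, 3), round(beta, 3))
print(round(predict_lexicon_size(k, beta, 2500)))
```

Once K and beta have been estimated for each corpus, two corpora can be compared by evaluating both fitted curves at the same N, rather than by comparing their raw lexicon totals.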
The progress of the project is also documented on the project website (see http://skaldic.abdn.ac.uk/m.php?p=doclp&i=989)