Wider access to Europe's cultural heritage
The CHLT project aimed to adapt discoveries from the fields of computational linguistics, natural language processing and information retrieval to help researchers in the humanities advance their work. Research at the Københavns Universitet in Denmark focused on identifying which of these techniques are language independent and on optimising them for the needs of Old Norse saga texts. Researchers from the department of humanities created electronic transcriptions and high resolution images of the Old Norse sagas from their collections, while the Scandinavian Department at UCLA handled their processing. Lexical resources were digitised for this purpose, and morphological analysis tools were written that can reduce inflected forms to their possible lexical forms. Specifically, one parser takes an inflected form as its input and outputs the lexical form, while the other generates the complete paradigm for a word based on its lexical form. The source data for these parsers is drawn from Geir Zoega's "Concise Dictionary of Old Icelandic". Emerging digital libraries have made these computational techniques available to non-experts as well, rather than only to individual scholars as generic tools packaged separately from the texts upon which they operate. The Perseus digital library environment allows high resolution images of rare and fragile printed books and manuscripts to be viewed alongside automatically generated hypertexts. These hypertexts let the reader browse lexical forms together with frequency data, links to dictionary entries for a word and grammatical aids. Users of this functionality range from experts to novice readers, including students and the general public. Further collaboration is sought in order to establish an international framework with open standards for the long-term preservation of data, the sharing of metadata and interoperability between affiliated digital libraries.
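The relationship between the two parsers described above can be illustrated with a small sketch in Python. It is not the project's code: the two-word lexicon, the single table of endings (covering only regular strong masculine nouns such as "hestr") and the function names generate_paradigm and lemmatize are all assumptions made for illustration. The real tools draw on a full dictionary and must handle many more inflection classes and sound changes.

```python
from collections import defaultdict

# Toy endings for one inflection class only (regular strong masculine
# nouns such as "hestr"). An assumption for illustration, not project data.
ENDINGS = {
    ("nom", "sg"): "r", ("acc", "sg"): "", ("dat", "sg"): "i", ("gen", "sg"): "s",
    ("nom", "pl"): "ar", ("acc", "pl"): "a", ("dat", "pl"): "um", ("gen", "pl"): "a",
}

# Hypothetical mini-lexicon: lexical (dictionary) form -> stem.
LEXICON = {"hestr": "hest", "fiskr": "fisk"}


def generate_paradigm(lexical_form):
    """Second parser: build the full paradigm from a lexical form."""
    stem = LEXICON[lexical_form]
    return {slot: stem + ending for slot, ending in ENDINGS.items()}


def build_index():
    """Invert all generated paradigms: inflected form -> lexical forms."""
    index = defaultdict(set)
    for lexical_form in LEXICON:
        for form in generate_paradigm(lexical_form).values():
            index[form].add(lexical_form)
    return index


INDEX = build_index()


def lemmatize(inflected_form):
    """First parser: return every lexical form the input could belong to."""
    return sorted(INDEX.get(inflected_form, ()))
```

In this sketch, lemmatize("hesta") returns ["hestr"], and generate_paradigm("hestr") maps the dative plural slot to "hestum". Returning a list rather than a single word reflects the point that an inflected form may have more than one possible lexical form.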