Community Research and Development Information Service - CORDIS

Descriptive lexical specifications and tools for corpus-based lexicon building

Tools are being developed for the efficient construction and exploration of lexicographic corpora and for the selective retrieval of lexicographically relevant material. The project provides an easy-to-use, well-documented descriptive scheme for lexical representation, which improves the consistency of manual and semi-automatic data entry. The tools also support the import and export of lexical information. The project focuses on two areas: lexicon-based, syntactically oriented retrieval of corpus evidence from morphosyntactically and syntactically annotated text corpora (the Search Condition Generator), and exemplification with a fragment of semantically, syntactically and morphosyntactically described verbs of perception and communication in 5 languages. The project has produced prototypes of a data entry facility: a typed feature structure (TFS) mode for Emacs, a hierarchy viewer and a TFS viewer. A more widely usable Search Condition Generator is in preparation. Several hundred sentences have been encoded in detail for each language. A TFS dictionary has been produced with entries for perception verbs of English, French, Italian, Danish and Dutch, related to the corpus sentences. In addition, reports on the methodology with detailed examples are available. The TFS-based tools are generic and can be used with any feature-structure-based linguistic representation.
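
As an illustration only, the idea of retrieving corpus evidence from a feature-structure lexicon entry can be sketched in Python. This is not the project's Search Condition Generator or its TFS formalism: the nested-dictionary representation (untyped, unlike real TFS), the entry `SEE`, the toy `CORPUS`, and the functions `unify`, `search_condition` and `matches` are all hypothetical names introduced for the example.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts or atomic values).

    Returns the merged structure, or None if the two clash.
    Simplification: no types, no reentrancy, unlike real TFS.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, val in b.items():
            if key in out:
                merged = unify(out[key], val)
                if merged is None:
                    return None  # clash on a shared feature
                out[key] = merged
            else:
                out[key] = val
        return out
    return a if a == b else None


# Hypothetical lexicon entry for a perception verb.
SEE = {
    "lemma": "see",
    "cat": "verb",
    "sem": {"class": "perception"},
    "subcat": {"object": "NP"},
}

# Hypothetical annotated corpus: one list of token structures per sentence.
CORPUS = [
    [
        {"form": "She", "lemma": "she", "cat": "pronoun"},
        {"form": "saw", "lemma": "see", "cat": "verb"},
        {"form": "it", "lemma": "it", "cat": "pronoun"},
    ],
    [
        {"form": "He", "lemma": "he", "cat": "pronoun"},
        {"form": "left", "lemma": "leave", "cat": "verb"},
    ],
]


def search_condition(entry):
    """Derive a partial structure from a lexicon entry to use as a query."""
    return {"lemma": entry["lemma"], "cat": entry["cat"]}


def matches(sentence, condition):
    """True if any token in the sentence unifies with the condition."""
    return any(unify(token, condition) is not None for token in sentence)


hits = [s for s in CORPUS if matches(s, search_condition(SEE))]
```

Here `hits` holds the sentences containing a token compatible with the entry. A real system would also have to check the syntactic annotation (for example, the subcategorisation frame), which this sketch leaves out.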

Contact

Ulrich HEID (Project Manager)
Tel.: +49-71-11211373
Fax: +49-71-11211366