Community Research and Development Information Service - CORDIS


PARMENIDES Report Summary

Project ID: IST-2001-39023
Funded under: FP5-IST
Country: United Kingdom


The Wordmap Termfinder is a technology for extracting terms and ontology fragments from text. Extracted terms and fragments are useful for semi-automating ontology construction, information extraction, and many other NLP tasks.

The existing code is structured as a single self-contained Java library for easy integration with other products.

Key innovative features are: algorithmic efficiency in a combinatorially difficult domain; the possibility of using a rich target tagset (C5, though Penn is also supported); and the possibility of using multiple term extraction algorithms, including a new non-statistical method.

The Termfinder is mature code; Wordmap will begin demonstrations to customers in Q4 2004, and expects to sell products containing this technology in 2005.

The Termfinder also uses a standardized XML format for unambiguously specifying Brill Tagger transformations. We expect to release this aspect of the technology as open source, as an NLP community resource.
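
For readers unfamiliar with Brill Tagger transformations, the following is a minimal Java sketch of what such a transformation does: a contextual rule of the form "change tag A to tag B when the preceding tag is Z" is applied to a sequence of part-of-speech tags. This is not the Termfinder's XML format or code, neither of which is described in this summary. The class `BrillRuleSketch`, the record `PrevTagRule`, and the method `apply` are hypothetical names chosen for illustration, and checking each rule against a snapshot of the tags taken before the rule is applied is a simplifying choice of this sketch. It assumes Java 16 or later.

```java
import java.util.ArrayList;
import java.util.List;

class BrillRuleSketch {

    // Hypothetical rule shape: retag fromTag as toTag when the
    // preceding token carries prevTag. Real Brill rule templates
    // cover more contexts (following tags, words, wider windows).
    record PrevTagRule(String fromTag, String toTag, String prevTag) {}

    // Applies the rules in order. Each rule is checked against a
    // snapshot of the tags as they stood before that rule ran, so
    // a rule does not react to its own changes within one pass.
    static List<String> apply(List<String> tags, List<PrevTagRule> rules) {
        List<String> current = new ArrayList<>(tags);
        for (PrevTagRule rule : rules) {
            List<String> before = new ArrayList<>(current);
            for (int i = 1; i < before.size(); i++) {
                boolean tagMatches = before.get(i).equals(rule.fromTag());
                boolean contextMatches = before.get(i - 1).equals(rule.prevTag());
                if (tagMatches && contextMatches) {
                    current.set(i, rule.toTag());
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // Initial tags for "to run the race" using Penn-style tags,
        // with "run" initially mis-tagged as a noun.
        List<String> initial = List.of("TO", "NN", "DT", "NN");
        List<PrevTagRule> rules = List.of(new PrevTagRule("NN", "VB", "TO"));
        System.out.println(apply(initial, rules)); // prints [TO, VB, DT, NN]
    }
}
```

Only the noun directly after `TO` is retagged; the final `NN` follows `DT` and is left unchanged. An unambiguous rule format needs to pin down exactly these details: the tag being rewritten, the context that triggers the rule, and the order in which rules are applied.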


Will LOWE (Senior Researcher)
Tel.: +44-122-5358184
Fax: +44-122-5358183