Community Research and Development Information Service - CORDIS


PARMENIDES Report Summary

Project ID: IST-2001-39023
Funded under: FP5-IST
Country: United Kingdom


The Wordmap termfinder is a technology for extracting terms and ontology fragments from text. Terms and fragments are useful for semi-automating ontology construction, information extraction, and many other NLP tasks.

The existing code is structured as a single self-contained Java library for easy integration with other products.

Key innovative features are: algorithmic efficiency in a combinatorially difficult domain; the possibility of using a rich part-of-speech tagset (C5, though Penn is also supported); and the possibility of using multiple term extraction algorithms, including a new non-statistical method.
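As a rough illustration of what non-statistical term extraction over tagged text can look like, the sketch below applies a generic part-of-speech pattern filter: it collects runs of adjectives and nouns that end in a noun and keeps the multi-word ones as term candidates. This is a textbook-style technique and is not claimed to be Wordmap's method; the class and method names (TermPatternSketch, Tagged, extractCandidates) are hypothetical, and Penn Treebank tags are assumed as input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is NOT the Wordmap Termfinder API.
final class TermPatternSketch {

    // A token paired with its (assumed) Penn Treebank part-of-speech tag.
    static final class Tagged {
        final String word;
        final String tag;
        Tagged(String word, String tag) {
            this.word = word;
            this.tag = tag;
        }
    }

    static boolean isNoun(String tag) {
        return tag.startsWith("NN"); // NN, NNS, NNP, NNPS
    }

    static boolean isAdjective(String tag) {
        return tag.startsWith("JJ"); // JJ, JJR, JJS
    }

    // Scans for maximal runs of adjectives and nouns, trims each run so that
    // it ends in a noun, and keeps only candidates of two or more tokens.
    static List<String> extractCandidates(List<Tagged> sentence) {
        List<String> terms = new ArrayList<>();
        int i = 0;
        while (i < sentence.size()) {
            String tag = sentence.get(i).tag;
            if (!isNoun(tag) && !isAdjective(tag)) {
                i++;
                continue;
            }
            int start = i;
            int lastNoun = -1;
            while (i < sentence.size()
                    && (isNoun(sentence.get(i).tag) || isAdjective(sentence.get(i).tag))) {
                if (isNoun(sentence.get(i).tag)) {
                    lastNoun = i;
                }
                i++;
            }
            if (lastNoun > start) {
                StringBuilder sb = new StringBuilder();
                for (int k = start; k <= lastNoun; k++) {
                    if (k > start) {
                        sb.append(' ');
                    }
                    sb.append(sentence.get(k).word);
                }
                terms.add(sb.toString());
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        List<Tagged> sentence = Arrays.asList(
                new Tagged("The", "DT"), new Tagged("termfinder", "NN"),
                new Tagged("extracts", "VBZ"), new Tagged("ontology", "NN"),
                new Tagged("fragments", "NNS"), new Tagged("from", "IN"),
                new Tagged("plain", "JJ"), new Tagged("text", "NN"),
                new Tagged(".", "."));
        System.out.println(extractCandidates(sentence));
        // prints [ontology fragments, plain text]
    }
}
```

A richer tagset such as C5 would allow finer-grained patterns than the two tag prefixes used here; the sketch keeps to Penn tags only for brevity.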

The Termfinder is mature code; Wordmap will begin demonstrations to customers in Q4 2004, and expects to sell products containing this technology in 2005.

The Termfinder also uses a standardized XML format for unambiguously specifying Brill Tagger transformations. We expect to release this aspect of the technology as open source, as a resource for the NLP community.
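For readers unfamiliar with Brill Tagger transformations, the sketch below shows the general idea of one common rule shape: change tag A to tag B when the preceding tag is C. It does not reproduce the Termfinder's XML format, which is not shown here; the class name BrillRuleSketch and its fields are hypothetical, and the rule is applied left to right with immediate effect as a simplifying assumption.

```java
import java.util.Arrays;

// Hypothetical illustration of one Brill-style contextual rule;
// it does not reflect the Termfinder's XML schema or code.
final class BrillRuleSketch {
    final String fromTag; // tag to be rewritten
    final String toTag;   // replacement tag
    final String prevTag; // required tag of the preceding token

    BrillRuleSketch(String fromTag, String toTag, String prevTag) {
        this.fromTag = fromTag;
        this.toTag = toTag;
        this.prevTag = prevTag;
    }

    // Rewrites matching tags in place, scanning left to right,
    // and returns the number of tags that were changed.
    int apply(String[] tags) {
        int changed = 0;
        for (int i = 1; i < tags.length; i++) {
            if (tags[i].equals(fromTag) && tags[i - 1].equals(prevTag)) {
                tags[i] = toTag;
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        // "She wants to race", with "race" initially mis-tagged as a noun.
        String[] tags = {"PRP", "VBZ", "TO", "NN"};
        BrillRuleSketch rule = new BrillRuleSketch("NN", "VB", "TO");
        rule.apply(tags);
        System.out.println(Arrays.toString(tags));
        // prints [PRP, VBZ, TO, VB]
    }
}
```

An unambiguous serialization of such a rule needs to fix at least these three pieces of information plus the rule template itself, which is presumably the kind of detail a standardized format pins down.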


Will LOWE, (Senior Researcher)
Tel.: +44-122-5358184
Fax: +44-122-5358183