Community Research and Development Information Service - CORDIS


The Wordmap Termfinder is a technology for extracting terms and ontology fragments from text. These terms and fragments are useful for semi-automating ontology construction, information extraction, and many other NLP tasks.

The existing code is structured as a single self-contained Java library for easy integration with other products.

Key innovative features are: algorithmic efficiency in a combinatorially difficult domain; the possibility of using a rich target tagset (C5, though Penn is also supported); and the possibility of using multiple term extraction algorithms, including a new non-statistical method.

The Termfinder is mature code; Wordmap will begin demonstrations to customers in Q4 2004, and expects to sell products containing this technology in 2005.

The Termfinder also uses a standardized XML format for unambiguously specifying Brill Tagger transformations. We expect to open-source this aspect of the technology as an NLP community resource.
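
For readers unfamiliar with Brill Tagger transformations, the following is a minimal Java sketch of what a single transformation does. It is not the Termfinder's code and does not reproduce its XML format, which is not described here. The class name BrillRuleSketch, the Rule type, and the apply method are hypothetical names chosen for illustration. The sketch assumes the simplest kind of rule, "change tag A to tag B when the previous tag is Z", and uses Penn tags for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BrillRuleSketch {

    /** Hypothetical rule type: retag `from` as `to` when the previous token's tag is `prevTag`. */
    static final class Rule {
        final String from;
        final String to;
        final String prevTag;

        Rule(String from, String to, String prevTag) {
            this.from = from;
            this.to = to;
            this.prevTag = prevTag;
        }
    }

    /**
     * Applies the rules in order. Each rule scans the tag sequence left to right
     * and rewrites matching tags in place (changes take effect immediately).
     * The input list is not modified; a corrected copy is returned.
     */
    static List<String> apply(List<String> tags, List<Rule> rules) {
        List<String> out = new ArrayList<>(tags);
        for (Rule rule : rules) {
            for (int i = 1; i < out.size(); i++) {
                boolean tagMatches = out.get(i).equals(rule.from);
                boolean contextMatches = out.get(i - 1).equals(rule.prevTag);
                if (tagMatches && contextMatches) {
                    out.set(i, rule.to);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Initial (baseline) tags for "I want to race": "race" is wrongly tagged as a noun.
        List<String> initial = Arrays.asList("PRP", "VBP", "TO", "NN");

        // Illustrative rule: a noun tag directly after TO becomes a base-form verb.
        List<Rule> rules = Arrays.asList(new Rule("NN", "VB", "TO"));

        System.out.println(apply(initial, rules)); // prints [PRP, VBP, TO, VB]
    }
}
```

A format for specifying such transformations would need to record at least the three fields of the rule above (source tag, target tag, and triggering context), together with the order in which the rules are applied, since later rules operate on the output of earlier ones.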

Reported by

Wordmap Ltd
26 Upper Borough Walls
BA1 1RH Bath
United Kingdom