Content archived on 2024-05-07

SHALLOW PARSING AND KNOWLEDGE EXTRACTION FOR LANGUAGE ENGINEERING

Exploitable results

SPARKLE provides advanced methods and tools for powerful, flexible and automatic acquisition of lexical information from text corpora. The tools fall into two categories:
- robust, shallow parsers of unrestricted text, and
- lexical acquisition systems, capable of learning (from pre-parsed texts) aspects of word knowledge needed for Language Engineering applications.

The tools are based on up-to-date finite-state technology and were originally developed for statistical and inferential routines that efficiently resolve data deficiencies. The methods are applicable to any type of text and were tested with remarkable results in English, French, German and Italian.

SPARKLE is able to acquire lexical information for verbs, probably the most elusive and challenging category for lexical analysis, as well as the most important one for Language Engineering applications such as machine translation, information retrieval and speech recognition. The variety of syntactic patterns typical for a verb is detected efficiently, then statistically validated and automatically typed with respect to semantic preferences.

SPARKLE technology has been used for intelligent cross-lingual text editing and translation filtering within multilingual information retrieval systems (Xerox, Sharp) and in speech recognition systems (Daimler-Benz), and has demonstrated a steady improvement in performance. Acquired information was also used for automatic word sense disambiguation.

Work in SPARKLE actively contributes to efficient development in: automatic parsing of unrestricted text, computational lexical databases, speech dialogue systems, and cross-lingual information retrieval, exchange and filtering.

Project URL: http://www.ilc.pi.cnr.it/sparkle.html
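The description above says that the syntactic patterns detected for a verb are statistically validated, but it does not say which test is used. As a minimal sketch only, the following assumes a one-sided binomial filter, a technique commonly used for this kind of validation in lexical acquisition. The frame labels, the counts, the `error_rate` and the `alpha` threshold are all hypothetical and are not taken from the SPARKLE project.

```python
from math import comb


def binomial_tail(n: int, m: int, p: float) -> float:
    """Return P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


def validate_frames(frame_counts, error_rate=0.05, alpha=0.05):
    """Keep the frames seen too often to be explained by parser noise alone.

    frame_counts: mapping from a frame label to the number of times the
                  (hypothetical) shallow parser proposed it for one verb.
    error_rate:   assumed probability that a single parse wrongly proposes
                  a given frame (an illustrative value, not a measured one).
    alpha:        significance threshold for accepting a frame.
    """
    total = sum(frame_counts.values())
    accepted = []
    for frame, count in frame_counts.items():
        # Null hypothesis: the verb does not take this frame, so every
        # observation of it is a parser error with probability error_rate.
        if binomial_tail(total, count, error_rate) < alpha:
            accepted.append(frame)
    return sorted(accepted)


# Hypothetical counts for a single verb over 100 parsed occurrences.
counts = {"NP": 60, "NP_PP": 35, "SCOMP": 5}
print(validate_frames(counts))
```

With these illustrative numbers the two frequent frames pass the filter, while the frame seen only 5 times in 100 is indistinguishable from the assumed 5% noise level and is rejected. A real system would need per-frame error estimates and would then go on to the semantic typing step, which this sketch does not cover.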
