ECRAN is a new technology for text extraction that provides linguistic resources adapted to the corpus (a tuned dictionary). The underlying technologies have been integrated in such a way that users can define their own extraction systems independently of any domain. A set of tools has been produced and integrated to allow end users to define their own extraction templates. This involves two main activities: domain customization (identification of domain-relevant verbs and collocates) and semi-automatic template creation. The tools have been tested on different domains to ensure their portability. English, French and Italian demonstrators are available with full documentation. Experiments show that these tools meet the needs of users confronted with large amounts of documentation (TV programme browsing, business and economic intelligence, etc.). In addition, a market study and an exploitation plan concerning Information Extraction for on-line media are available. The dissemination of the technologies developed for ECRAN has also been studied.
Project URL: http://www.dcs.shef.ac.uk/research/ilash/Ecran