Objective
FACILE focuses on a pilot system to handle message dispatching and routing of texts within financial establishments, such as banks or trading companies. The system, which integrates shallow and deep text analysis techniques, supported by a rich domain model and a powerful pre-processor, will act as a real-time concept-based filtering and categorisation tool for short multilingual electronic texts and messages in the financial domain. Following a first rough and ready classification of texts to identify the main topics, a second, deeper analysis will enable information extraction on selected parts to produce a finer-grained classification.
Progress
The FACILE project provides a concept-based filtering and categorisation tool which understands the financial and business domains, handles different languages and provides a rich and accurate categorisation of information.
The first year of work focused on fixing precise user requirements, designing the technical architecture of the kernel categorisation system and defining the application architecture. Capitalising on the experience of the preparatory Cobalt project, the first FACILE prototype, available at the end of '96, already provided significant coverage and will soon be made available at users' sites for validation and elicitation of feedback. The first version of the prototype is able to cope with English and Italian texts, with German and Spanish applications to be ready soon after.
User studies have been carried out to support the market survey and user requirements activities. Input to these has been provided by the user partners, which include a savings bank, a rating company, and an information provider already offering an online information service to a range of SMEs. Other organisations involved in the wider interest group include banks and major financial institutions, news agencies and SMEs.
User Requirements
The users surveyed all shared a common need for tools able to provide a sound and robust classification of domain-specific texts, well beyond the current capabilities of generic search and classification tools. The greatest advantage of a specialised tool for the business market comes from the integration of language engineering techniques with domain knowledge, to enable an intelligent, context-aware classification.
The user requirements survey and market assessment have identified a number of features that have been incorporated into the FACILE design, including the ability to link to web information on the one hand and to local databases on the other, to handle different text formats and to structure incoming information flexibly.
Market Survey
The market is evolving rapidly, with the Internet as the main driver for change. The amount of free and low-cost information available on the one hand, and the lowering of service costs on the other, have completely changed the market. It was once dominated by high-cost services provided by players such as Reuters and Bloomberg; the Internet explosion has since led to the development of sophisticated search and classification tools able to 'crawl the web', processing the large volume of information available at low cost.
The traditional information providers are also approaching the Web as a means to offer new services and provide new access channels to their information. New services offering user customisation of news delivery, such as NewsScan and Infoseek, are starting to appear. However, none of these services provides multi-language coverage or takes linguistic features and content into account. Most of the services available only cover US or, at best, UK sources, and do not take into account the wider local European market. On the consumer side, the need for advanced tools for managing the information glut is noted by several studies, such as one by Firefly Communications (FT, 16/10/96), showing that information overload is now a frequent source of management stress, mental anguish and even physical illness.
All this clearly shows that the market is exploitable and, in a limited fashion, is already being exploited for commercial purposes. Market size is difficult to assess, but numbers are so high (services such as Pointcast reached 1 million users in a few months) that even direct competition with a comparable service would be profitable.
A consequence of such a quickly evolving market is the difficulty of defining a detailed business model that will remain useful in the future. Nevertheless, under some hypotheses, the expected impact of the FACILE system has been analysed. Encouragingly, the evidence is that a range of completely different figures inside business organisations, from financial analysts to press office personnel, would benefit from a highly personalised service able to provide a detailed classification.
System Architecture
The FACILE system is designed to enable the development of different applications and flexible integration into the user's working environment. The design is composed of a kernel system, built up from different text analysis modules and able to act as a 'categorisation server', and of tools that support the development of specific categorisation applications.
The kernel system comprises text analysis modules and a set of declarative resources, specifying the linguistic, domain-specific and control knowledge used to drive the text analysis. Declarative resources are partly fixed (base lexicons and dictionaries), partly application dependent (specific patterns and control knowledge), and largely language independent.
The main components of the system are:
- the pre-processor, which handles different text input formats, normalisation, segmentation and tokenisation. It provides morphological information on tokens and some semantic information about selected tokens, such as proper names.
- the shallow analyser, which performs a first level categorisation of texts based on pattern matching techniques, supported by domain knowledge.
- the deep analyser, which extracts further information according to the selected categories of interest, thus enabling a richer categorisation.
- the control module, which manages the control flow across modules.
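The flow across these four components can be illustrated with a minimal sketch. It is not the FACILE implementation: the function names, the category labels and the regular-expression patterns below are all hypothetical, chosen only to show a first shallow pass by pattern matching followed by a deeper extraction step restricted to the categories already selected.

```python
import re
from dataclasses import dataclass

# Hypothetical declarative resources (not FACILE's actual lexicons or patterns).
# Shallow patterns: category name -> keyword patterns matched on lower-cased text.
SHALLOW_PATTERNS = {
    "merger": [r"\bacquir\w*", r"\bmerg\w*", r"\btakeover\b"],
    "rates": [r"\binterest rate\w*", r"\bbasis points?\b"],
}

# Deep patterns: category name -> extraction pattern with named slots.
DEEP_PATTERNS = {
    "merger": re.compile(
        r"(?P<buyer>[A-Z]\w+) (?:acquires|acquired|to acquire) (?P<target>[A-Z]\w+)"
    ),
}


@dataclass
class Result:
    categories: list  # coarse categories found by the shallow pass
    details: dict     # per-category slots filled by the deep pass


def preprocess(text):
    """Pre-processor: normalise whitespace and split into sentences (toy segmentation)."""
    normalised = " ".join(text.split())
    parts = re.split(r"(?<=[.!?])\s+", normalised)
    return [p for p in parts if p]


def shallow_analyse(sentences):
    """Shallow analyser: first-level categorisation by pattern matching."""
    text = " ".join(sentences).lower()
    return sorted(
        category
        for category, patterns in SHALLOW_PATTERNS.items()
        if any(re.search(pattern, text) for pattern in patterns)
    )


def deep_analyse(sentences, categories):
    """Deep analyser: extract further information for the selected categories only."""
    details = {}
    for category in categories:
        pattern = DEEP_PATTERNS.get(category)
        if pattern is None:
            continue  # no deep pattern defined for this category
        for sentence in sentences:
            match = pattern.search(sentence)
            if match:
                details[category] = match.groupdict()
                break
    return details


def categorise(text):
    """Control module: run the stages in order and collect the result."""
    sentences = preprocess(text)
    categories = shallow_analyse(sentences)
    details = deep_analyse(sentences, categories)
    return Result(categories, details)
```

In this sketch, a call such as `categorise("Alpha acquires Beta.")` would first be tagged with the coarse category `merger` and then refined with the buyer and target names. Keeping the patterns in plain dictionaries, separate from the analysis functions, mirrors the idea of declarative resources that can be swapped per application.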
The Way Ahead
In 1997 the consortium will work on developing the system and making it available for evaluation. Preliminary activities and existing prototypes have enabled initial development of the product on a sound basis. Users will be involved in early testing and evaluation as soon as new versions are made available, which will ensure that the system is able to cope with real-world texts.
The ability to satisfy user requirements and the performance of the system will be measured both in the field and with a carefully selected and sound evaluation methodology. This, coupled with the fact that there is an identified market for a multilingual classification system, should ensure the success of FACILE derivatives.
One of the preconditions for increasing the competitiveness of European companies is the accessibility of information about events, market trends, competitors' actions and new products. Because of the increasing amount of news available, important facts are often hidden in large quantities of information. The success or failure of a company can depend on its ability to find the right facts at the right time.
The FACILE project will produce a concept-based filtering and categorisation tool able to understand different languages, which will transform the way people access and work with electronic text in the financial domain. Our target sector is the financial one because here we foresee a concrete business opportunity, opened by the lack of real Language Engineering in the existing specific platforms.
We will provide both a software engine and complete end-user solutions for this vertical market. Our software will combine the sophistication of concrete language engineering technologies with ease of use, to deliver better performance than simple Boolean, keyword or statistical methods.
The practical, application-oriented direction of the work is guaranteed by a strong end-user presence within the core consortium and in the Interest Group, which will constantly back and evaluate project work. The Interest Group is mainly composed of institutions that are final users of information, together with news providers.
Evaluation, performed in close collaboration with end-users, will be a continuous activity during the project lifetime, and quality assurance guidelines and procedures will be set up during the first phase of the project itself.
Progress and results
The industrial companies in the FACILE consortium already have the strong position necessary to make an impact on the financial market, and plan to use the results of the project on a large scale to make a decisive move into the telematics services arena. The outcome of the project will be a very effective and fast way to manage documentation coming from leading telematics suppliers. The result will be the right news in the right hands.
Fields of science
- natural sciences > computer and information sciences > software
- natural sciences > computer and information sciences > internet
- natural sciences > computer and information sciences > databases
- natural sciences > chemical sciences > inorganic chemistry > transition metals
- social sciences > economics and business > business and management > business models
Topic(s)
Call for proposal
Data not available
Funding Scheme
CSC - Cost-sharing contracts
Coordinator
20124 Milano
Italy