MAY will provide multilingual access to Yellow Pages on the World-Wide Web, enabling users to enter queries in their native language and to search and retrieve entries for the same services offered in different countries. The system will employ concept-based search techniques, obviating the need for familiarity with any coded language, or with Yellow Pages language, terminology or traditional headings and indexes. A single query will also give easy access to differently structured and indexed directories. These techniques are expected to have an impact on application systems and services other than the one targeted by this project.
In its first year the MAY Consortium has focused on four major strategic tasks.
1) The user requirements study has collated and analysed the needs of users to enable specification of a target system, i.e. a multi-lingual service allowing end users to query various Yellow Pages databases. Existing online Yellow Pages services available in Europe have also been surveyed so that the basic functions required of a monolingual service are known.
2) Lingware specifications cover the necessary models and tools (based on Erli's lingware) for processing user queries in different languages, and the system architecture which provides access to different Yellow Pages databases.
3) Development of the required language resources, consisting of a prototype German dictionary and grammar (French and English resources are already available) which will allow processing of a set of queries in the field of transport and tourism.
4) Specification of a MAY prototype Web server available at the end of the project. This server will support access to German and French Yellow Pages databases in three languages.
Having completed these specifications, the project is moving on to integrating the linguistic resources that have been developed and implementing the model for accessing the Yellow Pages nomenclatures and databases.
Study of the User Requirements
Three basic studies were launched to define the user requirements, covering:
- existing online Yellow Pages in Europe,
- the different types of users (data providers, end users, advertisers, ...),
- the technical environment (in order to define a viable application and service).
These studies showed that Yellow Pages is one of the most familiar applications in the field of online access to information, which makes it easier to identify the issues in this area (help functions, user understanding, information retrieval, business impact). On the one hand, the services offer similar functions and facilities; on the other hand, they all differ in ergonomic features, search criteria and information retrieval processing.
One of the conclusions is that the main issue remains understanding the requirements of end users who search for a professional or service in their own language, and that an intelligent system must be used to understand the query and retrieve the appropriate answers.
Linguistic Processing and Architecture
The specifications that have been drawn up describe how to access Yellow Pages reference data, i.e. nomenclatures and sample text descriptions of the professionals, starting from user queries.
The model describes how the Yellow Pages nomenclatures will be indexed, using structured formulae (tree-based structure of multi-lingual concepts). These formulae will be used as textual descriptions of the professional activities, products and services that compose the semantic field of each nomenclature entry. In other words, each query once analysed will be 'matched' with the textual descriptions of each heading or nomenclature entry.
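The matching step can be illustrated with a minimal sketch. It is not the Structured Indexing model itself: the concept identifiers, the small lexicon, the sample headings and the overlap-based score are all assumptions made for illustration. Each heading carries a tree of language-independent concepts, and a query in any covered language is reduced to concepts and compared with each tree.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One node of a structured formula: a language-independent concept."""
    concept_id: str
    children: list["Concept"] = field(default_factory=list)

    def flatten(self) -> set[str]:
        """Return the ids of this concept and of all its descendants."""
        ids = {self.concept_id}
        for child in self.children:
            ids |= child.flatten()
        return ids


# Hypothetical multilingual lexicon: surface word -> concept id.
LEXICON = {
    "hotel": "HOTEL", "hôtel": "HOTEL",
    "taxi": "TAXI",
    "car": "CAR", "voiture": "CAR", "auto": "CAR",
    "rental": "RENTAL", "location": "RENTAL", "vermietung": "RENTAL",
}

# Hypothetical nomenclature entries, each indexed by a concept tree.
HEADINGS = {
    "Autovermietung": Concept("RENTAL", [Concept("CAR")]),
    "Hotels": Concept("HOTEL"),
    "Taxi": Concept("TAXI"),
}


def query_concepts(query: str) -> set[str]:
    """Map the known words of a query to concept ids; ignore the rest."""
    return {LEXICON[w] for w in query.lower().split() if w in LEXICON}


def match_headings(query: str) -> list[tuple[str, float]]:
    """Score each heading by concept overlap (Jaccard), best first."""
    wanted = query_concepts(query)
    scored = []
    for heading, formula in HEADINGS.items():
        have = formula.flatten()
        overlap = len(wanted & have)
        if overlap:
            scored.append((heading, overlap / len(wanted | have)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these sample data, the French query "location de voiture" and the English query "car rental" both resolve to the German heading "Autovermietung", because matching happens on concepts rather than on words.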
Data management and query processing will be performed using existing software and resources. The management of indexes will be performed using existing tools directly derived from the Transterm management workstation. This software allows the building and storage of terminology for an application, and construction of a multi-lingual representation of terminology.
Specification of the Prototype
One of the major functions of the prototype is heading retrieval. Heading retrieval will be implemented in the target application on the basis of the linguistic architecture and the use of indexes as described. These specifications describe the use of indexes composed of language-independent concepts organised according to a model of roles. These indexes can be downloaded as a database, which will be queried once the user's input has been analysed and converted into an internal representation.
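One possible reading of a role-based index is sketched below. The role names (ACTIVITY, OBJECT), the concept identifiers, the sample headings and the word-by-word analysis are hypothetical and stand in for the actual model and linguistic processing. The internal representation of a query is taken here to be a list of (role, concept) pairs, and the index maps each pair to the headings that contain it.

```python
from collections import defaultdict

# Hypothetical indexed headings: heading -> (role, concept) pairs.
HEADING_SOURCE = {
    "Location de voitures": [("ACTIVITY", "RENT"), ("OBJECT", "CAR")],
    "Vente de voitures": [("ACTIVITY", "SELL"), ("OBJECT", "CAR")],
    "Location de bateaux": [("ACTIVITY", "RENT"), ("OBJECT", "BOAT")],
}

# Hypothetical analysis lexicon: word -> (role, concept).
ANALYSIS_LEXICON = {
    "rent": ("ACTIVITY", "RENT"),
    "mieten": ("ACTIVITY", "RENT"),
    "louer": ("ACTIVITY", "RENT"),
    "car": ("OBJECT", "CAR"),
    "auto": ("OBJECT", "CAR"),
    "voiture": ("OBJECT", "CAR"),
    "boat": ("OBJECT", "BOAT"),
    "boot": ("OBJECT", "BOAT"),
    "bateau": ("OBJECT", "BOAT"),
}


def build_index(source):
    """Build an inverted index: (role, concept) -> set of headings."""
    index = defaultdict(set)
    for heading, pairs in source.items():
        for pair in pairs:
            index[pair].add(heading)
    return index


def analyse(query):
    """Convert a query into (role, concept) pairs, skipping unknown words."""
    words = query.lower().split()
    return [ANALYSIS_LEXICON[w] for w in words if w in ANALYSIS_LEXICON]


def retrieve(query, index):
    """Return the headings that contain every pair found in the query."""
    pairs = analyse(query)
    if not pairs:
        return set()
    results = None
    for pair in pairs:
        hits = set(index.get(pair, set()))
        results = hits if results is None else results & hits
    return results


INDEX = build_index(HEADING_SOURCE)
```

In this sketch the German query "auto mieten" and the English query "rent a car" retrieve the same French heading, while a query that mentions only "voiture" retrieves both car-related headings. Requiring every pair to match is a design choice of the sketch; a ranked or partial match would be equally plausible.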
To ensure development of a fully working prototype, simplified models and processing have been specified. For instance, the initial lexicons will not cover the entire Yellow Pages, but be restricted to selected areas (tourism and transportation). The prototype will consist of a Web server with:
- a MAY home page including appropriate form fields to fill in,
- communication with the natural language processing server which processes queries,
- access to an Oracle database including German and French databases.
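The flow through these three components can be sketched as follows. This is an illustration under stated assumptions, not the prototype's design: an in-memory SQLite database stands in for the Oracle database, a phrase lookup stands in for the natural language processing server, and the table layout, concept identifiers and business names are invented demo data.

```python
import sqlite3

# Hypothetical mapping from a concept to the heading used in each country.
HEADINGS_BY_CONCEPT = {
    "CAR_RENTAL": {"DE": "Autovermietung", "FR": "Location de voitures"},
    "HOTEL": {"DE": "Hotels", "FR": "Hôtels"},
}

# Hypothetical phrase table standing in for the language processing server.
PHRASES = {
    "car rental": "CAR_RENTAL",
    "autovermietung": "CAR_RENTAL",
    "location de voiture": "CAR_RENTAL",
    "hotel": "HOTEL",
    "hôtel": "HOTEL",
}


def make_demo_db():
    """Create an in-memory database with a few invented listings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE listings (country TEXT, heading TEXT, name TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO listings VALUES (?, ?, ?, ?)",
        [
            ("DE", "Autovermietung", "Beispiel Mietwagen GmbH", "Berlin"),
            ("FR", "Location de voitures", "Exemple Location SARL", "Paris"),
            ("FR", "Hôtels", "Hôtel Exemple", "Lyon"),
        ],
    )
    return conn


def analyse_query(query):
    """Return the concept for the first known phrase in the query, or None."""
    text = query.lower()
    for phrase, concept in PHRASES.items():
        if phrase in text:
            return concept
    return None


def search(conn, query, country):
    """Form input -> concept -> national heading -> database listings."""
    concept = analyse_query(query)
    if concept is None:
        return []
    heading = HEADINGS_BY_CONCEPT[concept].get(country)
    if heading is None:
        return []
    rows = conn.execute(
        "SELECT name, city FROM listings "
        "WHERE country = ? AND heading = ? ORDER BY name",
        (country, heading),
    )
    return rows.fetchall()
```

A call such as `search(make_demo_db(), "car rental in Berlin", "DE")` would play the role of a form submission from the home page: the query text and the target country come from the form fields, and the returned rows are what the server would render.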
The Way Ahead
The most important task in the future will be setting up the MAY prototype. The Web server of this prototype is to be available by the end of May '97, which is the end of the project's first phase. Other activities will concentrate on promoting and disseminating the project results to new partners and the scientific community, and on defining a plan for the subsequent project stages.
The MAY project aims at specifying a system enabling end users to query international Yellow Pages in their own native language. The user community consists of end users searching for professionals in a foreign country. It also includes the telecom providers and publishers, who are among the major players in European commercial and industrial activity.
Information providers (printed media and online services) are faced with the task of adapting their applications to new evolving technologies to improve their ease-of-use, functionality, accessibility and market prospects.
The MAY project is characterized by three major objectives:
- specification of an application taking into account all possible user requirements,
- use of models, linguistic resources and processors in accordance with the Genelex, Graal and Transterm projects,
- development of a prototype (access to the German and French databases, including the processing of queries in French, English and German) in order to validate the approach.
Past European project deliverables are the basis of the MAY linguistic architecture; the Structured Indexing model developed by GSI-Erli, for example, will be used to index both German and French nomenclatures.
Progress and results
The main results of the MAY project are:
- system functional specifications according to user requirements,
- description and specification of the MAY linguistic architecture,
- use, validation and development of linguistic resources in French, English and German,
- a prototype developed as the sound basis of the final system.
The MAY consortium will participate in several EC events disseminating all available information on the project and progress.
The linguistic architecture of the prototype is based upon results of existing European projects and generic re-usable models and resources, such that close collaboration can be expected with other projects.
The long-term vision of the project is that of an international multilingual Electronic Yellow Pages service, which will be set up as a joint venture with national partners in each country. A presentation of the project can be given to inform any telecommunication operator interested in participation. Presentations of the project will be proposed to EURESCOM, an organization which includes all major European operators. The aim is to involve as many telecom applications, languages and Yellow Pages systems as possible.
Funding Scheme: CSC - Cost-sharing contracts