CATCH-2004 will develop a multilingual, conversational system with a novel architecture that unifies access across devices and services. The system will provide pervasive access to multiple applications and information sources that public and private service providers make available to citizens, by supporting multiple client devices and multiple input modalities. The client devices are kiosks, telephones (standard and wireless) and smart wireless devices. Applications include access to information over the Internet, travel and city information and services, phone directories and completion of transactions. A key aspect is to enable multilingual spoken access to information on the web from any of the devices listed above. Information will be presented to the user according to the limitations of the client device: kiosk access will support both conversational natural language and graphical interaction; telephone access will allow only speech or push-button input and spoken responses; and wireless telephones and smart wireless devices will offer both speech and text-based interfaces.
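The device-dependent behaviour described above can be pictured as a small capability table. The following is only an illustrative sketch, not the project's design: the names `Modality`, `DeviceProfile`, `PROFILES`, `accepts` and `output_channels` are hypothetical, and the output modalities assigned to kiosks and smart wireless devices are assumptions inferred from the description rather than stated specifications.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    SPEECH = "speech"
    PUSH_BUTTON = "push_button"
    TEXT = "text"
    GRAPHICAL = "graphical"


@dataclass(frozen=True)
class DeviceProfile:
    """Input and output modalities a client device supports (hypothetical type)."""
    inputs: frozenset
    outputs: frozenset


# Hypothetical profiles following the abstract's description of each device.
# Kiosk and smart-wireless outputs are assumed to mirror their inputs.
PROFILES = {
    "kiosk": DeviceProfile(
        inputs=frozenset({Modality.SPEECH, Modality.GRAPHICAL}),
        outputs=frozenset({Modality.SPEECH, Modality.GRAPHICAL}),
    ),
    "telephone": DeviceProfile(
        inputs=frozenset({Modality.SPEECH, Modality.PUSH_BUTTON}),
        outputs=frozenset({Modality.SPEECH}),  # spoken responses only
    ),
    "smart_wireless": DeviceProfile(
        inputs=frozenset({Modality.SPEECH, Modality.TEXT}),
        outputs=frozenset({Modality.SPEECH, Modality.TEXT}),
    ),
}


def accepts(device: str, modality: Modality) -> bool:
    """Return True if the named device can take input in this modality."""
    return modality in PROFILES[device].inputs


def output_channels(device: str) -> list:
    """Return the sorted names of the channels a response may be rendered on."""
    return sorted(m.value for m in PROFILES[device].outputs)
```

Under these assumptions, a service could call `output_channels("telephone")` before formatting a response, so that a telephone caller receives only a spoken rendering while a kiosk user may also receive a graphical one.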
The main objectives of the project are to:
1. Enable multilingual conversational access to information for citizens.
2. Describe an architecture for service provision unifying access from different client devices.
3. Provide multimodal (spoken & written) language support.
4. Provide voice-enabled access to the World Wide Web databases connected to this architecture.
5. Enable representation and transformation of information available in multiple forms and languages.
6. Develop a user-friendly system giving the ability to complete complex tasks to all users, experts and novices.
7. Contribute to open interfaces and standards.
The project will first identify the end-users of the demonstration applications in two of the participating cities (Athens and Helsinki). Information providers and system operators will be identified and application scenarios will be specified. The applications to be implemented will be chosen for three different means of communication, namely voice telephone access, smart wireless communication devices and information kiosks. In addition, the most suitable architecture for the three demonstrations, in technical and functional terms, will be identified and specified. One unified architecture will be defined as a core version; using functional protocols and application abstraction levels, three adapted and refined architectural modules suited to the different end-user devices will then be delivered. The project will also design and implement the Speech Recognition module and develop the appropriate browser for the telephone, kiosk and smart wireless devices, as well as the conversational system for the three cities. In the next stage, the modules for delivering web content to smart wireless devices and for enabling natural speech input on these devices will be defined and implemented. The content and the relevant applications will then be developed. The project will specify, develop and implement the Application Abstraction module, and will develop software implementing both multi-threading parallel systems and tools and methods for transforming content into the most suitable format for the requesting device. Finally, the project will test all components of the system on several Complex Service providing platforms and verify the functionality of the individual modules.
Two demonstrators will be made available. One will be in the city of Athens, in the context of the 2004 Olympic Games; Athens provides an excellent testing ground for the deployment of a large-scale system combining HLT and other technologies to assist visitors, participants and organisers. The other demonstrator will be established in the city of Helsinki to provide speech access to public city services. The city of Cologne will test and evaluate the usability of the technology and systems developed by the project. The project will disseminate its public results through a website, technical papers, and presentations of key results at conferences and seminars. The project also intends to contribute inputs and directions to the IST Clustering activities.
The project has defined the following milestones:
1. Exploitation Plan
2. User Requirements
3. System Architecture and Technical Specifications
4.a/b/c. Speech-enabled WAP Browser
5. Functionalities of the Conversational Systems
6. Speech Recognition & Synthesis Systems
7. Conversational System Data files
8. Functional Websites providing information for the demonstrations
9. Content annotation for wireless access
10. Multi-modal Content Access
11. Operational Demonstrators
Funding Scheme: CSC - Cost-sharing contracts