CORDIS - EU research results
Content archived on 2024-05-18

NEgotiating through SPOken Language in E-commerce

Deliverables

In the course of the NESPOLE! project we developed a comprehensive methodology for data collection for speech-to-speech translation systems. We collect a corpus of spoken dialogues using two distinct methods:
- A simulated "Wizard-of-Oz" scenario, with cooperative users performing the expected task in the domain of interest. This data is collected monolingually (both participants speak the same language) in each of the languages of interest. It is then transcribed and annotated.
- Data collection using a beta version of the actual speech-translation system, with naïve users performing the task in the domain of interest. This data is collected bilingually, using the actual translation capabilities of the system. It is then transcribed and annotated.
Both types of data are used to analyse language variability in the domain and to train and test individual translation components. A representative sample of the dialogues collected under both conditions is set aside from the start and reserved for comprehensive performance evaluation of the system at important points in the project. The methodology developed in NESPOLE! should prove valuable for future speech translation projects.
Two user studies were conducted with the real system, which integrates multilingual and multimodal communication: naïve users were asked to use the fully implemented system to accomplish a given task in the tourism domain. The first user study aimed at evaluating the added value of multimodality in Showcase 1. We found evidence that multimodality positively affects the quality of interaction: it makes it easier to resolve ambiguous utterances and to recover from system errors, improves the flow of the dialogue, and enhances mutual comprehension between the parties, in particular when spatial information is involved. The second user study aimed at analysing the communication features of the NESPOLE! system (Showcase 2a version) in more depth and at comparing the results with those of the first user study. The main result was that as the system becomes more effective the interaction becomes more natural, although a lack of coverage of meta-communicative turns clearly emerged. A general observation was that usability engineering methods (based on number of errors, task completion time, etc.) are not appropriate for evaluating complex communication-supporting systems. Dialogue analysis measures (based on turn repetitions, turn taking, topic change, etc.) seem better suited to describing the process and how the system supports (or hampers) effective computer-mediated conversation.
The NESPOLE! interlingua is a language-independent symbolic meaning representation framework, used as the centre-point of our speech translation approach. Translation in NESPOLE! is performed by analysing a source language sentence into the interlingua representation, and then generating a target language sentence from the interlingua. The interlingua representation developed in the project is suited to task-oriented language in limited but large domains. This interlingua, called the Interchange Format (IF), consists of four representational components: a speaker tag, a speech act, a list of concepts, and a list of argument-value pairs. The main principle underlying the design of the IF is that a task-oriented domain can be described by a reasonable number of domain actions. Domain actions are tasks such as giving information about the availability of a room, thanking, greeting, etc. In the IF, a domain action is represented by the combination of a speech act and concepts. We currently have a travel domain database of around 8000 IF-tagged sentences in German, Italian, English, Korean, and Japanese. Around 800 different domain actions are used, with the 50 most common domain actions covering about 65% of the sentences. The NESPOLE! IF is currently in use in several other speech-translation projects, including those of C-STAR consortium partner groups in Japan, China and Korea. Extensions of the IF will be used in future speech translation projects. The database of sentences tagged with IF representations is of particular value.
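The four IF components and the notion of a domain action can be illustrated with a small data structure. The following Python sketch is not the project's implementation: the class name, the field names, the "+" separator and the example labels ("give-information", "availability", "room", "room-type") are assumptions made for illustration only. The helper shows how a coverage figure such as "the 50 most common domain actions cover about 65% of the sentences" could be computed over a tagged corpus.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class InterchangeFormat:
    """Hypothetical container for the four IF components named in the text."""
    speaker: str            # speaker tag
    speech_act: str         # speech act
    concepts: tuple = ()    # list of concepts
    arguments: tuple = ()   # argument-value pairs, stored as (name, value) tuples

    @property
    def domain_action(self) -> str:
        # A domain action combines the speech act with the concepts.
        # Joining them with "+" is an assumed notation, not taken from the source.
        return "+".join((self.speech_act, *self.concepts))


def coverage_of_top_actions(utterances, top_n):
    """Fraction of utterances whose domain action is among the top_n most frequent."""
    counts = Counter(u.domain_action for u in utterances)
    if not counts:
        return 0.0
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / sum(counts.values())


# Illustrative utterances; the labels are invented for this example.
room_info = InterchangeFormat(
    speaker="agent",
    speech_act="give-information",
    concepts=("availability", "room"),
    arguments=(("room-type", "double"),),
)
thanks = InterchangeFormat(speaker="client", speech_act="thank")

corpus = [room_info, room_info, thanks]
print(room_info.domain_action)
print(coverage_of_top_actions(corpus, top_n=1))
```

Keeping the domain action as a derived property rather than a stored field reflects the statement that it is a combination of speech act and concepts, so the two can never disagree.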
The main characteristic of the NESPOLE! architecture is the flexible distribution of the modules used for data and video communication, for audio coding/decoding and for the translation process (recognition, analysis, generation and synthesis). These modules communicate over IP. Thanks to this architecture, users would be able to use the NESPOLE! service from their home PC: talking with a remote agent, seeing her face and expression through a webcam, and exchanging visual information through shared maps and images on which both users can make drawings or selections. Aethra has contacted a company interested in adopting the NESPOLE! architecture in order to offer this kind of service to its clients. A study of module modifications and improvements is in progress, with the aim of making the service available to a large number of people.
In the course of the NESPOLE! project we developed a comprehensive methodology for evaluating speech-to-speech translation systems. Evaluations are performed on dialogues collected in the domain of the system that were set aside and not used for system development. The evaluations are for the most part end-to-end, from input to output, and do not assess individual modules or components (except for the recogniser). We perform both monolingual evaluation (where the generated output language is the same as the input language) and cross-lingual evaluation. We evaluate both on manually transcribed input and on actual speech recognition of the original audio (automatic transcription). In addition, we grade the speech-recognised output as a "paraphrase" of the transcriptions. Evaluation is performed by multiple human judges who assess the quality of translation for each translation unit, the semantic dialogue unit (SDU). For reliability, we use three judges for each evaluation session. The grading scheme is a four-point scale that is based entirely on meaning and does not take fluency or grammatical accuracy into account. We calculate and report average and majority scores across graders for each SDU. We calculate inter-coder and intra-coder reliabilities for the different judges, and we compute statistical significance measures for differences in performance between comparable translation systems or settings (such as different versions of the system, translation back to the source language versus to the target language, etc.). The evaluation methodology we developed can be applied directly to other MT projects and systems.
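The per-unit aggregation of grades from three judges can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the project's scoring code: the four-point scale is encoded here as the integers 1 to 4 (the source does not name the grades), and falling back to the median when no strict majority exists is a choice made for this example only.

```python
from collections import Counter
from statistics import mean, median


def average_score(grades):
    """Mean grade across judges for one translation unit."""
    return mean(grades)


def majority_score(grades):
    """Grade given by more than half of the judges.

    Assumption: if all judges disagree, fall back to the median grade.
    The source does not say how such ties are resolved.
    """
    grade, count = Counter(grades).most_common(1)[0]
    if count > len(grades) / 2:
        return grade
    return median(grades)


# Hypothetical grades from three judges for three translation units,
# on an assumed 1 (worst) to 4 (best) encoding of the four-point scale.
session = {
    "unit-1": [4, 4, 3],
    "unit-2": [2, 2, 2],
    "unit-3": [1, 2, 4],
}

for unit, grades in session.items():
    print(unit, average_score(grades), majority_score(grades))
```

Reporting both figures is useful because the average is sensitive to a single outlying judge, while the majority score reflects what most judges agreed on.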
The NESPOLE! system was developed using two scenarios: the tourism scenario and the first-aid medical assistance scenario. Over the life of the project, three main data collections were carried out in order to develop the first and second showcases. During the first year 191 dialogues were collected: 62 German, 61 Italian, 37 English and 31 French. This amounted to 6 hours of dialogue for Italian and French, 7 hours for English and 8 hours for German. The dialogues covered five predefined tourism scenarios. During the last year two major data collections were carried out: the first aimed at expanding the tourism scenario and the second at addressing the medical domain. For the monolingual data collection five tourism scenarios were developed; 66 dialogues were recorded, yielding 994.57 minutes of data: 243.52 minutes in sixteen English dialogues, 246 minutes in sixteen German dialogues, 272.52 minutes in seventeen French dialogues and 232.53 minutes in seventeen Italian dialogues. The data collection in the medical domain involved Italian, English and German. A total of 49 dialogues were collected, amounting to 8 hours 25 minutes of audio.
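The totals quoted above can be cross-checked from the per-language figures. The short Python sketch below only restates numbers given in the text; the variable names are illustrative.

```python
# First-year tourism collection: dialogues per language, as reported.
first_year_dialogues = {"German": 62, "Italian": 61, "English": 37, "French": 31}

# Last-year monolingual tourism collection: (dialogues, minutes) per language.
monolingual = {
    "English": (16, 243.52),
    "German": (16, 246.0),
    "French": (17, 272.52),
    "Italian": (17, 232.53),
}

first_year_total = sum(first_year_dialogues.values())
monolingual_dialogues = sum(n for n, _ in monolingual.values())
monolingual_minutes = round(sum(m for _, m in monolingual.values()), 2)

# Medical-domain collection: 8 hours 25 minutes expressed in minutes.
medical_minutes = 8 * 60 + 25

print(first_year_total)        # 191
print(monolingual_dialogues)   # 66
print(monolingual_minutes)     # 994.57
print(medical_minutes)         # 505
```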
