Community Research and Development Information Service - CORDIS

DUMAS Report Summary

Project ID: IST-2000-29452
Funded under: FP5-IST
Country: Sweden

Swedish, Finnish and English spoken dialogue corpora

The Swedish spoken corpus of email dialogues, collected during the design and evaluation of the AthosMail spoken dialogue system, consists of 63 calls to the simulated AthosMail application made by a group of 6 people. The calls range from 11 seconds to 17 minutes and were collected during a Wizard of Oz study. In total, approximately 6 hours of dialogue were collected, comprising 1820 user utterances and 2382 system utterances. The dialogues have been annotated using the Annotation Graphs notation (in XML format). The annotations include transcriptions and dialogue acts for each turn in the dialogue.

The Finnish spoken dialogue corpora consist of three parts: one from a Wizard of Oz evaluation, another from actual calls to the system by expert users, and the rest from the systematic evaluation of the final system. Part of the material has been annotated using the Annotation Graphs notation (in XML format).

The English spoken dialogue corpus of email dialogues consists of 18 dialogues, ranging from 3 to 17 minutes, collected during a Wizard of Oz study. The dialogues have been annotated using the Annotation Graphs notation (in XML format). The annotations include transcriptions and dialogue acts for each turn in the dialogue.

More information on the DUMAS project can be found at:
http://www.sics.se/dumas/

Contact

Björn GAMBÄCK (Senior Researcher)
Tel.: +46-8-6331500
Fax: +46-8-7517230