Objective
The project aims to support and coordinate European participation in COCOSDA - the International Coordinating Committee on Speech Databases and Speech Input/Output Systems Assessment. This recent world-level development in language and speech engineering - with representatives from about twenty countries, drawn from Europe, North America, China, Japan and the Pacific Rim area - is concerned with the definition and application of multi-language databases and assessment standards and protocols in the field of Spoken Language Engineering. It is currently based to an important degree on prior European work; there is, however, no organised European presence in its activities. The following objectives are consequently at the core of the EuroCocosda project: give a focus for European work within, and input to, the new world frame provided by Cocosda; contribute to meeting the internationally recognized need for a synergy between the areas of natural language and speech; provide for the joint production of spoken and written language databases, in order to support cooperation among these areas.
Approach and Methodology
Different action lines will be implemented within the project in order to reach these aims. A unifying infrastructure will be established - EuroCocosda - for concerted European contribution to the main Cocosda world group. Within this framework, direct links will be maintained with both the Cocosda Central Committee and the three Cocosda Working Groups: Recognition, Synthesis and Corpora. An organisational support foundation for the European component of a worldwide telephone speech database - Polyphone - will be provided; the funding for each individual language's 5000-speaker corpus should come from national PTT resources. A coherent framework will be created both for inputs to the Cocosda world speech synthesis database and for the European component of the NL/Speech corpora initiative, NEWS, a multi-language newspaper text and speech database.
The TED corpus (Translanguage English Database), containing spontaneous speech, read speech and some associated text material, will be directly acquired and formatted within the project. Two groups of recorded speakers - native speakers of English with different dialects and non-native speakers of English as a foreign language - will provide multi-dialectal and multi-accent speech data. Speakers will be recorded at international conferences on speech communication; the associated text material will be organised and structured in a distributable form and made available to the scientific community at an early stage, thus providing an excellent link between natural language and speech technology. The issue of reusability will also be specifically addressed in this data collection effort.
Links between COCOSDA and relevant European and international initiatives (LRE projects SQALE, RELATOR, EAGLES; ELSNET; LDC) will be created or fostered, thus providing an official and regular communication channel for exchanging experiences and data. Present needs of the scientific and user community will be surveyed and future initiatives prepared.
Exploitation and Future Prospects
The following benefits and results are foreseen for the project, with special reference to the access from Europe to developments on the world scene and with a substantially enhanced possibility for European norms and de facto standards to become more widely influential and accepted:
A central position for Europe in the international scene, with major initiatives now capable of coming from EC countries acting in concert.
Better coordination between European laboratories, and a world level dimension for the joint activity of European NL and speech based workers.
An insight into world - in particular US and Japanese - technical expertise, coming from direct collaboration on common projects.
The possibility of harmonising national with world actions on database recording and labelling.
The availability of poly-language databases for developing and testing telephone based speech applications in both recognition and synthesis.
The availability of world relevant database frameworks for developing and testing multilingual speech synthesis systems.
A survey on the existence and availability of newspaper databases in various languages; the potential to create and influence the creation of new ones at the international level.
A domain-specific multi-language and multi-accent corpus framework, with a large number of speakers speaking the same language (English) in a natural way, under moderate stress, over a relatively long period of time. The corresponding text material will allow the building of lexica and language models.
Although the project is concrete and small-scale, focused primarily on the urgent need for European coordination within an international body that already exists, it will prepare the ground for larger-scale operations while stimulating cooperation at the European and international levels.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on natural language processing techniques. See: The European Science Vocabulary.
- humanities > languages and literature > general language studies
- natural sciences > computer and information sciences > databases
Programme(s)
Multi-annual funding programmes that define the EU's priorities for research and innovation.
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Data not available
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
Data not available
Funding scheme
Funding scheme (or "Type of Action") inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.
Data not available
Coordinator
NW1 2HE London
United Kingdom
The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.