EUROpean interface to COCOSDA

Objective

The project aims to support and coordinate European participation in COCOSDA - the International Coordinating Committee on Speech Databases and Speech Input/Output Systems Assessment. This recent world-level development in language and speech engineering - with representatives from about twenty countries, drawn from Europe, North America, China, Japan and the Pacific Rim area - is concerned with the definition and application of multi-language databases and assessment standards and protocols in the field of Spoken Language Engineering. It currently draws to an important degree on prior European work; there is, however, no organised European presence in its activities. The following objectives are consequently at the core of the EuroCocosda project: give a focus for European work within, and input to, the new world frame provided by Cocosda; help meet the internationally recognized need for synergy between the areas of natural language and speech; provide for the joint production of spoken and written language databases, in order to support cooperation among these areas.

Approach and Methodology

Several action lines will be implemented within the project in order to reach these aims. A unifying infrastructure - EuroCocosda - will be established for a concerted European contribution to the main Cocosda world group. Within this framework, direct links will be maintained both with the Cocosda Central Committee and with the three Cocosda Working Groups: Recognition, Synthesis and Corpora. Organisational support will be provided for the European component of a worldwide telephone speech database, Polyphone; the funding for each individual language's 5000-speaker corpus should come from national PTT resources. A coherent framework will be created both for inputs to the Cocosda world speech synthesis database and for the European component of the NL/Speech corpora initiative, NEWS, a multilanguage newspaper text and speech database.

The TED corpus (Translanguage English Database), containing spontaneous speech, read speech and some associated text material, will be directly acquired and formatted within the project. Two groups of recorded speakers - native speakers of English with different dialects and non-native speakers of English as a foreign language - will provide multi-dialectal and multi-accent speech data. Speakers will be recorded at international conferences on speech communication; the associated text material will be organised and structured in a distributable form and made available to the scientific community at an early stage, thus providing a direct link between natural language and speech technology. This data collection effort will also specifically address the issue of reusability.

Links between COCOSDA and relevant European and international initiatives (LRE projects SQALE, RELATOR, EAGLES; ELSNET; LDC) will be created or fostered, thus providing an official and regular communication channel for exchanging experiences and data. Present needs of the scientific and user community will be surveyed and future initiatives prepared.

Exploitation and Future prospects

The following benefits and results are foreseen for the project, with special reference to the access from Europe to developments on the world scene and with a substantially enhanced possibility for European norms and de facto standards to become more widely influential and accepted:

A central position for Europe in the international scene, with major initiatives now capable of coming from EC countries acting in concert.
Better coordination between European laboratories, and a world level dimension for the joint activity of European workers in NL and speech.
An insight into world - in particular US and Japanese - technical expertise, coming from direct collaboration on common projects.
The possibility of harmonising national with world actions on database recording and labelling.
The availability of multi-language databases for developing and testing telephone-based speech applications in both recognition and synthesis.
The availability of world-relevant database frameworks for developing and testing multilingual speech synthesis systems.
A survey on the existence and availability of newspaper databases in various languages; the potential to create and influence the creation of new ones at the international level.
A domain-specific multi-language and multi-accent corpus framework, with a large number of speakers speaking the same language (English) in a natural way, under moderate stress, over a relatively long period of time. The corresponding text material will allow the building of lexica and language models.

Although the project is concrete and small-scale, focused primarily on the urgent need for European coordination within an international body that already exists, it will prepare the ground for larger scale operations, whilst stimulating cooperation at the European and international levels.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

FP3-LRE - Specific programme of research and technological development (EEC) in the field of telematic systems in areas of general interest - Linguistic research and engineering -, 1990-1994

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

Data not available

Coordinator

University College London

EU contribution

No data

Address

Wolfson House 4 Stephenson Way
NW1 2HE London
United Kingdom

Total cost

No data

Participants (4)

Centre National de la Recherche Scientifique (CNRS)

France

EU contribution

No data

Address

Paris

Total cost

No data

Centro Studi e Laboratori Telecomunicazioni SpA

Italy

EU contribution

No data

LUDWIG-MAXIMILIANS UNIVERSITY OF MUNICH

Germany

EU contribution

No data

Address

Geschwister-Scholl-Platz 1
80539 MUENCHEN

Total cost

No data

Univ. of Amsterdam

Netherlands

EU contribution

No data

Address

Total cost

No data
