CORDIS - EU research results
Content archived on 2024-04-19

RELATOR

Objective

The language industries of the future will rely heavily on the availability of large scale language resources (e.g. corpora, speech databases, dictionaries, linguistic descriptions), together with appropriate standards and methodologies. Ready access to harmonised databases of language data and rules would not only provide a direct benefit to research and development efforts across a wide range of private and public organisations, but would also foster fruitful academic and industrial co-operation.

The project aims to define a broad organisational framework for the creation of the language resources for both written and spoken language engineering (LRs in short) which are necessary for the development of an adequate language technology and industry in Europe, and to determine the feasibility of creating a co-ordinated European network of repositories which would perform the function of storing, disseminating and maintaining such resources. This activity is intended to contribute towards the long term goal of making large scale LRs widely available to European organisations involved in R&D and educational activities.

Approach and Methodology

The overall approach and the results which the project intends to achieve can be summarised as follows:

to create structured, publicly available catalogues of existing linguistic resources, using and extending the information already collected by various international and national survey initiatives;
to evaluate the present European situation, comparing what is available with the most urgent needs of the European R&D and teaching communities, and then to formulate recommendations for a concerted European action in the field of reusable resources for natural language and speech;
to discuss with the relevant actors (e.g. owners of resources, producers, private and public users, funding bodies, scientific and professional associations) the various aspects of the problem, their needs and requirements, the possible solutions, their willingness to co-operate, and the conditions for a joint European action;
to identify, describe and evaluate at various levels (e.g. organisational, technical, legal) alternative methods and structures which could ensure the creation, management and maintenance of a European repository of reusable LRs, and their dissemination to the various types of users;
to experiment with the collection and dissemination of existing LRs using (i) a distributed electronic network and (ii) CD-ROM pressing facilities, with the aim of encouraging the reuse of already available resources, and also of acquiring experience which will feed into the formulation of final recommendations;
to present final recommendations for establishing a collaborative infrastructure that will act as a collection, verification, management and dissemination centre, built on the foundation provided by existing European structures and organisations.

Assessing Existing Resources: carrying out a review of what LRs currently exist, both in Europe and elsewhere. The goal of this survey is not to produce a comprehensive, exhaustive catalogue of such resources, but rather to assess which needs of the various European languages are still not satisfied by the available resources, and to compare and characterise the situations of the different languages. The results of this evaluation effort will provide the basis for the general recommendations (see below).

Needs Analysis: determining the main resource needs of European actors involved in RTD, training and system development; discussing the various aspects (e.g. legal, financial, organisational problems; participation and role of different types of public and private actors) of the actions required to meet the needs for LRs in Europe, as a basis for defining an overall organisational framework for the development of adequate LRs in Europe.

Experimental Implementation: testing the usefulness and feasibility of a distributed resource repository by implementing an infrastructure on which a set of LRs will be mounted. In particular, we will experiment with the dissemination of LRs using ELSNET's existing infrastructure for LRs: (i) a wide-area network running the AFS server software, and (ii) the formatting, mastering and distribution of data on CD-ROM.

Recommendations: making detailed recommendations for the creation, management, and maintenance of a distributed, managed repository of reusable LRs, based on a detailed analysis and evaluation of the alternatives.

Exploitation and Future Prospects

The goals of the project are the co-ordinated collection and distribution of LRs, the promotion of awareness of the need for creating widely available LRs, and the promotion of consensus on an overall European strategy. Consequently, dissemination activities are central to the project. The project consortium comprises representatives of major European-wide bodies and associations, most notably ELSNET, ESCA and EACL, and will be assisted by an industrial steering committee composed of representatives of leading IT companies, publishers, PTTs and other providers of electronic information services.

The action will be carried out in co-operation with relevant European groups and with on-going initiatives such as EAGLES, and will involve, among other things, an analysis of existing international structures. It is expected that the experimental activities carried out within the project and the recommendations for further larger-scale operations will contribute to the establishment of a broad language infrastructure covering all Community languages.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on natural language processing techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposals

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of Action") within a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

Università degli Studi di Pisa
EU contribution
No data
Address
Via della Faggiola 32
56100 Pisa
Italy

Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (4)
