CORDIS - EU research results
Content archived on 2024-05-14

DIAGNOSTIC AND EVALUATION TOOLS FOR NATURAL LANGUAGE APPLICATIONS

Objective

DIET addresses requirements for the assessment of natural language processing (NLP) components in adequacy evaluation and quality assurance. Effective and efficient assessment is often hampered by the lack of suitable test material and technology. DIET will develop methods and tools for the glass box evaluation of NLP components, building on the results of previous projects covering different aspects of assessment and evaluation. It will extend and develop test suites with annotated test items for grammar, morphology and discourse, for English, French and German. DIET will provide user support in database technology, test suite construction tools and graphical interfaces. The project results will be used by the industrial partners for in-house and external quality assurance and evaluation. They will also be made available in the public domain through appropriate channels.

Effective and efficient assessment of Natural Language (NL) processing components is often severely hampered by the lack of suitable test material and technology, which are expensive to develop in both time and money. A large variety of evaluation tools has been produced, but these are mainly specialised for testing specific pieces of software. Developing reusable components normally lies outside the interests of individual companies. However, the emerging language technology industry needs these tools, both for industrial developers of NL products to monitor quality and for end users to evaluate the suitability of different products.
DiET will address this by developing methods and tools for 'glass box' evaluation of NL components. The tools will be reusable and customisable, and will be accompanied by reference data organised in test suites whose items are annotated for grammar, morphology and discourse in English, French and German. They will also provide support for standard databases, test suite construction covering specific domains, and graphical interfaces.
Market Situation
Practically every organisation involved in the development or use of NL products has produced its own specialised ad hoc test material and procedures. A few generalised resources exist, such as the monolingual test suites developed by Hewlett Packard (1230 English sentences), the Alvey test suite (1500 English sentences) and the Systran test suite (853 German test items with French translations). However, none of these comes with highly structured annotations or elaborate database technology; most are organised as flat ASCII files. Because the majority of the diagnostic tools and reference data have sprung from research sponsored in the US, American English is mainly used.
Despite this, the foundations for a generalised tool set have already been laid in the EU projects TSNLP, FraCaS and TEMAA, as well as in the EAGLES standardisation initiative. The TSNLP project in particular has shown that industry needs comprehensive, ready made test data, as well as tools for customisation to specific domains and applications. In addition, syntactic and semantic annotation schemes, namely Penn Tree Bank, ParsEval and SemEval, have been developed in the US but have not been extended to any language beyond American English.
Objectives
The main goal is to develop the methods and tools for the glass box evaluation of NL components. This includes the following:
- checking the performance of an NL system against well defined linguistic phenomena identified in real corpora, such as maintenance manuals.
- measurement of the evolution of a system under development through different releases to identify improvements and detect possible degradation.
- enhancement of resources developed in previous projects such as their annotation schema and test suites.
- introduction of techniques which utilise morphology, semantics and discourse, as well as tools for domain and corpus based customisation.
- testing the syntactic and semantic competence of an NL system. This would include using test data in which an item exhibits not just a single phenomenon but a combination of phenomena.
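The second objective, tracking a system across releases, can be illustrated with a small sketch. It is not taken from the DiET tools: the function names, the result format (pairs of phenomenon label and pass/fail outcome) and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def pass_rates(results):
    """Map each phenomenon to the fraction of its test items that passed.

    `results` is an iterable of (phenomenon, passed) pairs (assumed format).
    """
    totals = defaultdict(int)
    passes = defaultdict(int)
    for phenomenon, passed in results:
        totals[phenomenon] += 1
        passes[phenomenon] += int(passed)
    return {p: passes[p] / totals[p] for p in totals}


def compare_releases(old_results, new_results):
    """Return the change in pass rate (new minus old) per shared phenomenon."""
    old, new = pass_rates(old_results), pass_rates(new_results)
    return {p: new[p] - old[p] for p in old if p in new}


# Hypothetical outcomes of running the same test suite on two releases.
release_1 = [("agreement", True), ("agreement", False),
             ("coordination", True), ("coordination", True)]
release_2 = [("agreement", True), ("agreement", True),
             ("coordination", True), ("coordination", False)]

deltas = compare_releases(release_1, release_2)
improved = sorted(p for p, d in deltas.items() if d > 0)
degraded = sorted(p for p, d in deltas.items() if d < 0)
print("improved:", improved)
print("degraded:", degraded)
```

Grouping results by annotated phenomenon, rather than reporting one overall score, is what lets a degradation in one area show up even when another area improves by the same amount.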
Technology Base
The technology used will involve the construction, annotation and application of systematic NL test suites. These draw on database and evaluation technologies, as well as statistical and corpus annotation methods, and will build heavily on the results of the EU funded projects TSNLP, FraCaS, TEMAA and EAGLES, and of the commercially funded project SLT (Spoken Language Translator).
DiET will then extend the state of the art by using:
- several levels of linguistic analysis, including some dialogue and discourse phenomena.
- application specific performance data such as the frequency and relevance of patterns and phenomena in specific domains and applications.
The project will be broken down into:
Data construction: the existing core test suites for English, French and German developed in TSNLP will be extended along several dimensions:
Syntactic construction: existing test suites built in TSNLP will have their gaps filled and their coverage deepened.
Morphological construction: inflectional morphology for the three languages involved will be covered by instances of morpho-syntactic equivalence classes as training material.
Discourse construction: semantic and especially contextual phenomena will be dealt with to an acceptable level of accuracy.
Database tools: both commercially and freely available tool kits for graphical interfaces and SQL compliant database servers will be surveyed.
Customisation tools: these will cover corpus related lexical replacement and frequency mapping of test items to corpora.
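As a rough illustration of what frequency mapping of test items to a corpus could look like, the sketch below scores each test item by how often its words occur in a domain corpus. The scoring formula, the tokeniser and the sample texts are assumptions chosen for brevity, not the method used in the project.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (simplistic)."""
    return re.findall(r"[a-z]+", text.lower())


def corpus_relevance(test_items, corpus_text):
    """Score each test item by the mean relative corpus frequency of its tokens."""
    freq = Counter(tokenize(corpus_text))
    total = sum(freq.values())
    scores = {}
    for item in test_items:
        tokens = tokenize(item)
        if not tokens or total == 0:
            scores[item] = 0.0
            continue
        scores[item] = sum(freq[t] for t in tokens) / (len(tokens) * total)
    return scores


# Hypothetical domain corpus (a fragment of a maintenance manual) and test items.
corpus = "Remove the valve. Check the valve seal. Replace the seal."
items = ["Check the valve.", "The manager sleeps."]

scores = corpus_relevance(items, corpus)
ranked = sorted(items, key=scores.get, reverse=True)
print(ranked)
```

Items whose vocabulary is common in the target corpus rank first, which gives a simple way to prioritise general purpose test items for a specific domain.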
Results
DiET will provide a system consisting of a core of comprehensive diagnostic data together with suitable tools for the testing of NL products. They will be:
- affordable for a broad variety of users.
- augmentable to integrate existing resources.
- adaptable to the specific requirements of individual users.
- widely acceptable as a pre-standard benchmark.
Demonstration
Three of the project partners will assess the DiET tools at their own sites in a series of continuous iterations. The applications against which the tools will be tested will be taken from the following set:
- tools and components for machine translation.
- controlled language and grammar checkers.
- translation memory based computer-aided translation systems.
Particular attention will be paid to the following functionality: extraction of subsets of test items from general purpose test data, construction and integration of new test data, customisation of test data for specific applications or corpora, and lexical tailoring of the vocabulary used.
One of the partners will use DiET in producing high quality, multilingual technical documentation, since it will provide a common reference platform for evaluating NL systems. Another partner will use it for its own activities in servicing the localisation industry.
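Extracting a subset of test items from an annotated, database-backed test suite might look like the following sketch. The table name, columns and sample rows are hypothetical, and SQLite stands in here for whichever SQL compliant database server is chosen.

```python
import sqlite3

# In-memory database with a hypothetical schema for annotated test items.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE test_item (
        id INTEGER PRIMARY KEY,
        language TEXT NOT NULL,
        phenomenon TEXT NOT NULL,
        text TEXT NOT NULL,
        wellformed INTEGER NOT NULL  -- 1 = grammatical, 0 = ungrammatical
    )
    """
)
rows = [
    ("en", "agreement", "The engineer runs.", 1),
    ("en", "agreement", "The engineer run.", 0),
    ("de", "agreement", "Der Ingenieur läuft.", 1),
    ("en", "coordination", "The engineer and the manager run.", 1),
]
conn.executemany(
    "INSERT INTO test_item (language, phenomenon, text, wellformed) "
    "VALUES (?, ?, ?, ?)",
    rows,
)


def extract_subset(conn, language, phenomenon):
    """Return (text, wellformed) pairs for one language and phenomenon."""
    cur = conn.execute(
        "SELECT text, wellformed FROM test_item "
        "WHERE language = ? AND phenomenon = ? ORDER BY id",
        (language, phenomenon),
    )
    return cur.fetchall()


subset = extract_subset(conn, "en", "agreement")
for text, wellformed in subset:
    print(text, "(well-formed)" if wellformed else "(ill-formed)")
```

Storing annotations as columns, rather than keeping items in flat ASCII files, is what makes this kind of selection a one-line query.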
Benefits and Users
There are three different types of users: those who wish to assess a commercial NL component or product, those who need measurement methods for quality assurance in industrial development, and professional users who test NL applications on behalf of other companies or user groups.
End users will profit from better and less expensive NL products which have been evaluated and verified against widely recognised quality standards. Small and medium enterprises will benefit as well as large ones, since DiET will make its results widely available. This will enable NL products to reach the market faster and at higher quality.
Those parts of the package whose distribution is not restricted will be distributed through the European Language Resources Association (ELRA) and made available for a nominal fee.

Coordinator

DFKI GmbH
EU contribution
No data
Address
Stuhlsatzenhausweg 3
66123 Saarbruecken
Germany

Total cost

No data

Participants (3)
