CORDIS - EU research results
Content archived on 2024-04-19

Test Suites for NLP Applications

Objective

As the market for NLP products and services grows, clear patterns of system types are starting to emerge. In the future, prospective buyers and end-users of NLP products will increasingly be confronted with the problem of choosing the product that best meets their specific requirements. Suppliers of NLP products and services may want to know how their systems and tools compare to those of their competitors. Developers and researchers are likely to be interested in determining whether their system performs according to their specifications. At present, companies, institutions and corporate users interested in any of the above-mentioned evaluation types spend a considerable amount of time and effort in building data and tools for their own test purposes. This project aims to alleviate the situation with respect to data and tools, by defining guidelines and a methodology for the construction of diagnostic data ("test suites") and by designing and implementing related tools.

The guidelines will be validated by constructing application-specific diagnostic data for French, German and English and testing them on a number of applications ranging from parsers through grammar checkers to controlled language checkers. In addition to devising guidelines, the project is to investigate techniques and design and implement tools that will facilitate the construction, use and manipulation of test suites, such as a database for storing and manipulating test data, and tools for the (semi-)automatic generation of test suites.

Approach and Methodology

The first task of the project will be to survey existing test suites and draw up specifications for describing them. This survey will help identify the types of NLP applications for which each test suite has been used, and the type of evaluation in which it has been applied.

The main thrust of the project will be to set up guidelines for the construction of test suites. Some of the issues which the project will investigate may be application-independent (e.g. size), others may be application-dependent (e.g. the necessity of avoiding examples which involve translational problems when constructing a test suite for monolingual applications). The project will also investigate ways of assigning weightings to test sentences and an efficient annotation scheme. The annotation scheme adopted will be developed with the aim of storing the test suite fragments in a database.
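
One way to picture weighted test sentences is the minimal Python sketch below. It is an illustration only, not the project's scheme: the `SuiteItem` fields, the `weighted_score` function and the example sentences are all assumptions introduced here. Each item carries a phenomenon label, a well-formedness flag and a weight, and the score is the weighted share of items on which a system's accept/reject decision matches the annotation.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SuiteItem:
    """One annotated test sentence (hypothetical field names)."""
    text: str
    phenomenon: str      # e.g. "agreement"
    wellformed: bool     # expected judgement: should the system accept it?
    weight: float = 1.0  # relative importance of this item


def weighted_score(items: Sequence[SuiteItem],
                   accepts: Callable[[str], bool]) -> float:
    """Return the weighted fraction of items the system judges as annotated."""
    total = sum(item.weight for item in items)
    if total == 0:
        return 0.0
    passed = sum(item.weight for item in items
                 if accepts(item.text) == item.wellformed)
    return passed / total


# Usage: a dummy "system" that accepts every sentence.
items = [
    SuiteItem("The parsers work.", "agreement", True, weight=3.0),
    SuiteItem("The parsers works.", "agreement", False, weight=1.0),
]
score = weighted_score(items, lambda sentence: True)
```

With these two items, the accept-everything system is right on the well-formed sentence only, so it earns 3 of the 4 available weight units.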

The soundness of the proposed guidelines and methods will be demonstrated by constructing test data for French, German and English, and validating the resulting test suites against a number of applications and/or components.

In addition to devising guidelines and methods for the construction of diagnostic test data, the project also intends to investigate techniques, and to design and implement tools, that will facilitate the construction/generation, use and manipulation of test suites.

Firstly, the project will investigate techniques for automatically generating test suites, e.g. by means of special, simple test suite grammars. Test suites are normally hand-constructed. However, this process is difficult (requiring considerable linguistic sophistication and skill), laborious, tedious, and above all error-prone. All this suggests that the process is a good candidate for automation, or more precisely, for an interactive process that involves a substantial amount of automation. Another advantage of automation is that it allows "dynamic" test suite construction, where test data can be replaced by new data which test the same phenomena. In this way, it may be possible to overcome one of the problems that sometimes crop up in system evaluation, namely that developers can tune their application so that it deals with static test data. Finally, automatic, dynamic test suite generation should open up the possibility of using very large lexicons, perhaps with some "randomisation", thus making it possible to hold and transmit extremely large "virtual" suites, on the order of many millions of sentences, providing standard benchmarks for system testing.
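
The idea of a "virtual" suite generated from a simple grammar can be sketched as follows. This is a toy illustration under stated assumptions, not the project's generator: the grammar, the lexicon and the function names `expand` and `generate_suite` are all hypothetical. Sentences are produced lazily from a seeded random source, so a suite never has to be stored in full; the grammar, the lexicon and a seed are enough to reproduce it.

```python
import random

# Hypothetical toy grammar: each nonterminal maps to alternative expansions.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V"], ["V", "NP"]],
}

# Hypothetical toy lexicon; a real one could be very large.
LEXICON = {
    "Det": ["the", "a"],
    "N": ["parser", "checker", "sentence"],
    "V": ["fails", "accepts", "rejects"],
}


def expand(symbol, rng):
    """Expand a symbol into a list of words, picking alternatives at random."""
    if symbol in LEXICON:
        return [rng.choice(LEXICON[symbol])]
    words = []
    for child in rng.choice(GRAMMAR[symbol]):
        words.extend(expand(child, rng))
    return words


def generate_suite(n, seed=0):
    """Lazily yield n sentences; the same seed reproduces the same suite."""
    rng = random.Random(seed)
    for _ in range(n):
        yield " ".join(expand("S", rng))


# Usage: two runs with the same seed give identical suites,
# while a different seed gives fresh data for the same constructions.
first = list(generate_suite(5, seed=1))
second = list(generate_suite(5, seed=1))
```

Changing the seed swaps in new sentences that exercise the same grammar rules, which is one simple reading of the "dynamic" construction described above.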

Secondly, the project will investigate whether, and to what extent, it is possible to derive test suites (semi-)automatically from corpora.
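
As a rough illustration of what semi-automatic derivation from a corpus might look like, the sketch below pre-selects short corpus sentences that match surface patterns for a phenomenon and hands them to a human for review. Everything in it is an assumption made for the example: the crude regular expressions, the phenomenon names, the length limit and the function name `candidates` are hypothetical, and surface patterns of this kind will both over- and under-match.

```python
import re

# Hypothetical, deliberately crude surface patterns for two phenomena.
PATTERNS = {
    "passive": re.compile(r"\b(?:is|are|was|were|been|being)\s+\w+ed\b",
                          re.IGNORECASE),
    "coordination": re.compile(r"\b(?:and|or)\b", re.IGNORECASE),
}


def candidates(corpus, max_words=12):
    """Yield (sentence, matched phenomena) for short corpus sentences.

    The output is a list of candidates only; a linguist would still
    check each one before it enters a test suite.
    """
    for sentence in corpus:
        if len(sentence.split()) > max_words:
            continue  # prefer short sentences that isolate a phenomenon
        tags = [name for name, pattern in PATTERNS.items()
                if pattern.search(sentence)]
        if tags:
            yield sentence, tags


# Usage on a three-sentence toy corpus.
corpus = [
    "The file was parsed by the tool.",
    "Dogs bark.",
    "Cats and dogs sleep.",
]
found = list(candidates(corpus))
```

The automatic step narrows the corpus down; the manual step supplies the judgement, which is why the process is only semi-automatic.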

Finally, the project intends to design and develop a relational database for storing and manipulating test suite data. The annotated test suite fragments built during the project will be stored in that database.
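
A relational store for annotated test items could look roughly like the sketch below, which uses Python's standard `sqlite3` module with an in-memory database. The table name `test_item`, its columns and the helper functions are assumptions chosen for illustration; the project's actual schema is not described here.

```python
import sqlite3


def create_db():
    """Create an in-memory database with one hypothetical test_item table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE test_item (
            id         INTEGER PRIMARY KEY,
            language   TEXT    NOT NULL,  -- e.g. 'en', 'fr', 'de'
            text       TEXT    NOT NULL,
            phenomenon TEXT    NOT NULL,  -- annotation label
            wellformed INTEGER NOT NULL,  -- 1 = grammatical, 0 = not
            weight     REAL    NOT NULL DEFAULT 1.0
        )
        """
    )
    return conn


def add_item(conn, language, text, phenomenon, wellformed, weight=1.0):
    """Insert one annotated test sentence."""
    conn.execute(
        "INSERT INTO test_item (language, text, phenomenon, wellformed, weight)"
        " VALUES (?, ?, ?, ?, ?)",
        (language, text, phenomenon, int(wellformed), weight),
    )


def items_for(conn, language, phenomenon):
    """Return (text, wellformed) pairs for one language and phenomenon."""
    rows = conn.execute(
        "SELECT text, wellformed FROM test_item"
        " WHERE language = ? AND phenomenon = ? ORDER BY id",
        (language, phenomenon),
    )
    return [(text, bool(flag)) for text, flag in rows]


# Usage: store three items and pull out the English agreement tests.
conn = create_db()
add_item(conn, "en", "The parsers work.", "agreement", True)
add_item(conn, "en", "The parsers works.", "agreement", False)
add_item(conn, "de", "Der Parser funktioniert.", "agreement", True)
english_agreement = items_for(conn, "en", "agreement")
```

Keeping the annotation in columns makes it easy to extract a sub-suite for a given language or phenomenon with a single query.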

Exploitation and Future Prospects

The main result of this project will be the guidelines and the methodology for test suite construction, which can be used in different NLP application fields and systems. It is expected that a set of guidelines will facilitate the interpretation of test suites and enhance their portability. This will be of direct benefit to all those companies and institutions that currently spend a considerable amount of time and effort in building test suites for their own purposes.

The results are also likely to be useful for several areas of linguistic research, since they provide a catalogue of linguistic data of potential value to theoretical and empirical work in linguistics.

All project results will become publicly available. The tools will have a high degree of portability, allowing for easy integration into a common framework (e.g. ALEP). The availability of project results will be widely publicised at conferences, evaluation workshops and in magazines in the field of NLP, in order to create optimal conditions for exploitation by a large number of users.

Field of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.


Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposals

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "type of action") inside a programme with common features. It specifies the scope of what is funded, the reimbursement rate, the specific evaluation criteria to qualify for funding, and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

University of Essex
EU contribution
No data
Address
Wivenhoe Park
CO4 3SQ Colchester
United Kingdom


Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (3)
