The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies



Please note that the project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet, which is updated on a regular basis with public deliverables, etc.

FAUST - Feedback for User adaptive Statistical Translation
247762 - STREP


At a glance

ICT-2009.2.2 - Language based interaction

Unlike human beings, current machine translation (MT) systems do not learn from their mistakes. In particular, they are unable to improve translation fluency by exploiting feedback provided by end users. The major challenges are to automatically identify user feedback of value and to devise mechanisms that immediately affect the behaviour of the statistical MT system, so that subsequent users do not face similar problems.

Challenge

As translation technology is brought ever closer to its community of users, there is strong potential for creating collaborative interaction between translators, casual active users, and technology developers. However, the limited experience to date in this area shows little success. For example, on certain commercial sites millions of text passages are translated monthly and users have the facility to provide feedback and suggest means for improving the automatic translation of any given sentence (a similar feedback mechanism is implemented by Google). Unfortunately, this feedback cannot be exploited for three reasons:

  • User feedback is noisy, with more than 90% of the suggestions being erroneous, incomplete, or ambiguous.
  • No research published to date makes explicit how statistical translation and language models can be adapted to benefit from the explicit and/or implicit feedback provided by web users.
  • No mechanisms exist for identifying the user feedback of value (high confidence) or for immediately affecting the behaviour of a statistical MT system so that subsequent users do not run into the same problem.

As a consequence, casual users of translation are relegated to a passive role in the translation process, even if these casual users are active content producers.

Goal

The project aims at improving the fluency and the performance of leading commercial MT systems, Language Weaver and Reverso, through real-time exploitation of user feedback.

The project will:

  • deploy web-oriented feedback collection mechanisms that automatically identify feedback of good quality;
  • automatically acquire data collections to study translation as informed by the user feedback;
  • develop mechanisms for incorporating user feedback into the MT engines;
  • create novel automatic metrics of translation quality;
  • integrate natural language generation directly into MT to improve translation fluency.

Scientific Innovation

The developed system will go beyond current mechanisms of feedback exploitation because it will handle users' improvement suggestions at sub-sentence level and process them automatically, and hence at a much larger scale. This is made possible by automatically identifying feedback of good quality using collaborative filtering techniques.
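
To make the idea of selecting feedback of good quality more concrete, the sketch below keeps a suggested correction only when several distinct users independently propose the same text for the same segment. This is a simple agreement-based filter, a much cruder stand-in for the collaborative filtering techniques mentioned above, and it is not the project's actual method. The function names, the (user, segment, suggestion) input format and the two thresholds are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize(text):
    """Lowercase and collapse whitespace so trivially different suggestions match."""
    return " ".join(text.lower().split())


def select_trusted_feedback(feedback, min_users=2, min_share=0.5):
    """Return {segment: suggestion} for suggestions backed by enough distinct users.

    feedback: iterable of (user_id, segment, suggestion) tuples (hypothetical format).
    min_users and min_share are illustrative thresholds, not values from the project.
    """
    # segment -> normalized suggestion -> set of users who proposed it
    votes = defaultdict(lambda: defaultdict(set))
    for user_id, segment, suggestion in feedback:
        votes[segment][normalize(suggestion)].add(user_id)

    trusted = {}
    for segment, by_suggestion in votes.items():
        # Number of distinct users who gave any feedback on this segment
        total_users = len(set().union(*by_suggestion.values()))
        # Suggestion with the most distinct supporters
        best, supporters = max(by_suggestion.items(), key=lambda kv: len(kv[1]))
        if len(supporters) >= min_users and len(supporters) / total_users >= min_share:
            trusted[segment] = best
    return trusted


# Made-up example data: two users agree on one segment, one user is alone on another.
example_feedback = [
    ("u1", "la maison bleue", "the blue house"),
    ("u2", "la maison bleue", "The  blue house"),
    ("u3", "la maison bleue", "blue the house"),
    ("u4", "bonjour", "hello"),
]
print(select_trusted_feedback(example_feedback))
```

In this made-up example only the suggestion for "la maison bleue" is kept, because two of the three users who commented on that segment agree on it, while the single suggestion for "bonjour" has no independent confirmation. A real system would presumably also weigh how reliable each user has been in the past, which this sketch ignores.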

The result

The project will develop a new technology for improving statistical MT systems and specify metrics of translation quality that reflect preferences learned from user feedback.

Impact

Improving the fluency of the translations produced by MT systems will ease the acceptance of machine translation among everyday users, and hence its use and deployment.

Co-ordinator

Contact Person:

Name: Keith Cann

Tel: +44 1223 333543

Fax: +44 1223 332988

E-mail: ecapplications@rsd.cam.ac.uk

Organisation: University of Cambridge





This page is maintained by: Susan Fraser (email removed)