Please note that the project factsheets will no longer be updated. All information relevant to the project can be found on the CORDIS factsheet. This is updated on a regular basis with public deliverables, etc.
FAUST - Feedback for User adaptive Statistical Translation
247762 - STREP
At a glance
ICT-2009.2.2 - Language based interaction
Unlike human beings, current machine translation (MT) systems do not learn from their mistakes. In particular, they cannot improve translation fluency by exploiting feedback provided by end users. The major challenges are to automatically identify user feedback of value and to develop mechanisms that immediately affect the behaviour of the statistical MT system, so that subsequent users do not face similar problems.
As translation technology is brought ever closer to its community of users, there is strong potential for creating collaborative interaction between translators, casual active users, and technology developers. However, the limited experience to date in this area shows little success. For example, on certain commercial sites millions of text passages are translated monthly and users have the facility to provide feedback and suggest means for improving the automatic translation of any given sentence (a similar feedback mechanism is implemented by Google). Unfortunately, this feedback cannot be exploited for three reasons:
- User feedback is noisy, with more than 90% of the suggestions being erroneous, incomplete, or ambiguous.
- No research published to date makes explicit how statistical translation and language models can be adapted to benefit from the explicit and/or implicit feedback provided by web users.
- No mechanisms exist for identifying the user feedback of value (high confidence) or for immediately affecting the behaviour of a statistical MT system so that subsequent users do not run into the same problem.
As a consequence, casual users of translation are relegated to a passive role in the translation process, even if these casual users are active content producers.
The project aims at improving the fluency and the performance of leading commercial MT systems, Language Weaver and Reverso, through real-time exploitation of user feedback.
The project will deploy web-oriented feedback collection mechanisms that automatically identify feedback of good quality, automatically acquire data collections to study translation as informed by the user feedback, develop mechanisms for incorporating user feedback into the MT engines, create novel automatic metrics of translation quality and integrate natural language generation directly into MT to improve translation fluency.
The developed system will go beyond current mechanisms of feedback exploitation because it will handle users' improvement suggestions at sub-sentence level and process them automatically, and hence at a much larger scale. This is made possible by automatically identifying feedback of good quality using collaborative filtering techniques.
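As a rough illustration of how feedback of good quality might be separated from noise, the sketch below keeps a suggested correction only when enough distinct users independently propose the same one. This is a simple agreement vote standing in for the collaborative filtering techniques mentioned above, not the project's actual method; the function name, the tuple layout and the thresholds are all assumptions made for the example.

```python
from collections import defaultdict


def select_trusted_feedback(suggestions, min_votes=3, min_agreement=0.6):
    """Keep corrections that enough distinct users agree on.

    suggestions: iterable of (user_id, source_segment, proposed_translation).
    Returns {source_segment: proposed_translation} for accepted corrections.
    All names and thresholds here are hypothetical.
    """
    # For each source segment, record which users backed which proposal.
    # Proposals are compared after trimming whitespace and lowercasing.
    backers = defaultdict(lambda: defaultdict(set))
    for user_id, source, proposal in suggestions:
        backers[source][proposal.strip().lower()].add(user_id)

    accepted = {}
    for source, proposals in backers.items():
        # Number of distinct users who commented on this segment.
        total_users = len(set().union(*proposals.values()))
        # The proposal with the most distinct supporters.
        best, users = max(proposals.items(), key=lambda kv: len(kv[1]))
        enough_votes = len(users) >= min_votes
        enough_agreement = len(users) / total_users >= min_agreement
        if enough_votes and enough_agreement:
            accepted[source] = best
    return accepted


if __name__ == "__main__":
    demo = [
        ("u1", "gros chat", "big cat"),
        ("u2", "gros chat", "Big cat "),
        ("u3", "gros chat", "big cat"),
        ("u4", "gros chat", "fat dog"),
        ("u5", "bonjour", "hi"),
    ]
    print(select_trusted_feedback(demo))
```

Working on short segments rather than whole sentences mirrors the sub-sentence granularity described above. A real system would presumably also weight users by their past reliability, which this sketch leaves out.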
The project will develop a new technology for improving statistical MT systems and specify metrics of translation quality that reflect preferences learned from user feedback.
Improving the fluency of MT output will ease the acceptance of machine translation among everyday users, and hence encourage its use and deployment.
Name: Keith Cann
Tel: +44 1223 333543
Fax: +44 1223 332988
Organisation: University of Cambridge
This page is maintained by: Susan Fraser