Skip to main content
European Commission logo
English English
CORDIS - EU research results
CORDIS
CORDIS Web 30th anniversary CORDIS Web 30th anniversary
Content archived on 2024-05-28

Pattern REcognition-based Statistically Enhanced MT

Project description


Language-based interaction

This proposal describes PRESEMT, a flexible and adaptable MT system, based on a language-independent method, whose principles ensure easy portability to new language pairs. This method attempts to overcome well-known problems of other MT approaches, e.g. bilingual corpora compilation or creation of new rules per language pair. PRESEMT will address the issue of effectively managing multilingual content and is expected to suggest a language-independent machine-learning-based methodology. The key aspects of PRESEMT involve syntactic phrase-based modelling, pattern recognition approaches (such as extended clustering or neural networks) or game theory techniques towards the development of a language-independent analysis, evolutionary algorithms for system optimisation. It is intended to be of a hybrid nature, combining linguistic processing with the positive aspects of corpus-driven approaches, such as SMT and EBMT.In order for PRESEMT to be easily amenable to new language pairs, relatively inexpensive, readily available language resources as well as bilingual lexica will be used. The translation context will be modelled on phrases, as they have been proven to improve the translation quality. Phrases will be produced via a semi-automatic and language-independent process of morphological and syntactic analysis, removing the need of compatible NLP tools per language pair. Parallelisation of the main translation processes will be investigated in order to reach a fast, high-quality translation system. Furthermore, the optimisation and personalisation of the system parameters via automated processes (such as GAs or swarm intelligence) will be studied. To allow for user adaptability, all the corpora used in PRESEMT will be retrieved from web-based sources via the system platform, while the user feedback will be integrated through the use of appropriate interactive interfaces. PRESEMT is expected to be easily customisable to both new language pairs and specific sublanguages.

Call for proposal

FP7-ICT-2009-4
See other projects for this call

Coordinator

ATHINA-EREVNITIKO KENTRO KAINOTOMIAS STIS TECHNOLOGIES TIS PLIROFORIAS, TON EPIKOINONION KAI TIS GNOSIS
EU contribution
€ 738 530,00
Address
ARTEMIDOS 6 KAI EPIDAVROU
151 25 Maroussi
Greece

See on map

Region
Αττική Aττική Βόρειος Τομέας Αθηνών
Activity type
Research Organisations
Administrative Contact
GEORGE TAMBOURATZIS (Dr.)
Links
Total cost
No data

Participants (5)