Objective

As evidenced by a number of machine translation competitions, statistical machine translation is producing encouraging results for language pairs where large corpora of previously translated texts are available for training. In practice, however, the availability of such data is often a severe bottleneck. We therefore propose a methodology that requires only a bilingual dictionary and monolingual text corpora of the source and target languages, which should considerably relieve the data acquisition problem. We suggest a two-stage procedure. In the first stage, we create a database of translation equivalents by extracting them from a pair of comparable monolingual corpora, using a bilingual dictionary in combination with automatically generated thesauri of related words. In the second stage, we translate new sentences by retrieving appropriate translation equivalents from the database and merging them using a combinatorial approach.

Field of science: humanities > languages and literature > general language studies; natural sciences > computer and information sciences > databases

Keywords: computational linguistics, corpus linguistics, statistical machine translation

Programme(s): FP7-PEOPLE - Specific programme "People" implementing the Seventh Framework Programme of the European Community for research, technological development and demonstration activities (2007 to 2013)

Topic(s): PEOPLE-2007-2-1.IEF - Marie Curie Action: "Intra-European Fellowships for Career Development"

Call for proposal: FP7-PEOPLE-2007-2-1-IEF

Funding scheme: MC-IEF - Intra-European Fellowships (IEF)

Coordinator: UNIVERSITAT ROVIRA I VIRGILI
EU contribution: € 207 884,12
Address: CARRER DE ESCORXADOR, 43003 Tarragona, Spain
Region: Este > Cataluña > Tarragona
Activity type: Higher or Secondary Education Establishments
Administrative contact: M. Dolores Jimenez Lopez (Dr.)
Total cost: No data
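
The first stage of the objective, extracting translation equivalents from comparable (non-parallel) corpora with the help of a bilingual dictionary, can be illustrated with a minimal sketch. It is not the project's method: it swaps in a plain context-vector comparison, leaves out the automatically generated thesauri, and does not cover the second stage. The function names, the toy German and English sentences, and the seed dictionary are all hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """For each word, count the words occurring within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def translate_vector(vector, seed_dictionary):
    """Map a source-language context vector onto target-language words.
    Context words missing from the seed dictionary are dropped."""
    translated = Counter()
    for word, count in vector.items():
        if word in seed_dictionary:
            translated[seed_dictionary[word]] += count
    return translated


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_candidates(source_word, src_vectors, tgt_vectors, seed_dictionary):
    """Rank target words by how closely their contexts match the translated
    contexts of `source_word`; words with zero similarity are omitted."""
    query = translate_vector(src_vectors.get(source_word, Counter()), seed_dictionary)
    scores = [(cosine(query, vec), word) for word, vec in tgt_vectors.items()]
    return [word for score, word in sorted(scores, reverse=True) if score > 0]


# Toy comparable corpora (invented; the sentences are not translations of each other).
source_corpus = [
    "der hund jagt die katze",
    "die katze trinkt milch",
    "der hund trinkt wasser",
]
target_corpus = [
    "the dog chases the cat",
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]
# Hypothetical seed dictionary; "milch" is deliberately missing.
seed_dictionary = {"hund": "dog", "katze": "cat", "trinkt": "drinks", "wasser": "water"}

src_vectors = context_vectors(source_corpus)
tgt_vectors = context_vectors(target_corpus)
candidates = rank_candidates("milch", src_vectors, tgt_vectors, seed_dictionary)
print(candidates[0])  # best-scoring candidate for "milch"
```

On this toy data the best-scoring candidate for "milch" is "milk", because both words appear next to the (translated) neighbours "cat" and "drinks". Real corpora would need far more text, frequency weighting, and a filter for function words before such a ranking becomes useful.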