CORDIS - EU research results

A prototype system for obtaining and managing training data for multilingual learning

CORDIS provides links to public deliverables and publications of HORIZON projects.

Links to deliverables and publications from FP7 projects, as well as links to some specific result types such as datasets and software, are dynamically retrieved from OpenAIRE.

Deliverables

Publications

EXECUTE: A Multilingual Benchmark for LLM Token Understanding

Author(s): Lukas Edman, Helmut Schmid, Alexander Fraser
Published in: Findings of the Association for Computational Linguistics: ACL 2025, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.FINDINGS-ACL.95

From Unaligned to Aligned: Scaling Multilingual LLMs with Multi-Way Parallel Corpora

Author(s): Yingli Shen, Wen Lai, Shuo Wang, Ge Gao, Kangyang Luo, Alexander Fraser, Maosong Sun
Published in: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.EMNLP-MAIN.374

Improving Parallel Sentence Mining for Low-Resource and Endangered Languages

Author(s): Shu Okabe, Katharina Hämmerl, Alexander Fraser
Published in: Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.ACL-SHORT.17

Mask and You Shall Receive: Optimizing Masked Language Modeling For Pretraining BabyLMs

Author(s): Lukas Edman, Alexander Fraser
Published in: Proceedings of the First BabyLM Workshop, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.BABYLM-MAIN.31

Findings of the WMT 2025 Shared Task LLMs with Limited Resources for Slavic Languages: MT and QA

Author(s): Shu Okabe, Daryna Dementieva, Marion Di Marco, Lukas Edman, Katharina Haemmerl, Marko Měškank, Anita Hendrichowa, Alexander Fraser
Published in: Proceedings of the Tenth Conference on Machine Translation, 2025
Publisher: Association for Computational Linguistics
DOI: 10.18653/V1/2025.WMT-1.27
