CORDIS - EU research results

High Performance Language Technologies

CORDIS provides links to public deliverables and publications of HORIZON projects.

Links to deliverables and publications from FP7 projects, as well as links to some specific result types such as datasets and software, are dynamically retrieved from OpenAIRE.

Deliverables

Initial release of monolingual and parallel data sets

This deliverable consists of an initial set of textual data acquired from web and non-web sources, in both monolingual and parallel parts, after the cleaning done in WP2.

Software for cleaning data sets

Free and open-source software will be released on GitHub.

First language models trained

Language models will be made available for download; however, they may not have been trained on all of the data or on the cleanest data.

Translation models for select language pairs

Models trained using the pipeline will be made available for download.

Publications

FinGPT: Large Generative Models for a Small Language

Author(s): Luukkonen, Risto; Komulainen, Ville; Luoma, Jouni; Eskelinen, Anni; Kanerva, Jenna; Kupari, Hanna-Mari; Ginter, Filip; Laippala, Veronika; Muennighoff, Niklas; Piktus, Aleksandra; Wang, Thomas; Tazi, Nouamane; Scao, Teven Le; Wolf, Thomas; Suominen, Osma; Sairanen, Samuli; Merioksa, Mikko; Heinonen, Jyrki; Vahtola, Aija; Antao, Samuel; Pyysalo, Sampo
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, ISBN 979-8-89176-060-8
Publisher: Association for Computational Linguistics
DOI: 10.48550/arxiv.2311.05640

Towards Interpretable Mental Health Analysis with Large Language Models

Author(s): Yang, Kailai; Ji, Shaoxiong; Zhang, Tianlin; Xie, Qianqian; Kuang, Ziyan; Ananiadou, Sophia
Published in: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023, ISBN 979-8-89176-060-8
Publisher: Association for Computational Linguistics
DOI: 10.48550/arxiv.2304.03347

Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca

Author(s): Chen, Pinzhen; Ji, Shaoxiong; Bogoychev, Nikolay; Kutuzov, Andrey; Haddow, Barry; Heafield, Kenneth
Published in: EACL, 2023, ISBN 979-8-89176-088-2
Publisher: Association for Computational Linguistics
DOI: 10.48550/arxiv.2309.08958

Scaling Data-Constrained Language Models

Author(s): Muennighoff, Niklas; Rush, Alexander M.; Barak, Boaz; Scao, Teven Le; Piktus, Aleksandra; Tazi, Nouamane; Pyysalo, Sampo; Wolf, Thomas; Raffel, Colin
Published in: 2023, ISSN 2331-8422
Publisher: NeurIPS'23
DOI: 10.48550/arxiv.2305.16264
