Project description
New cross-lingual workflow transfer for large-scale content
Language translation is a complex endeavour. It is hard to find a one-to-one equivalent when transferring content from one language to another, as each language has its own system for conveying concepts. The EU-funded WIKOLLECT project will explore this issue by drawing on a synergy between natural language processing, language learning, and crowdsourcing. It will develop a workflow for the large-scale transfer of high-quality content across languages. The workflow comprises four cyclic steps that automatically identify content in the source language that is missing in the target language, generate potential translations, have them evaluated, and promote the best ones. Applied to the Italian and German editions of Wiktionary, the free-content multilingual online dictionary, the workflow will promote the fair re-use of content across languages and facilitate knowledge transfer.
Objective
WiKollect aims to create a workflow for the large-scale transfer of high-quality content across languages. The workflow is divided into four cyclic steps. In step (i), an automatic model will identify content that is available in a document in language A but missing from a document on the same topic in language B. In step (ii), candidates to fill the gaps in the language B document will be generated automatically. In step (iii), these candidates will be evaluated manually by language learners. In step (iv), the content identified as high-quality will be promoted to fill the gaps in the language B document. WiKollect will take advantage of the barely exploited synergy among natural language processing, language learning, and crowdsourcing. To address the research challenges posed by the design and implementation of the workflow, it will create an innovative and reusable hybrid intelligence architecture combining (a) artificial intelligence (such as machine learning and natural language processing) to identify content worth transferring across languages and to generate potential translations, and (b) human intelligence (by means of implicit crowdsourcing), relying on a crowd of language learners to flag good content. Beyond the research products generated by addressing each of the four steps, WiKollect will create several by-products. Language learning exercises on specific topics and at specific complexity levels will be generated. The fair re-use of content across languages will be promoted through the mass production of high-quality content. During the MSC period, WiKollect will target the generation of Wiktionary content in Italian and German. Still, the workflow is flexible and extensible and can be applied to other documents (e.g. Wikipedia articles, news) and languages in the near future.
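The four cyclic steps described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the project's implementation: the `Entry` class, the language-independent sense identifiers, the caller-supplied `translate` function, and the majority-vote threshold are all hypothetical stand-ins for the automatic models and the learner crowd.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Entry:
    """Toy dictionary entry (hypothetical): sense id -> gloss in one language."""
    language: str
    senses: dict[str, str] = field(default_factory=dict)


def find_gaps(source: Entry, target: Entry) -> list[str]:
    """Step (i): sense ids present in the source entry but missing in the target."""
    return sorted(sid for sid in source.senses if sid not in target.senses)


def generate_candidates(source: Entry, gaps: list[str],
                        translate: Callable[[str], str]) -> dict[str, str]:
    """Step (ii): one candidate gloss per gap, from a caller-supplied translator."""
    return {sid: translate(source.senses[sid]) for sid in gaps}


def evaluate(candidates: dict[str, str], votes: dict[str, list[bool]],
             threshold: float = 0.5) -> dict[str, str]:
    """Step (iii): keep candidates whose share of positive learner votes
    exceeds the threshold; candidates without votes are not accepted."""
    accepted = {}
    for sid, gloss in candidates.items():
        ballots = votes.get(sid, [])
        if ballots and sum(ballots) / len(ballots) > threshold:
            accepted[sid] = gloss
    return accepted


def promote(target: Entry, accepted: dict[str, str]) -> None:
    """Step (iv): fill the gaps in the target entry with accepted candidates."""
    target.senses.update(accepted)


def run_cycle(source: Entry, target: Entry, translate: Callable[[str], str],
              votes: dict[str, list[bool]], threshold: float = 0.5) -> dict[str, str]:
    """Run steps (i) to (iv) once and return what was promoted."""
    gaps = find_gaps(source, target)
    candidates = generate_candidates(source, gaps, translate)
    accepted = evaluate(candidates, votes, threshold)
    promote(target, accepted)
    return accepted


# Usage with made-up data: a lookup table stands in for the translation model.
italian = Entry("it", {"s1": "animale domestico", "s2": "persona vile"})
german = Entry("de", {"s1": "Haustier"})
toy_translations = {"persona vile": "feige Person"}
promoted = run_cycle(italian, german, lambda g: toy_translations[g],
                     votes={"s2": [True, True, False]})
```

Because the workflow is cyclic, `run_cycle` could be called again once new votes arrive; gaps that were not accepted in one pass remain gaps and are reconsidered in the next.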
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.
- natural sciences > computer and information sciences > data science > natural language processing
- natural sciences > computer and information sciences > artificial intelligence > machine learning
Programme(s)
Multi-annual funding programmes that define the EU’s priorities for research and innovation.
- H2020-EU.1.3. - EXCELLENT SCIENCE - Marie Skłodowska-Curie Actions (main programme)
- H2020-EU.1.3.2. - Nurturing excellence by means of cross-border and cross-sector mobility
Topic(s)
Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.
Funding Scheme
Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.
MSCA-IF - Marie Skłodowska-Curie Individual Fellowships (IF)
Call for proposal
Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.
H2020-MSCA-IF-2018
Coordinator
Net EU financial contribution. The sum of money that the participant receives, deducted by the EU contribution to its linked third party. It considers the distribution of the EU financial contribution between direct beneficiaries of the project and other types of participants, like third-party participants.
39100 BOLZANO
Italy
The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.