CORDIS - EU research results
Content archived on 2024-05-27

Developing Multilingual Web-scale Language Technologies

Objective

This project aims to provide meaning to the web. MEANING will enhance current web applications by automatically increasing the linguistic depth and breadth of existing multilingual resources and by devising improved concept-based Natural Language Processing (NLP) technologies using those resources. Current web access applications are based on words; MEANING will open the way for access to the Multilingual Web based on concepts, providing applications with capabilities that significantly exceed those currently available. MEANING will facilitate development of concept-based open-domain Internet applications. Furthermore, MEANING will supply a common conceptual structure to Internet documents, thus facilitating knowledge management of web content.

OBJECTIVES
To build the next generation of intelligent open-domain Human Language Technology (HLT) application systems, we need to solve two complementary intermediate tasks: Word Sense Disambiguation (WSD) and large-scale enrichment of Lexical Knowledge Bases. WSD is the task of assigning the appropriate meaning (sense) to a given word in a text or discourse, and it is one of the most important open problems in NLP. However, progress is difficult due to the following paradox:
1) To enrich Lexical Knowledge Bases, we need to acquire information from corpora that have been accurately tagged with word senses.
2) To achieve accurate WSD, we need far more linguistic and semantic knowledge than is available in current lexical knowledge bases (e.g. WordNets).
The major objective of MEANING is to provide innovative technology to solve this problem.
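To make the WSD task concrete, here is a minimal sketch of a gloss-overlap disambiguator in the spirit of the simplified Lesk algorithm. It is an illustration only and not the method developed in MEANING: the sense labels, glosses, stopword list and function names are all hypothetical, and a real system would draw its senses from a lexical knowledge base such as a WordNet.

```python
# Toy sense inventory. The labels and glosses are hypothetical examples,
# not entries taken from WordNet or EuroWordNet.
SENSES = {
    "bank": {
        "bank#finance": "an institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a river or lake",
    }
}

# Small hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "and", "the", "of", "or", "that", "to", "in", "on", "by", "at"}


def tokens(text):
    """Lowercase, split on whitespace and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, context):
    """Return the sense of `word` whose gloss shares the most words with `context`.

    Ties are resolved in favour of the sense listed first.
    """
    context_words = tokens(context)
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_words & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("bank", "she sat on the bank of the river"))
print(disambiguate("bank", "he went to the bank to deposit money"))
```

The sketch also shows why the paradox arises: the disambiguator is only as good as the glosses it can consult, and richer sense descriptions would in turn have to be learned from text that is already sense-tagged.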

DESCRIPTION OF WORK
MEANING will develop concept-based technologies and resources through large-scale processing over the web, robust and fast machine learning algorithms, very large lexical resources and new strategies for combining them. MEANING will treat the web as a (huge) corpus to learn from, since even the largest conventional corpora available (e.g. the British National Corpus) are not large enough to yield reliable, sufficiently detailed information about language behaviour. Moreover, most European languages do not have large or diverse enough corpora available. We will use a combination of Machine Learning and novel Knowledge-Based techniques in order to enrich the structure of the WordNets in different domains (subsets of the web) in five European languages: English, Italian, Spanish, Catalan and Basque. MEANING will produce:
a) A Tool Set that uses the semantic knowledge of EuroWordNet to automatically obtain from the web large collections of examples for each particular word sense.
b) A Tool Set for enriching EuroWordNet using the knowledge acquired automatically from the Web.
c) A Tool Set for accurately selecting the senses of open-class words in the languages involved in the project.
MEANING will also develop a Multilingual Central Repository to maintain compatibility between WordNets of different languages and versions, past and new. The knowledge acquired from each language will be consistently uploaded to the Multilingual Central Repository and ported over to the local WordNets involved in the project. MEANING will also produce a semantically annotated corpus for each WordNet word sense, that is, a Multilingual Web corpus made up of semantically annotated corpora containing concept and domain labels.

Field of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: https://op.europa.eu/it/web/eu-vocabularies/euroscivoc.

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of Action") inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

UNIVERSITAT POLITECNICA DE CATALUNYA
EU contribution
No data
Address
JORDI GIRONA 31
08034 BARCELONA
Spain

Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the overall project budget.

No data

Participants (3)