CORDIS - EU research results
Content archived on 2024-06-18

Analysis of Natural Language for Real World Applications

Objective

Natural language understanding, information extraction, and machine translation are critical for human-machine interaction and key technologies in many everyday applications (e.g. search engines, mobile devices, robots). Natural language understanding systems transform spoken language or written texts into syntactic and semantic structures. Critically, these systems need to be able to work flexibly on many different text genres.
A critical challenge for current syntactic analyzers in real-world applications is adapting flexibly to different language domains. The problem arises because current syntactic analyzers are trained primarily on syntactically annotated newspaper text. In particular, the major resource for training syntactic analyzers for English is an annotated text collection called the Penn Treebank, which contains texts from only one genre: economic news. However, syntactic analyzers are applied to a wide range of text genres, such as emails, newsgroups, blogs, consumer reviews, newspapers with mostly non-economic text, and spoken language. When applied to these texts, the error rate doubles. As a result, the syntactic analyzer assigns wrong syntactic structures to input sentences; for example, it confuses the subject and object of a sentence. It is then no longer able to answer the critical questions in natural language understanding: who does what to whom, why, and when. For real-world applications, this means that, for instance, a robot may fail to understand the instructions or commands given by a customer.

The aim of this proposal is to reduce this gap and to provide techniques that achieve higher accuracy and allow adaptation to out-of-domain genres in an easy and economically acceptable way. Furthermore, in an interdisciplinary and innovative fashion, we will combine syntactic analysis with related analysis techniques from the field of speech recognition.

Call for proposal

FP7-PEOPLE-2013-CIG

Coordinator

THE UNIVERSITY OF BIRMINGHAM
EU contribution
€ 100,000.00
Address
Edgbaston
B15 2TT Birmingham
United Kingdom

Region
West Midlands (England) West Midlands Birmingham
Activity type
Higher or Secondary Education Establishments
Administrative contact
Xavier Rodde (Mr.)
Total cost
No data