Content archived on 2022-12-23

GLOSSER

Objective

Glosser aims to provide computerized assistance to intermediate-level language learners.

A substantial barrier to the free flow of ideas and technologies is linguistic: ideas expressed in a particular language are accessible only to its speakers. Since ever-increasing numbers of people encounter texts electronically, automated methods of language processing may be brought to bear to alleviate this problem. Specifically, assistance may take the form of (i) access to bilingual dictionaries; (ii) analysis of the grammatical content of morphological information; (iii) access to similar examples in (bilingual) corpora.

For example, we imagine speakers of Estonian, Bulgarian or Hungarian, intermediate language learners/users of English, reading a software manual on the screen. Upon encountering an unknown word or an unfamiliar use of a known word, e.g. "reverts" as in "This action REVERTS the buffer to the form stored on disk", the user should be able to click on it to invoke on-line help (that is, follow a hyperlink which is constructed dynamically). The help facility should be prepared to provide: (i) the entry for the word 'revert' (note that the morphological 's' must be stripped) in a bilingual Estonian/English, etc. dictionary; (ii) an indication of the morphological content of the word form (the 's' indicates that it is 3rd person singular); and (iii) an invitation to seek out other examples of the word in on-line corpora (ideally in bilingual corpora).
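The three kinds of help described above can be illustrated with a minimal sketch. It is not the project's implementation: the dictionary, the corpus sentences, the single suffix-stripping rule and the names `analyse` and `lookup` are all assumptions made for illustration, and the gloss is a placeholder rather than a real translation.

```python
from dataclasses import dataclass

# Toy data, invented for illustration only. The gloss is a placeholder,
# not an actual Estonian, Bulgarian or Hungarian translation.
DICTIONARY = {"revert": "<target-language gloss of 'revert'>"}
CORPUS = [
    "This action reverts the buffer to the form stored on disk.",
    "Press the key to revert all changes.",
    "The scroll bar moves the view.",
]


@dataclass
class Help:
    lemma: str        # (i) the dictionary headword
    gloss: str        # (i) the bilingual dictionary entry
    morphology: str   # (ii) what the word form's ending indicates
    examples: list    # (iii) corpus sentences containing the word


def analyse(word_form):
    """Return (lemma, morphology) candidates using one toy rule.

    Assumption: a final 's' on a known lemma marks 3rd person singular.
    A real analyser would also consider noun plurals and other endings.
    """
    form = word_form.lower()
    candidates = []
    if form in DICTIONARY:
        candidates.append((form, "base form"))
    if form.endswith("s") and form[:-1] in DICTIONARY:
        candidates.append((form[:-1], "3rd person singular (-s)"))
    return candidates


def lookup(word_form):
    """Assemble the help record(s) for a clicked word form."""
    results = []
    for lemma, morphology in analyse(word_form):
        # Crude example search: any token that starts with the lemma.
        examples = [
            sentence for sentence in CORPUS
            if any(tok.strip(".,").lower().startswith(lemma)
                   for tok in sentence.split())
        ]
        results.append(Help(lemma, DICTIONARY[lemma], morphology, examples))
    return results


for entry in lookup("REVERTS"):
    print(entry.lemma, "|", entry.morphology, "|", len(entry.examples), "examples")
```

The prefix match used for the example search is deliberately crude; it stands in for whatever corpus query a real system would use.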

A modest prototype demonstrating the capabilities above will be built using rule-based morphological analysis. To obviate the danger that purely morphological methods will discriminate too little in choosing relevant lemmata and word senses, we will employ tagging systems (either rule-based or stochastic) for part-of-speech and word-sense identification. The project will build this prototype while exploring further areas which will require extensions and elaborations:
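To make the role of tagging concrete, the following sketch shows how a part-of-speech tag could narrow down the lemmata proposed by morphological analysis. Everything in it is an assumption for illustration: the candidate table, the one-rule tagger and the function names are hypothetical, and a real system would use a full rule-based or stochastic tagger.

```python
# Candidate analyses for an ambiguous word form: (lemma, part of speech).
# Hypothetical output of a morphological analyser.
ANALYSES = {
    "saw": [("see", "VERB"), ("saw", "NOUN")],
}
DETERMINERS = {"a", "an", "the", "this", "that"}


def tag(tokens, index):
    """Toy contextual rule standing in for a real tagger:
    a word directly after a determiner is a NOUN, otherwise a VERB."""
    if index > 0 and tokens[index - 1].lower() in DETERMINERS:
        return "NOUN"
    return "VERB"


def select_lemma(tokens, index):
    """Keep only the lemmata whose part of speech agrees with the tag."""
    form = tokens[index].lower()
    candidates = ANALYSES.get(form, [])
    pos = tag(tokens, index)
    matching = [lemma for lemma, cand_pos in candidates if cand_pos == pos]
    # Fall back to every candidate if the tag rules all of them out.
    return matching or [lemma for lemma, _ in candidates]


print(select_lemma("She saw the file".split(), 1))
print(select_lemma("Use the saw".split(), 2))
```

Without the tag, both dictionary entries would have to be shown to the learner; with it, only the contextually plausible one remains.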

1) phonic information, including pointers to digital sound recordings in the dictionary entry.
2) constructing useful bilingual correspondences (rough dictionary equivalents) automatically through processing of bilingual corpora.
3) providing entries in a way sensitive to linguistic context--especially for treating compounds ("scroll bar position") and other multi-word lexemes ("keep track of").
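Context-sensitive lookup of compounds and multi-word lexemes (item 3) might be approached with a longest-match search around the clicked word. The sketch below is an assumption about one possible approach, not the project's method; the entry table, placeholder glosses and the name `find_entry` are hypothetical.

```python
# Hypothetical multi-word dictionary entries; glosses are placeholders.
MULTIWORD = {
    ("scroll", "bar"): "<gloss: scroll bar>",
    ("scroll", "bar", "position"): "<gloss: scroll bar position>",
    ("keep", "track", "of"): "<gloss: keep track of>",
}
MAX_LEN = max(len(key) for key in MULTIWORD)


def find_entry(tokens, index):
    """Return the longest multi-word entry that covers tokens[index],
    or None if the word is not part of any known multi-word lexeme."""
    tokens = [t.lower() for t in tokens]
    # Try the longest windows first so that "scroll bar position"
    # wins over "scroll bar".
    for length in range(MAX_LEN, 1, -1):
        # Slide over every window of this length that contains index.
        for start in range(max(0, index - length + 1), index + 1):
            window = tuple(tokens[start:start + length])
            if len(window) == length and window in MULTIWORD:
                return window
    return None


sentence = "Use the scroll bar position to keep track of the text".split()
print(find_entry(sentence, 3))
print(find_entry(sentence, 7))
print(find_entry(sentence, 9))
```

This sketch only matches contiguous, uninflected sequences; a variant such as "keeps careful track of" would need morphological analysis and gap handling on top of it.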

Topic(s)

Data not available

Call for proposals

Data not available

Funding scheme

CSC - Cost-sharing contracts

Coordinator

Rijksuniversiteit Groningen
EU contribution
No data
Address
Oude Kijk in 't Jatstraat 26
9700 AS Groningen
Netherlands


Total cost
No data

Participants (4)