How can we cope with all the content on the Web and make it available to interested people, regardless of the language(s) they speak and understand? The obvious answer is to teach computers how to understand and process written and spoken human language.
Human Language Technologies (HLT) span many research groups and disciplines, including natural language processing, speech technology, machine translation and information extraction. If all these strands could be brought together in a meaningful way, then perhaps computers could make sense of our many languages.
The European Commission has supported HLT for some 40 years now. Sustained effort between 1980 and 1990 resulted in some pioneering Machine Translation and Translation Memory technologies. EU support for HLT is now being revived, driven by renewed political commitment following the enlargement of the EU and by new challenges emerging from globalised markets. More and more commercial transactions are being done online, and more of the consumers using the Web do not speak English than do. While a few years ago English may have been seen as the lingua franca of the Web, the amount of online content in other languages has exploded, leaving English-language content covering only 29% of what is available online. Recent e-commerce statistics indicate that two out of three EU customers buy only in their own language. This suggests that language is a significant barrier to a truly Europe-wide digital single market. Of course, language barriers affect not only e-commerce activities but also access to virtually all online services.
Europe, with its people, skills and variety of languages, accounts for 50% of the worldwide language services market, and the experience and expertise are there to provide tangible results. However, there are several R&D issues which must be addressed in the immediate future in order to better meet the challenge.
This page is maintained by: Susan Fraser