The Community Research and Development Information Service - CORDIS
Information & Communication Technologies

Language Technologies

Building Babel: Lost in machine translation

Scientists have been trying to automatically translate languages for almost as long as computers have been in existence. So why is it so hard? This is the question posed in an article which appeared recently on the BBC website.

Added 09/03/2012

EPO and Google remove language barriers from patent documentation

The EPO announced its continued collaboration with Google, with the launch of the machine translation service, called Patent Translate, on the EPO's website.

Added 09/03/2012

New public service for searching scientific papers

The ACL Anthology Searchbench is a new public service for searching scientific publications in Computational Linguistics and Language Technology. The Searchbench provides sentence-semantic, full-text and bibliographic search across more than 22,000 papers, both scanned and native digital PDFs, up to 50 years old.
Users can search with subject-predicate-object statements. Passives, negation and predicate synonyms are resolved in semantic search. Search results can be highlighted in the original PDF with Acrobat. There is also a graphical citation browser that can be reached from the document view in the "Citations" tab. By clicking on the labeled edges between papers, users can see citations in their context. Online help is available.

Added 14/02/2012

Lang Tech News BETA now online!

Lang Tech News is a new, constantly evolving website that aims to bring together news and information about Language Technologies in Europe and the world. It covers a wide range of domains and topics including technologies, applications and industries. You can also follow it on Twitter at @LangTechNews, or subscribe to the daily newsletter. Registration is not required.

Added 19/01/2012

"The Internet: A lifeboat for endangered languages?"

EurActiv has published an article considering a renewal of linguistic diversity through the Internet.

Added 07/12/2011

SINEQUA has been highlighted as "an Innovative Company to Watch" in the area of Information Access

SINEQUA has been recognised as one of the very few Information Access vendors that are adapting and innovating in the information access market. See the full press release.

Added 22/7/2011

Eurobarometer publishes results of "User language preferences online" survey

In a press release issued today, Commissioner Neelie Kroes said: "If we are serious about making every European digital, we need to make sure that they can understand the web content they want. We are developing new technologies that can help people that cannot understand a foreign language." The results of the Eurobarometer survey show that nearly half of online consumers feel that they are missing interesting information about products available online because of language barriers. The EC currently manages 30 projects in this domain with funding of €67 million, and a further €50 million is to be committed this year. For more information, click on the links below:

User language preferences online - summary (PDF, 674 KB)

User language preferences online - full report (PDF, 3.6 MB)

Neelie Kroes' press release (PDF, 22 KB)

The publication of the survey has hit the headlines in a number of Member States. Use the iTranslate4 translation tool to read the articles!

Chinese-Language Sites growing

(Source: Relaxnews) Facebook was the most visited website in the world in January 2011, while the growing number of Chinese internet users has boosted Chinese-language sites, placing both Baidu and QQ in the top ten, according to Google's Ad Planner. Ad Planner helps companies track which websites are the most visited within certain demographics so that they can target their advertisements. Google itself is excluded from the data. The data for January, released this week, shows that across all countries the top two websites were social network Facebook and video sharing service YouTube. The top two sites in China were search engine Baidu and instant messaging site QQ; across all geographical regions these sites ranked sixth and ninth, respectively. Three other Chinese sites - an entertainment portal, an auction site and a video sharing site - ranked just outside the top ten, in 11th, 12th and 13th place, respectively. The big surprise in the rankings is the absence of one online retailer: though the site and its regional subsidiaries appeared in several geographic-specific searches, it was missing from the overall rankings. According to Google Ad Planner, the top ten sites (excluding Google) ranked by audience reach across all demographics and geographical regions for the month of January are:

01. Facebook (social networking)
02. YouTube (video sharing)
03. Yahoo (search engine/email/news)
04. Windows Live (email provider, search engine, instant messaging)
05. (instant messaging)
06. Baidu (Chinese search engine)
07. (blogging site)
08. (software)
09. QQ (Chinese language instant messaging)
10. (search engine)

Added 24/02/2011

DG Education and Culture announces the start of the CELAN project

The Multilingualism sector of DG EAC has launched the CELAN project. CELAN stands for "The Network for the Promotion of Language Strategies for Competitiveness and Employability". Its aim is to encourage dialogue between the business community and language practitioners. It is the outcome of the Business Platform for Multilingualism and starts its activities by launching a blog for regular updates and interaction with readers inside and outside the consortium. Promotion of and participation in the network are strongly encouraged!

Added 24/02/2011

META-NET launches an open consultation

META-NET has been hard at work laying the groundwork for the Multilingual Europe Technology Alliance and the production of a Strategic Research Agenda for European LT. The project is currently hosting open discussions on this topic on its website. Have your say on the work to date and share your visions for the future of LT by contributing to these discussions online and at events organised by META-NET. Keep up to date on its activities and discussions and connect with other Language Technology stakeholders in the community by joining the META alliance. You can also find META-NET on Facebook and LinkedIn.

Added 16/02/2011

MOSES - an ICT success story

Moses is a statistical machine translation system that allows you to automatically train translation models for any language pair. All you need is a collection of translated texts (a parallel corpus). An efficient search algorithm quickly finds the highest-probability translation among the exponential number of choices. The development of MOSES has been supported mainly by the EuroMatrix and EuroMatrixPlus projects (funded by the EC), and the system is rapidly proving itself in the machine translation community. TAUS recently reported on the impact MOSES is having.

For more information and articles on MOSES:

MOSES official website

Updated 24/02/2011
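
The MOSES item above mentions a search for the highest-probability translation among an exponential number of choices. The following is a minimal Python sketch of that idea, not the actual Moses decoder. It assumes a tiny hand-written phrase table (PHRASE_TABLE, hypothetical, with made-up probabilities), translates strictly left to right with no reordering and no language model, and uses dynamic programming so that it never has to enumerate every possible segmentation of the sentence.

```python
import math

# Hypothetical toy phrase table: source phrase -> [(target phrase, probability)].
# A real system would learn these entries from a parallel corpus.
PHRASE_TABLE = {
    ("das",): [("the", 0.7), ("that", 0.3)],
    ("haus",): [("house", 0.8), ("home", 0.2)],
    ("das", "haus"): [("the house", 0.9)],
    ("ist",): [("is", 1.0)],
    ("klein",): [("small", 0.6), ("little", 0.4)],
}


def decode(source_words, phrase_table, max_phrase_len=3):
    """Return (translation, log_probability) of the best monotone translation.

    Returns (None, -inf) if the sentence cannot be covered by the table.
    """
    n = len(source_words)
    # best[i] holds (log-probability, translation) for the first i source words.
    best = [(-math.inf, "")] * (n + 1)
    best[0] = (0.0, "")
    for end in range(1, n + 1):
        for start in range(max(0, end - max_phrase_len), end):
            prev_score, prev_text = best[start]
            if prev_score == -math.inf:
                continue  # the first `start` words cannot be translated
            src = tuple(source_words[start:end])
            for target, prob in phrase_table.get(src, []):
                score = prev_score + math.log(prob)
                if score > best[end][0]:
                    best[end] = (score, (prev_text + " " + target).strip())
    score, text = best[n]
    if score == -math.inf:
        return None, score
    return text, score


translation, log_prob = decode("das haus ist klein".split(), PHRASE_TABLE)
print(translation, round(math.exp(log_prob), 2))
```

With this toy table the best translation is "the house is small", with probability 0.9 x 1.0 x 0.6 = 0.54: the two-word phrase "das haus" beats translating "das" and "haus" separately (0.7 x 0.8 = 0.56 before the remaining words, against 0.9). Production decoders add reordering, a language model and pruning on top of this basic search.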

US Department of Defense unveils its Human Language Technologies (HLT) Showcase

The Showcase will take place from 10:00 a.m. to 4:00 p.m. on Thursday, December 2, 2010 at the Pentagon in Washington, D.C. The Human Language Technology Showcase will feature products and programs relating to machine translation tools, foreign language speech tools, language-in-image technologies, information discovery tools, linguist aids and translation services, and language learning educational tools. Attendees will have the opportunity to network, share ideas, put questions to industry experts and engage in hands-on demonstrations. In addition, informative presentations from experts in government and industry will be offered concurrently with the exhibits.

Added 09/11/2010

Lionbridge and TAUS Data Association (TDA) Collaborate to Bring Real-Time, Live Language Assets to the Cloud

Lionbridge and the TAUS Data Association announced a joint initiative to integrate Lionbridge's Translation Workspace™ with the TDA language data exchange portal. With this joint innovation, Lionbridge will provide an industry-scale, real-time live asset management system for automated, on-demand reuse of previously translated material contained within TDA's repository of industry-shared language data. As a result, TDA members can, for the first time, seamlessly maximize the value of pooled language assets and make them operationally accessible as part of their translation process.

Added 29 October 2010

Internet Governance Forum sets up Dynamic Coalition

The Internet Governance Forum (IGF) now has a Dynamic Coalition for a Global Localization Platform: Localization4all. For more info and a call for members see:

Added 20 July 2010

Google finds perks in its Wikipedia translations

Google's translation toolkit allows users to upload Wikipedia pages specifically for translation, thereby enhancing Google's own translation technology. Added 19 July 2010

IBM to partner LISA

IBM is setting up a partnership with LISA (Localization Industry Standards Association) to create an open source workbench for professional translators. Added 2 July 2010

Top Languages by Gross Domestic Product

Into which languages should your company translate first? According to this article, it depends on what your priorities are. Added 27 April 2010

Real-time voice translation coming to mobile

Instant speech translation, a long-time dream of science-fiction writers, is already feasible in certain situations, vendors said at the Mobile Voice Conference in San Francisco. Novauris demonstrated software running on a mobile phone that can instantly translate commonly used phrases, and another company, Fluential, discussed a server-based system that has been used for real-time interpretation in a hospital. Though neither is commercially available yet, both companies said they are technically ready to go. Added 27 April 2010

Localised translation available for Adobe

Lingotek, the leader in collaborative translation technology, announced that Adobe Systems Incorporated has chosen Lingotek's Collaborative Translation Platform (TM) to simplify community and "crowdsourcing" translation projects worldwide. Added 27 April 2010

Real-time voice-to-voice translation on your mobile phone

After Google announced at the beginning of this week that it would be creating the first mobile phone to provide almost instant translation, another company called Sakhr revealed that it has been pioneering and developing "the world's richest knowledge base for Arabic natural language processing". While Google is confident that the technology should provide a reasonable speech-to-speech translation result within the next few years, Sakhr is already co-operating with organisations such as the US Departments of Defense, Homeland Security and Justice, which are putting Sakhr's translation service to the test on Blackberry and iPhone platforms.

Sakhr is a partner in the EU-funded MEDAR project. Added on 10 February 2010

Asia Online

Asia Online has published a study on the Impact of Data Consolidation and Sharing for Statistical Machine Translation. The study was conducted in cooperation with the Translation Automation User Society (TAUS) and highlights the need for shared, clean and normalised data when building statistical machine translation engines. According to Asia Online, combining clean and mostly normalised data produces better quality SMT results. While data sharing is becoming an increasingly popular idea among companies and organisations that wish to apply SMT solutions but do not own sufficient volumes of parallel data on their own, it is important to consider the difference that cleaning and normalising such shared data can make to the SMT results. The study can be obtained from Asia Online's website:

The Size of the Language Industry in the EU

DG Translation commissioned The Language Technology Centre Ltd to carry out a study to analyse the size of the language industry in the EU, and the results of the study make interesting reading. The general conclusion is that the language industry is feeling the effects of the crisis to a much lesser extent than other sectors. Its turnover is estimated at approximately €8.4 billion and is expected to continue to grow by around 10% per year for the next few years, one of the highest growth rates of all industries in Europe.

The results of the study were presented at a special conference held on 27th November in Brussels. Speakers included the Director General of DG Translation, K.J. Lönnroth, Dr Adriane Rinsche of the Language Technology Centre and Nataly Kelly of the Common Sense Advisory. You can download the full study and watch excerpts of the conference by following the link below:

Related articles

Quite a few associations and organisations have published articles on their websites concerning the study, the first study to analyse the language industry in the EU. This highlights the importance that the language industry plays in Europe.

"Language industry unscathed by economic crisis", The Trademarks and Designs Registration Office of the EU, 26 November 2009

