Digital libraries and technology-enhanced learning
IMPACT will push innovation in OCR and language technologies for historical document processing and retrieval, and build digitisation capacity in Europe.
Text that is not digital is virtually invisible. Today's readers search the internet for electronically accessible texts rather than visit the reading room of a library. Born-digital and digitised contemporary materials contain the richness that allows tools such as text mining and the semantic web to offer superior accessibility, but the story is very different for historic documents. A vital part of the European heritage, encompassing more than four centuries of historic books and bound periodicals, is becoming less and less visible to the public at large.
With the i2010 vision of a European Digital Library, the EU has launched an ambitious plan for large-scale digitisation projects that transform Europe's printed heritage into digitally available resources. However, a lack of institutional knowledge and expertise slows the pace at which this vision can be realised. The state of the art in OCR performance and machine understanding of the original document is inadequate, especially for historically important material with archaic fonts and spellings, newspapers with complex layouts, bound volumes, microfilm or typescript.
The IMPACT project will remove many of these barriers. The project will push innovation in OCR technology and language technology for historical document processing and retrieval, and share expertise to build capacity in digitisation across Europe. During the project, a Centre of Competence will be set up in order to provide a central service entry point for all libraries, archives and museums involved in the digitisation of textual material.
The consortium brings together twenty-six national and regional libraries, research institutions and commercial suppliers. They will share their know-how and best practices, develop innovative tools to enhance the capabilities of OCR engines and the accessibility of digitised text, and lay the foundations for the mass-digitisation programmes that will take place over the next decade.
Field of science
- /natural sciences/computer and information sciences/internet/semantic web
Funding Scheme
CP - Collaborative project (generic)
NW1 2DB London