Community Research and Development Information Service - CORDIS

FP7 TranScriptorium

tranScriptorium

Project reference: 600707
Funded under

FP7-ICT

From 2013-01-01 to 2015-12-31, closed project

Project details

Total cost:

EUR 3 005 570

EU contribution:

EUR 2 399 739

Coordinated in:

Spain

Call for proposal:

FP7-ICT-2011-9

Funding scheme:

CP - Collaborative project (generic)

tranScriptorium aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology. TranScriptorium will turn HTR into a mature technology by addressing the following objectives:

  • Enhancing HTR technology for efficient transcription
    Departing from state-of-the-art HTR approaches, tranScriptorium will capitalize on interactive-predictive techniques for effective and user-friendly computer-assisted transcription.
  • Bringing the HTR technology to users
    Expected users of the HTR technology belong mainly to two groups: a) individual researchers with experience in transcribing handwritten documents who are interested in transcribing specific documents; and b) volunteers who collaborate in large transcription projects.
  • Integrating the HTR results in public web portals
    The HTR technology will support the digitization of handwritten materials. The outcomes of the tranScriptorium tools will be attached to the published handwritten document images. This includes not only full, correct transcriptions, but also partially correct transcriptions and other kinds of automatically produced metadata that are useful for indexing and searching.

Objective

Huge amounts of handwritten historical documents are being published by online digital libraries worldwide. However, for these raw digital images to be really useful, they need to be annotated with informative content. The tranScriptorium project aims to develop innovative, efficient and cost-effective solutions for the indexing, search and full transcription of historical handwritten document images, using modern, holistic Handwritten Text Recognition (HTR) technology.

Currently available text image recognition technologies are not suitable for typical handwritten text images of historical documents. Traditional Optical Character Recognition (OCR) is simply not usable, since characters cannot be isolated automatically in these images. Therefore, holistic, segmentation-free HTR techniques, often borrowed from the field of Automatic Speech Recognition, are needed. Yet state-of-the-art holistic HTR approaches still lack the required accuracy, mainly because of the typically poor quality, degradation and writing style variability of historical document images.

To cope with this lack of recognition accuracy, three actions are planned in tranScriptorium: i) improve basic image preprocessing and holistic HTR techniques; ii) develop novel indexing and keyword searching approaches, mainly based on byproducts of holistic HTR decoding and word spotting techniques; and iii) capitalize on new, user-friendly interactive-predictive HTR approaches for computer-assisted operation, which minimize the user intervention needed to achieve full, high-quality transcripts. HTR tools based on tranScriptorium techniques will be incorporated into HTR web platforms that will be accessible to users through two different means: i) a content provider portal that provides access to handwritten historical documents for casual, individual researchers; and ii) a specialized HTR web portal for structured crowd-sourcing transcription projects.

Coordinator

UNIVERSITAT POLITECNICA DE VALENCIA
Spain

EU contribution: EUR 513 836


CAMINO DE VERA S/N
46022 VALENCIA
Spain
Administrative contact: José Antonio Pérez García
Tel.: +34 96 387 7409
Fax: +34 96 387 7949

Participants

UNIVERSITAET INNSBRUCK
Austria

EU contribution: EUR 369 700


INNRAIN
6020 INNSBRUCK
Austria
Administrative contact: Günter Mühlberger
Tel.: +43 512 5078454

NATIONAL CENTER FOR SCIENTIFIC RESEARCH "DEMOKRITOS"
Greece

EU contribution: EUR 513 812


Patriarchou Gregoriou Str.
15310 AGHIA PARASKEVI
Greece
Administrative contact: Evripides Papadopoulos
Tel.: +30 210 6503037
Fax: +30 210 6522623

INSTITUUT VOOR NEDERLANDSE LEXICOLOGIE
Netherlands

EU contribution: EUR 493 040


Matthias de Vrieshof 2-3, 2311 BZ
2300RA LEIDEN
Netherlands
Administrative contact: Katrien Depuydt
Tel.: +31 715272479
Fax: +31 715272115

UNIVERSITY OF LONDON
United Kingdom

EU contribution: EUR 214 900


Malet Street, Senate House
WC1E 7HU LONDON
United Kingdom
Administrative contact: Richard Davis
Tel.: +44 20 7863 1350

UNIVERSITY COLLEGE LONDON
United Kingdom

EU contribution: EUR 294 451


Gower Street
WC1E 6BT LONDON
United Kingdom
Administrative contact: Greta Borg-Carbott
Tel.: +44 20 3108 3033

Record Number: 106843 / Last updated on: 2016-08-02