State-of-the-art techniques for paper document scanning and optical text recognition will be investigated to determine their feasibility for retrieving information accumulated in personal records. As an example we choose the files of prisoners of former Nazi concentration camps. Modified scanning equipment based on optical filtering, including Near Infrared (NIR), will be tested, and experimental Digital Document Workbench software will be developed to produce highly interactive, editable and linkable electronic documents suitable for a future Web-based virtual memorial. As a result, advanced digital imaging technologies will be introduced into routine archive operations for digital preservation, better content recognition, and easier access to and retrieval of information in public files.
The proposal MEMORIAL has the following scientific, technological and application objectives:
1. Improve image processing and pattern recognition methods and tools to enable direct extraction of historic information from paper documents of former Nazi concentration camps;
2. Develop an electronic document format suitable for storage, search and retrieval of such documents in a future virtual memorial;
3. Investigate legal, social, ethical and political conditions for creating digital libraries of genocide information;
4. Initiate a pan-European infrastructure of digital libraries providing virtual memorial services for individual users and research organisations.

DESCRIPTION OF WORK
MEMORIAL will develop a new technology for the digitisation of paper documents. It will integrate modified scanning equipment based on optical filtering, including Near Infrared, and an improved set of image analysis software tools into a novel Digital Document Workbench (DDW) facility. DDW will be tested on meaningful samples of real documents selected from partner archives by a team of technical and historical experts. The documents will belong to an important and rich class of personal records of prisoners of former Nazi concentration camps, common to many memorial sites across Europe and to other archives world-wide. Selected documents will be scanned and the digital images processed to the point where portions of text can be recognised and converted to interactive electronic form. DDW technology will be validated in a real user context provided by a specially developed public demo Web site. This site will be built using common Web technologies to test local user access modes to electronic documents produced by DDW, with special emphasis on state-of-the-art human-computer interaction, digital library storage, and querying and retrieval by remote clients. The research efforts of the consortium will concentrate on scanning techniques and equipment, separation of graphical image regions from textual ones, optical and intelligent character recognition of textual images, cleaning of background noise and overstruck machine-typed characters, human editor interaction to aid the separation and recognition process, and evaluation criteria to monitor document quality at all stages of processing. The project workplan distinguishes technological and application workpackages. Technological workpackages aim at significant advances beyond the current state of the art in the digitisation of paper-based public record documents, while application implementation workpackages will make sure that the final DDW product meets the market expectations of the commercial consortium partners.
Funding Scheme: CSC - Cost-sharing contracts
61572 Tel Aviv
37850 M.p. Menashe
L69 3BX Liverpool