The project integrates work in three fields: automatic retrospective conversion of library catalogues; optical character recognition (OCR) and intelligent character recognition (ICR) technology; and natural language processing. The BIBLIOTECA toolbox will substantially decrease the cost, time and effort involved in creating and updating bibliographic databases by replacing manual analysis and keyboard entry with intelligent document reading, using scanning, OCR/ICR and artificial intelligence techniques. The benefits and results of this project include: keyword indexes from tables of contents and indexes in books; article databases from contents pages in serials; citation indexes from bibliographic references; investigation of the possibility of more advanced intelligent systems for indexing and classification; and automatic transformation of card files into standard formats.
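As a rough illustration of what transforming a card file into a structured record involves, the following minimal sketch parses the OCR text of one catalogue card into named fields. It is not the BIBLIOTECA toolbox or its method: the function name parse_card, the three-line card layout (author, title, imprint), the imprint pattern and the field names are all assumptions made for this example, and real cards vary far more than this.

```python
import re

# Assumed (hypothetical) imprint layout: "Place : Publisher, Year."
IMPRINT = re.compile(
    r"^(?P<place>[^:]+):\s*(?P<publisher>.+),\s*(?P<year>\d{4})\.?$"
)


def parse_card(ocr_text):
    """Turn the OCR text of one card into a dict of named fields.

    Assumes line 1 is the author, line 2 the title and line 3 the
    imprint. This layout is an assumption for illustration only.
    """
    # Drop blank lines and surrounding whitespace left by the OCR step.
    lines = [ln.strip() for ln in ocr_text.splitlines() if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected author, title and imprint lines")

    record = {
        "author": lines[0].rstrip("."),
        "title": lines[1].rstrip("."),
    }

    match = IMPRINT.match(lines[2])
    if match is None:
        raise ValueError("imprint line does not fit the assumed layout")
    # Add place, publisher and year, trimmed of stray spaces.
    record.update({k: v.strip() for k, v in match.groupdict().items()})
    return record


card = "Eco, Umberto.\nThe name of the rose.\nLondon : Secker & Warburg, 1983.\n"
record = parse_card(card)
```

A record in this form could then be mapped onto whichever standard bibliographic format a library uses; cards that do not fit the assumed layout raise an error here rather than being guessed at, so they can be routed to manual review.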
Deliverables in the public domain are: detailed and top-level work plans; technical reports (text selection criteria, draft framework, field structure, integrity criteria); an appraisal (strengths, weaknesses, costs, benefits, performance); testing, examples and results; and project reports (including the final report).
Additional information is available from the website http://www.ucm.es/info/VerbaLogica/biblio.htm.