Telematics for Libraries - Project
FACIT
Updated: 15 MAR 99
| Project Number and Title |
| 1044 - Fast Automatic Conversion with Integrated Tools: OCR/ICR in retroconversion of catalogues - automatic error detection/correction and formatting |
| Programme/Action line | Call Topic(s) | Start | End | Project Duration in Months |
| FP 3/ I | Theme 4 | January 1993 | February 1996 | 36 |
| Keywords |
| OCR/ICR; retrospective conversion; library catalogue cards; automatic formatting; character recognition |
Additional information is available from the FACIT web site (http://www.komm.ruc.dk/FACIT/).
- Theme
- Use of OCR/ICR for RECON (Theme: 4)
- Project description
- The project aimed to produce a working prototype for automatic error detection and correction and automatic formatting of catalogue cards, based on and matching high-volume, fast throughput from the scanning process.
- Converting library catalogue cards to machine-readable form raises issues relating to: the quantity of cards; their physical properties, e.g. size and variable print quality; and variations in format and content. This project built on previous work which developed a customised scanner, with a bulk card feed, for fast conversion of catalogue card data to ASCII. It extended the automatic processing to include correction of typical, analysed errors and automatic formatting of the scanned data to UNIMARC. Using input from libraries in Denmark, Italy and Greece, a prototype workstation was developed and tested. At the same time, the project was the means for establishing centres of relevant experience and knowledge in these countries.
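The two processing steps described above (correction of typical OCR errors, then formatting to UNIMARC) can be illustrated with a minimal sketch. It is not the FACIT prototype's code: the confusion table, the three-line card layout, the sample card and the function names `fix_year` and `format_card` are all assumptions made for illustration, and the output uses simplified UNIMARC-style tags (200 title, 210 publication, 700 personal name) without indicators.

```python
import re

# Assumed examples of typical letter-for-digit OCR confusions (illustrative only).
DIGIT_FIXES = str.maketrans({"l": "1", "I": "1", "O": "0", "o": "0", "S": "5"})


def fix_year(token: str) -> str:
    """Repair letter-for-digit confusions in a token expected to be a year.

    The token is returned unchanged if the repaired form is not four digits.
    """
    candidate = token.translate(DIGIT_FIXES)
    return candidate if re.fullmatch(r"\d{4}", candidate) else token


def format_card(lines: list[str]) -> list[str]:
    """Map a hypothetical three-line card (author / title / imprint)
    to simplified UNIMARC-style field strings."""
    author, title, imprint = (line.strip() for line in lines)
    surname, forenames = (part.strip() for part in author.split(",", 1))
    place, publisher, year = (part.strip() for part in imprint.split(","))
    return [
        f"200 $a{title}",
        f"210 $a{place}$c{publisher}$d{fix_year(year)}",
        f"700 $a{surname}$b{forenames}",
    ]


# Made-up card whose year was misread by OCR as "l9O3".
card = ["Andersen, H. C.", "Eventyr", "Copenhagen, Gyldendal, l9O3"]
for field in format_card(card):
    print(field)
```

Real catalogue cards vary far more in layout and quality than this fixed three-line example, which is why the project analysed cards and typical errors formally before specifying the correction and formatting rules.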
- Technical approach
- The project was organised into four phases, divided into workpackages:
- Analytical Phase
- OCR alternatives - technology and product assessment;
- Analysis of catalogue cards;
- Installation of equipment and training of technical staff;
- OCR scanning of sample cards and conversion into ASCII;
- Analysis of results and development of error detection and correction methods.
- Specification Phase
- Error detection/correction and formatting specification;
- User interface specification;
- Detailed prototype planning.
- Production Phase
- Production of prototype, with programming and testing.
- Evaluation Phase
- Establishment of evaluation methods for cost-effectiveness;
- Full scale testing of prototype;
- Evaluation and finalisation;
- Dissemination of results.
- Key issues
- The main technical issues underlying the project related to:
- Diversity of source cards in terms of quality and content and the extent to which this can be accommodated;
- Quality of output format that can be obtained;
- Character sets, both for recognition and representation;
- Speed and cost-effectiveness of the approach.
- Impact and results
- The project furthered the automation of libraries in Europe by investigating prototype tools for fast and cheap mass conversion of catalogue cards into a machine-readable format suitable for use in library online catalogues and circulation systems. It also boosted recon expertise in Italy and Greece.
- The key result is a series of methodological tools: methods for formal analysis of catalogue cards; methods for formal analysis of typical errors; methods for the assessment of the quality, speed and cost of recon using OCR/ICR. The prototype software, with accompanying manual, is available as public domain software.
- Deliverables
- In addition to the prototype, reports have been consolidated into a series of publicly available Technical Reports, namely:
- OCR for retroconversion of catalogue cards (nº1);
- Framework for analysis of cards (nº2);
- Error analysis and correction in retroconversion (nº3);
- FACIT prototype: manual and documentation (nº4);
- Retroconversion of older card catalogues using OCR and automatic formatting: project overview and Final Report (nº5).
- All project deliverables are available from the FACIT web site.
Coordinator
| Name of Institution/Organisation | Postal Code / City | Country |
| Biblioteksstyrelsen (formerly Statens Bibliotekstjeneste) | DK - 1051 COPENHAGEN - K | DK |
| Title, First name Name | Mrs Ulla Højsgaard | Address: | Nyhavn 31 E |
| Tel: | +45-33 73 33 73 | Fax: | +45-33 73 33 72 |
| E-mail 1: | ulh@bs.dk | E-mail 2: | |
Other Partners
| Name of Institution/Organisation | Country | Role |
| Statsbiblioteket Århus | DK | P |
| Biblioteca Nazionale Centrale di Firenze | IT | P |
| Biblioteca Nazionale Vittorio Emanuele III, Napoli | IT | P |
| Ethnike Bibliotheke tes Hellados | GR | P |
| Det Kongelige Bibliotek | DK | A |
| Synergi I/S | DK | S |
- Definition of pilot experiment implementation plans.