This project aims to specify and experiment with a common toolkit for the high-level interpretation of office documents and service network maps.
The ROCKI project, which stands for Raster to Object Conversion aided by Knowledge based Image processing, has resulted in the development of a generic set of tools for interpreting complex information displayed in printed form. Using the ROCKI software package, information can be extracted from individual documents or maps and stored in computer files for subsequent processing, including data enhancement.
In designing the software package, a modular concept was selected in which the interpretation process is split into a number of discrete phases. After a primitive interpretation stage, in which elementary features such as characters, continuous and dotted lines, and circles are identified, more structure is introduced in the object interpretation stage, where the emphasis is on recognizing, for instance, words and paragraphs in the case of office documents, and houses or factories in the case of maps. In the third interpretation phase, basic features are grouped into more complex objects along with their associated semantics. Organizing the various identification procedures in this way, and using knowledge about the structure of documents and maps to guide the interpretation process, has produced a highly versatile software package that should have a wide field of application.
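The phased organization described above can be illustrated with a minimal sketch. It is not the ROCKI implementation: the `Element` type, the three phase functions, and the use of plain characters as stand-ins for features detected in a raster image are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Hypothetical node in the interpretation result (not a ROCKI type)."""
    kind: str                                   # e.g. "character", "word", "paragraph"
    text: str = ""
    parts: List["Element"] = field(default_factory=list)


def primitive_phase(tokens: List[str]) -> List[Element]:
    """Stand-in for primitive interpretation: one element per detected character."""
    return [Element("character", text=t) for t in tokens]


def object_phase(primitives: List[Element]) -> List[Element]:
    """Group consecutive characters into words; a space ends the current word."""
    words: List[Element] = []
    current: List[Element] = []
    for p in primitives:
        if p.text == " ":
            if current:
                words.append(Element("word", "".join(c.text for c in current), current))
                current = []
        else:
            current.append(p)
    if current:
        words.append(Element("word", "".join(c.text for c in current), current))
    return words


def complex_phase(objects: List[Element]) -> List[Element]:
    """Group all words into a single paragraph (a deliberately naive rule)."""
    if not objects:
        return []
    return [Element("paragraph", " ".join(o.text for o in objects), objects)]


def interpret(tokens: List[str]) -> List[Element]:
    """Run the three phases in order; each phase consumes the previous output."""
    return complex_phase(object_phase(primitive_phase(tokens)))


result = interpret(list("net map"))
print(result[0].kind, [w.text for w in result[0].parts])
```

Because each phase only consumes the output of the previous one, a map-specific grouping rule (for example, assembling lines into houses) could replace `object_phase` without touching the other phases, which is the kind of reuse the modular concept is meant to allow.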
The main workpackages deal with the evaluation and further development of algorithms for image processing and analysis guided by knowledge of the document architecture, and the definition of a common information and interpretation model leading to the specification of a common architecture for document interpretation.
The project will proceed through the following steps:
- evaluate existing techniques and define requirements for further developments
- define an information and interpretation model common to the application fields of office documents and maps
- produce an initial design of the toolkit architecture
- design and experiment with advanced application-specific techniques for the transformation of images into document descriptions.