Objective
An important research activity within Oce Group Research is the development of document analysis and processing algorithms. These algorithms can be used to develop products and services that automatically manage large flows of documents. The research on meta-data extraction techniques for automatic indexing is divided into the following stages:
1. Preprocessing. In this stage scanning artifacts are removed and the document image is positioned upright.
2. Layout analysis. In this stage characters, tables and figures are recognized. Characters are recognized using OCR techniques.
3. Genre classification. Information from earlier steps is used to classify the document as a book, report, business letter, etc.
4. Logical analysis. In this stage the functional meaning of the individual text blocks is determined, as well as the logical structure and reading order of the document.
The goal of this research is to automate the classification, archiving and retrieval of large collections of documents. The research must lead to more knowledge of the structure and meta-data of various kinds of documents. At a later stage, new software products in the document management field will be developed based on this research. The fellows will be part of the research group and will participate in creating new algorithms and strategies and in validating them. Building new contacts with other research groups in the field, and extending existing ones, is also part of the training.
Fields of science (EuroSciVoc)
CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques.
Topic(s)
Data not available
Call for proposal
Data not available
Funding Scheme
BUR - Bursaries, grants, fellowships
Coordinator
5900 MA VENLO
Netherlands