Meta-data extraction techniques for automatic indexing.

Objective

An important research activity within Oce Group Research is the development of document analysis and processing algorithms. These algorithms can be used to develop products and services that manage very large flows of documents automatically. The research on meta-data extraction techniques for automatic indexing is divided into the following stages:
1. Preprocessing. In this stage scanning artifacts are removed and the document image is positioned upright.
2. Layout analysis. In this stage characters, tables and figures are recognized. Characters are recognized using OCR techniques.
3. Genre classification. Information from the earlier stages is used to classify the document as a book, report, business letter, etc.
4. Logical analysis. In this stage the functional meaning of the individual text blocks is determined, as well as the logical structure and reading order of the document.

The goal of this research is to automate the classification, archiving and retrieval of large collections of documents. The research must lead to more knowledge of the structure and meta-data of various kinds of documents. At a later stage, new software products in the document management field will be developed on the basis of this research.

The fellows will be part of the research group and will participate in creating new algorithms and strategies and in validating them. Building new contacts and extending existing ones with other research groups in the field is also part of the training.
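
The four stages listed above can be read as a sequential pipeline in which each stage enriches a shared document record. The following Python sketch only illustrates that structure. The `Document` class, the stage functions and the rules inside them are hypothetical placeholders introduced here, not the algorithms developed in the project, and plain text lines stand in for a scanned page image.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical record handed from one stage to the next."""
    lines: List[str]  # stand-in for the scanned page content
    genre: str = "unknown"
    roles: Dict[int, str] = field(default_factory=dict)  # block index -> role


def preprocess(doc: Document) -> Document:
    # Stand-in for artifact removal and upright positioning:
    # trim whitespace and drop empty lines.
    doc.lines = [ln.strip() for ln in doc.lines if ln.strip()]
    return doc


def analyse_layout(doc: Document) -> Document:
    # Stand-in for OCR and table/figure detection:
    # in this toy example the text is already available.
    return doc


def classify_genre(doc: Document) -> Document:
    # Toy rule, not a real classifier: a salutation suggests a business letter.
    if any(ln.lower().startswith("dear ") for ln in doc.lines):
        doc.genre = "business letter"
    return doc


def analyse_logic(doc: Document) -> Document:
    # Toy rule: assign a functional role to each block in reading order.
    for index, ln in enumerate(doc.lines):
        is_salutation = ln.lower().startswith("dear ")
        doc.roles[index] = "salutation" if is_salutation else "body"
    return doc


# Stages run in the order given in the project description.
STAGES: List[Callable[[Document], Document]] = [
    preprocess,
    analyse_layout,
    classify_genre,
    analyse_logic,
]


def run_pipeline(doc: Document) -> Document:
    for stage in STAGES:
        doc = stage(doc)
    return doc


result = run_pipeline(
    Document(lines=["  Dear Ms Jansen,  ", "", "Please find enclosed the report."])
)
print(result.genre, result.roles)
```

Keeping the stages as separate functions with one shared record means that a single stage, for example genre classification, could be replaced or validated on its own without touching the others.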

Topic(s)

Data not available

Call for proposal

Data not available

Coordinator

OCE TECHNOLOGIES B.V.
EU contribution
Data not available
Address
St. Urbanusweg 43
5900 MA VENLO
Netherlands

Total cost
Data not available