Innovative Techniques for Recognition and Processing of Documents

Objective

The objective of INTREPID is to develop new techniques for recognising and processing documents, demonstrate them in a development environment, and integrate them into an advanced application for the automatic classification of office documents. The INTREPID project is linked to ROCKI (project 5376). The planned recognition system must cope with a mixture of texts, line graphics, headings and grey-scale images, and with a variety of character sizes, styles and print qualities. Advanced distributed computer hardware will be used so that the increased requirements of new recognition algorithms and strategies can be satisfied.

INTREPID is aiming to:

- Develop and implement new advanced preprocessing and character classification strategies, combined with existing approaches, in order to process poor-quality documents more successfully.
- Improve reading results through advanced post-processing functions that incorporate document layout analysis and linguistically based approaches. The results of the ROCKI project on decomposing documents into different regions of interest will be taken into account.
- Employ algorithmic procedures and recognition strategies that are particularly effective when supported by an appropriate hardware/software architecture. To show this, suitable algorithms will be chosen, implemented, tested and modified on a distributed parallel hardware architecture.
- Demonstrate the results in suitable development environments (PC, workstation or dedicated hardware) and in an application specifically developed for the automatic classification of office documents.

The main workpackages can be grouped into four categories:

From Preprocessing to Postprocessing

- working out strategies, procedures and algorithms in the field of preprocessing and classification, suitable for supporting the recognition of poor-quality office documents
- developing structural algorithms for text recognition and line graphic analysis
- analysing the format and layout of office documents.

Linguistic Contextual Postprocessing

- investigating basic algorithms for lexical, grammatical and semantic analysers
- applying them to a number of European languages (English, Italian, Spanish).
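
As an illustration of what lexical postprocessing can look like, the sketch below corrects misrecognised words by matching them against a word list. It is not taken from the INTREPID project: the lexicon, the function names `correct_word` and `correct_line`, and the similarity cutoff are all assumptions chosen for the example, and the matching relies on Python's standard `difflib` module rather than any project algorithm.

```python
import difflib

# Hypothetical toy lexicon; a real system would load a full word list
# for each supported language.
LEXICON = ["document", "recognition", "office", "classification", "invoice", "letter"]


def correct_word(word, lexicon=LEXICON, cutoff=0.75):
    """Return the closest lexicon entry, or the word unchanged if none is close.

    Matched words are returned in lower case, as stored in the lexicon.
    The cutoff of 0.75 is an assumed value, not a tuned one.
    """
    lowered = word.lower()
    if lowered in lexicon:
        return lowered
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(lowered, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def correct_line(line):
    """Apply word-level correction to a whitespace-separated line of text."""
    return " ".join(correct_word(w) for w in line.split())


if __name__ == "__main__":
    # Typical confusions: "0" read for "o", "rn" read for "m".
    print(correct_line("0ffice docurnent"))
```

A purely lexical step like this cannot choose between two valid words, which is where the grammatical and semantic analysers named above would come in.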

Hardware and Software Architecture Definition and Prototype Implementation

- defining a distributed parallel computer architecture, based on state-of-the-art technology, that is best suited to the recognition procedures, and implementing it on an appropriate prototype hardware platform.

Specific Application

- automatic classification of office documents.

Fields of science (EuroSciVoc)

CORDIS classifies projects with EuroSciVoc, a multilingual taxonomy of fields of science, through a semi-automatic process based on NLP techniques. See: The European Science Vocabulary.

Programme(s)

Multi-annual funding programmes that define the EU’s priorities for research and innovation.

FP2-ESPRIT 2 - European strategic programme (EEC) for research and development in information technologies (ESPRIT), 1987-1992

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposal

Procedure for inviting applicants to submit project proposals, with the aim of receiving EU funding.

Data not available

Funding Scheme

Funding scheme (or “Type of Action”) inside a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; specific evaluation criteria to qualify for funding; and the use of simplified forms of costs like lump sums.

Data not available

Coordinator

AEG Olympia AG

EU contribution

No data

Address

Bücklerstraße 1-5
78467 Konstanz
Germany

Total cost

No data

Participants (7)

CIENCIA Y TECNOLOGIA APLICADA

Spain

EU contribution

No data

Address

ROGER DE LLURIA, 50
08009 BARCELONA

Total cost

No data

EWH KOBLENZ

Germany

EU contribution

No data

Address

RHEINAU 3-4
56075 KOBLENZ

Total cost

No data

Ingegneria C Olivetti and Co SpA

Italy

EU contribution

No data

Address

Via G Jervis
10015 Ivrea

Total cost

No data

Nottingham Trent University

United Kingdom

EU contribution

No data

Address

Burton Street
NG1 4BU Nottingham

Total cost

No data

Pacer Systems

United Kingdom

EU contribution

No data

Address

6 Robin Hood Industrial Estate
NG3 1GE Nottingham

Total cost

No data

UNIVERSITA DEGLI STUDI DI NAPOLI FEDERICO II

Italy

EU contribution

No data

Address

Via Claudio 21
80125 NAPOLI

Total cost

No data

Università degli Studi di Bari

Italy

EU contribution

No data

Address

Via Garruba 6/B
70121 Bari

Total cost

No data
