CORDIS - EU research results
Content archived on 2024-04-15

Advanced Algorithms and Architectures for Speech and Image Processing

Objective

The objective of SIP was to develop the algorithmic and architectural techniques required for recognising and understanding spoken and visual signals, and to demonstrate these techniques by means of suitable applications.
The work was planned in three parallel areas: speech analysis, image analysis and pattern recognition and understanding.
With respect to speech, the initial application target was to extend as far as possible current state-of-the-art techniques for speech recognition. The resulting system was to be tested using a vocabulary of the order of 1,000 words with constrained syntax and using continuous speech.
For image processing, the project attempted to go beyond treating the image merely as sampled data. Applications in medical imaging and industrial inspection were used to test the tools and to study architectural and implementation issues. At the higher level of processing, close commonality can be expected between techniques for speech and image processing. Subsequent work will study architectures suitable for the higher levels, which can interface with the lower level systems.
Algorithms and prototype equipment are available for recognition of continuous speaker-dependent speech and for understanding phrases within a restricted semantic domain, and also for image feature extraction and recognition. Applications are in fields such as medicine, robotics and telecommunications.
A prototype was made available on 29/03/89.
The operating environment is as follows:
Hardware: special hardware coupled with Symbolics

Progress on speech processing was made along two complementary lines: a statistical approach and a knowledge-based approach. Preliminary results were obtained from the statistical approach, based on a first implementation, using very large lexicons. For the knowledge-based approach, a methodology for representation of the lexical and acoustical knowledge was chosen. In addition, the architecture of the acoustical front-end was realised and the first digital signal processing boards tested. The lexical access and the verification based on a Hidden Markov Model were demonstrated on a VAX machine on a set of short sentences uttered by a single speaker in a noisy environment. Methods to incorporate syntactic and semantic information were studied to achieve understanding of uttered sentences. A small question-answering system running on a Symbolics machine was demonstrated. The system starts from the word lattice produced by the speech system, builds a representation of the query using syntax and semantics, and finally answers the query.
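As a rough illustration of what verification with a Hidden Markov Model can look like, the sketch below scores an observation sequence against candidate word models with the standard forward algorithm and keeps the best-scoring word. The project summary does not describe the actual model topology, acoustic alphabet or scoring rule, so the two-state models, the two-symbol alphabet and all function names here are assumptions made purely for illustration.

```python
def forward_likelihood(obs, start, trans, emit):
    """Return P(obs | model) for a discrete HMM using the forward algorithm."""
    n_states = len(start)
    # alpha[s] = probability of the observations seen so far, ending in state s
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][symbol]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical left-to-right models over a toy acoustic alphabet {0, 1}.
START = [1.0, 0.0]
TRANS = [[0.5, 0.5], [0.0, 1.0]]
MODEL_A = [[0.9, 0.1], [0.1, 0.9]]  # tends to emit 0s first, then 1s
MODEL_B = [[0.1, 0.9], [0.9, 0.1]]  # tends to emit 1s first, then 0s


def verify(obs, candidates):
    """Return the candidate word whose model best explains the observations."""
    return max(
        candidates,
        key=lambda word: forward_likelihood(obs, START, TRANS, candidates[word]),
    )
```

For example, `verify([0, 0, 1, 1], {"a": MODEL_A, "b": MODEL_B})` selects `"a"`, because that model assigns the sequence a much higher likelihood. A real system would work with log-probabilities and continuous acoustic features rather than this toy discrete alphabet.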
A coordinated set of algorithms and architectures for image recognition and understanding was developed and demonstrated. Layer approaches based on Single Instruction Multiple Data (SIMD) and Multiple Instruction Multiple Data (MIMD) machines were realised for image feature extraction. A heterogeneous approach was taken, linking a SIMD GAPP array for the low-level processing and an MIMD transputer-based machine or an array processor for the medium-level processing. The interfaces and the I/O of the data were developed and optimised. Estimates of performance were derived from a set of algorithms running on the different parts of the architecture. This was improved by setting up real benchmarks. Specific work was done to provide a coordinated set of algorithmic tools for digital angiography applications.
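The division of labour between the two layers can be sketched in ordinary software. In the hedged example below, the low-level step applies one identical operation to every pixel (the SIMD style), and the medium-level step runs two different programs concurrently on the result (the MIMD style). This is not the project's GAPP or transputer code: the thresholding operation, the two feature tasks and all names are hypothetical, and a thread pool merely stands in for independent processors.

```python
from concurrent.futures import ThreadPoolExecutor


def low_level_threshold(image, level):
    """SIMD-style step: the same instruction is applied to every pixel."""
    return [[1 if px >= level else 0 for px in row] for row in image]


def count_foreground(rows):
    """One medium-level task: count the foreground pixels."""
    return sum(sum(row) for row in rows)


def longest_run(rows):
    """A different medium-level task: longest horizontal run of foreground pixels."""
    best = 0
    for row in rows:
        run = 0
        for px in row:
            run = run + 1 if px else 0
            best = max(best, run)
    return best


def medium_level(binary):
    """MIMD-style step: independent workers run different programs on the data."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        area = pool.submit(count_foreground, binary)
        run = pool.submit(longest_run, binary)
        return {"area": area.result(), "longest_run": run.result()}
```

Calling `medium_level(low_level_threshold(image, 128))` on a small greyscale image returns a dictionary of simple features. The point of the split is that the per-pixel stage has uniform, data-parallel work, while the feature stage has irregular work that suits independently programmed processors.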
Implementation aspects of the physical architecture for high-level processing based on transputers fully interconnected through a switching network were analysed in detail; a switching element for non-local communication was designed outside the project, and the first building-block, comprising two processing elements and a hardware emulation of the interconnection network, is now available.
PIPES, the first prototype realisation of a Prolog transputer-based machine where the transputers are fully interconnected using a packet-switched network, was demonstrated. It will be implemented on the high-level architecture for speech and image understanding and applied to real-time tasks.
Exploitation
SIP has been the source of applications in sound, vision and robotics through the development of a coordinated set of algorithms and architectures for image recognition and understanding. It provides the foundation for applications in medicine, in industry and in other domains. Project results also support the development of intelligent workstations to support both graphic and image processing.
The successful combination of statistical techniques and knowledge-based techniques for speech recognition will result in a major breakthrough in the field. The complete real-time stand-alone system displaying spoken Italian which is now under development will be adapted for French and German.

Programme(s)

Multi-annual funding programmes that define the EU's priorities for research and innovation.

Topic(s)

Calls for proposals are divided into topics. A topic defines a specific subject or area for which applicants can submit proposals. The description of a topic comprises its specific scope and the expected impact of the funded project.

Data not available

Call for proposals

Procedure for inviting applicants to submit project proposals with the aim of obtaining EU funding.

Data not available

Funding scheme

Funding scheme (or "Type of action") within a programme with common features. It specifies: the scope of what is funded; the reimbursement rate; the specific evaluation criteria to qualify for funding; and the use of simplified forms of costs such as lump sums.

Data not available

Coordinator

Centro Studi e Laboratori Telecomunicazioni SpA
EU contribution
No data
Total cost

The total costs incurred by this organisation to participate in the project, including direct and indirect costs. This amount is a subset of the total project budget.

No data

Participants (4)
