We propose a novel approach to speech recognition, using new tools from stochastic analysis (the theory of rough paths) and machine learning theory (diffusion maps).
The goals of this research are:
(i) to develop new algorithms for speech recognition,
(ii) to advance our understanding of the mathematical tools to be used for this purpose, and
(iii) through this research, to create the conditions for the smooth re-integration of the researcher into the European mathematical community.
We model the speech recognition process as a multi-scale dynamical system. The lowest scale consists of the acoustic signal and its delays driving a distribution on the set of phonemes, which in turn drives a distribution on the set of words and so on.
We are mainly interested in the lowest scale. According to the theory of rough paths, all the information in the signal should be contained in its first ⌊p⌋ iterated integrals, where p measures the "roughness" of the signal. The first problem is how to estimate p from a discrete sample of the signal. One way is to look at the rate of decay of the iterated integrals. Another is to treat the signal as a discrete signal and look for the exponent q at which the q-variation becomes "negligible". By passing to the iterated integrals, we embed the signal in a much bigger space. Note, though, that we are only interested in a particular response, namely the distribution on the phonemes, so we need to find those components of the embedding that carry this information.
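As an illustration of the embedding step, the following is a minimal sketch that builds a path from a signal and its delays and computes the first two levels of iterated integrals. It rests on assumptions not stated in the proposal: the samples are joined by straight lines, the expansion is truncated at level 2 (which would only suffice for p < 3), and the names `delay_embed` and `signature_level2` are hypothetical.

```python
import numpy as np

def delay_embed(signal, n_delays, lag=1):
    """Stack a 1-D signal with n_delays shifted copies into a multi-dimensional path."""
    signal = np.asarray(signal, dtype=float)
    n = len(signal) - n_delays * lag
    if n < 2:
        raise ValueError("signal too short for the requested delays")
    # Column k holds the signal shifted by k * lag samples.
    return np.stack([signal[k * lag : k * lag + n] for k in range(n_delays + 1)], axis=1)

def signature_level2(path):
    """Levels 1 and 2 of the iterated integrals of the piecewise-linear path.

    Assumption: the discrete samples are interpolated linearly.
    """
    path = np.asarray(path, dtype=float)
    d = path.shape[1]
    s1 = np.zeros(d)        # level 1: total increment
    s2 = np.zeros((d, d))   # level 2: integrals of x^i against dx^j
    for dx in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment dx.
        s2 += np.outer(s1, dx) + 0.5 * np.outer(dx, dx)
        s1 += dx
    return s1, s2

# Usage: an L-shaped path in the plane, first along x, then along y.
s1, s2 = signature_level2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
levy_area = 0.5 * (s2[0, 1] - s2[1, 0])
```

The symmetric part of the second level is determined by the increment (`s2 + s2.T` equals the outer product of `s1` with itself), so the genuinely new information at level 2 is the antisymmetric part, the Lévy area.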
To do this, we use a database of speech signals for which this response is known. Using a metric on the responses, we define a similarity kernel on the samples, which we use to construct the diffusion map. The resulting diffusion coordinates can be extended to all speech signals and used to define a distance compatible with the known responses. The above methodology can be generalized to any case where we need to find those characteristics of a rough signal that cause a particular type of response.
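The diffusion map construction on the labelled samples can be sketched as follows. This is an illustrative sketch under assumptions the proposal does not fix: a Gaussian kernel on the Euclidean distance between response vectors, a bandwidth `epsilon` chosen by hand, and a hypothetical function name `diffusion_map`. Extending the coordinates to signals without known responses is not covered here.

```python
import numpy as np

def diffusion_map(responses, epsilon, n_coords=2, t=1):
    """Diffusion coordinates of samples whose similarity is measured on their responses.

    Assumptions: Gaussian kernel, Euclidean metric on responses, bandwidth epsilon.
    """
    responses = np.asarray(responses, dtype=float)
    diff = responses[:, None, :] - responses[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    kernel = np.exp(-sq_dist / epsilon)

    # Row-normalising the kernel gives a Markov matrix; we diagonalise its
    # symmetric conjugate D^{-1/2} K D^{-1/2}, which has the same eigenvalues.
    degree = kernel.sum(axis=1)
    sym = kernel / np.sqrt(np.outer(degree, degree))
    eigvals, eigvecs = np.linalg.eigh(sym)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Right eigenvectors of the Markov matrix; the first one is constant
    # (eigenvalue 1) and carries no information, so it is skipped.
    psi = eigvecs / np.sqrt(degree)[:, None]
    coords = psi[:, 1 : n_coords + 1] * eigvals[1 : n_coords + 1] ** t
    return eigvals, coords

# Usage: four samples whose phoneme distributions form two groups.
responses = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
eigvals, coords = diffusion_map(responses, epsilon=0.5, n_coords=1)
```

In this toy example the sign of the first diffusion coordinate separates the two groups of responses, which is the sense in which the resulting distance is compatible with the known responses.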
Field of science
- /natural sciences/mathematics/applied mathematics/dynamical systems
- /humanities/languages and literature/linguistics/phonetics