Robust SPEAKER DIarization systems using Bayesian inferenCE and deep learning methods

Objective

The proposed project deals with Speaker Diarization (SD), which is commonly defined as the task of answering the question “who spoke when?” in a speech recording. The first objective of the proposal is to optimize the Bayesian approach to SD, which has been shown to be promising for this task. Because Variational Bayes (VB) inference is very sensitive to initialization, we will develop new, fast ways of obtaining a good starting point. We will also explore alternative inference methods, such as collapsed VB or collapsed Gibbs sampling, and investigate alternative priors similar to those introduced for Bayesian speaker recognition models.

The second part of the proposal is motivated by the large performance gains that Deep Neural Networks (DNNs) have brought to other recognition tasks in recent years. In the context of SD, DNNs have been used in the computation of i-vectors, but their potential has not been explored for other stages of SD. We will study ways of integrating DNNs into the different stages of SD systems.

The objectives of the proposal will be achieved through theoretical work, implementation, and careful testing on real speech data. The outcomes of the project are intended not only for scientific publications but are also eagerly awaited by the European speech data mining industry (for example, Phonexia in the Czech Republic or Agnitio in Spain).

The project is proposed by an excellent female researcher, Dr. Mireia Diez, who completed her thesis in the GTTS group of the University of the Basque Country, one of the most important European labs dealing with speaker recognition and diarization. The proposed host is the Speech@FIT group of Brno University of Technology, which has a 20-year track record of top speech data mining research. The proposed research training, together with the combined skills of Dr. Diez and the host institution, has the potential to advance the state of the art in speaker diarization, provide the applicant with improved career opportunities, and benefit European industry.

Funding Scheme

MSCA-IF-EF-ST - Standard EF

Coordinator

VYSOKE UCENI TECHNICKE V BRNE
Net EU contribution
€ 142 720,80
Address
ANTONINSKA 548/1
601 90 Brno Stred
Czechia

Region
Česko Jihovýchod Jihomoravský kraj
Activity type
Higher or Secondary Education Establishments
Total cost
€ 142 720,80