BIg Speech data analytics for cONtact centres

Deliverables

bigBison

The final prototype: an open environment that can be used in a stand-alone configuration or integrated with third-party infrastructure (result of all WP6 Tasks). The system will demonstrate the full capabilities of the technology. All speech technologies will come in noise-robust and optimized versions, transcription and keyword spotting will be available in all languages covering the needs of the end users in the consortium (see also Table 3), and methodologies for rapid and cost-effective development of new ones (leveraging customer data) will be provided. The system will be fully integrated with one large CC hardware and software infrastructure, and the generation of real business outputs will be demonstrated on real data.

smallBison

Initial prototype demonstrating the BISON technologies: a software system containing a full range of speech mining technologies (speech recognition, speaker verification, language identification and voice activity detection) in 9 languages (see Table 3) and a simple presentation of results. Although it works on an off-line data set and has a rudimentary UI, it will be deployed with the CCs in the project to gather initial user feedback. It is based on intermediate results of T6.2 through T6.6.

Legal, ethical and societal issues of BISON - The BISON ethical and societal code

Starting from the outcome of the previous deliverables (D8.2, D8.3 and D8.4) and supported by feedback from project partners during the development and deployment of BISON, D8.5 will set the rules and procedures for BISON with regard to ethics. It will be addressed both to BISON partners and to CCs, while a dedicated schematic section will be addressed to the wider public for awareness building and information. The deliverable will also provide ethical approvals for the planned collection and analysis of personal data, if updates or new approvals are needed compared to those submitted in M3.

Optimizing speech data mining for CC operation

A progress report on advancing speech data mining for the dynamic CC environment. It will include notes on scalability and real-time operation (T4.3), fast bootstrapping of recognizers for new languages (T4.4), and component evaluations (T4.6).

Indexing and database access to big speech data

Software for fast database access to speech and mined data.

Initial speech mining technologies

A set of software consolidating existing or slightly adapted speech data miners to give the project a fast start. It is mainly based on the results of T4.1 and T4.2 and includes the results of the component evaluation (T4.6).

Final set of speech technologies adapted for Contact Centers

Software modules and associated report describing the final version of CC-adapted speech mining technologies, including innovation during BISON lifetime. Includes the results of T4.5 and all preceding Tasks.

Public web-page

The main public contact point for the project. It will be complemented by LinkedIn and Facebook pages.

Publications

Three ways to adapt a CTS recognizer to unseen reverberated speech in BUT system for the ASpIRE challenge

Author(s): KARAFIÁT Martin, GRÉZL František, BURGET Lukáš, SZŐKE Igor and ČERNOCKÝ Jan
Published in: Proceedings of Interspeech, 2015, Page(s) 2454-2458, ISSN 1990-9772

Voiceprint transformation for migration between automatic speaker identification systems

Author(s): GLEMBEK Ondřej, MATĚJKA Pavel, BURGET Lukáš, SCHWARZ Petr, PEŠÁN Jan and PLCHOT Oldřich
Published in: Abstract book of the 7th European Academy of Forensic Science Conference, 2015

Effect of gender and call duration on customer satisfaction in call center big data

Author(s): Llimona, Quim / Luque, Jordi / Anguera, Xavier / Hidalgo, Zoraida / Park, Souneil / Oliver, Nuria
Published in: Proc. INTERSPEECH 2015, 2015, Page(s) 1825-1829, ISSN 1990-9772

Using voice quality measurements with prosodic and spectral features for speaker diarization

Author(s): Woubie, Abraham / Luque, Jordi / Hernando, Javier
Published in: Proc. Interspeech 2015, 2015, Page(s) 3100-3104, ISSN 1990-9772

Residual memory networks: Feed-forward approach to learn long-term temporal dependencies

Author(s): Murali Karthick Baskar, Martin Karafiat, Lukas Burget, Karel Vesely, Frantisek Grezl, Jan Cernocky
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, Page(s) 4810-4814
DOI: 10.1109/ICASSP.2017.7953070

Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks

Author(s): Karel Beneš, Murali Karthick Baskar, Lukáš Burget
Published in: Interspeech 2017, 2017, Page(s) 284-288
DOI: 10.21437/Interspeech.2017-1442

2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation

Author(s): Martin Karafiát, Murali Karthick Baskar, Pavel Matějka, Karel Veselý, František Grézl, Lukáš Burget, Jan Černocký
Published in: Interspeech 2017, 2017, Page(s) 719-723
DOI: 10.21437/Interspeech.2017-1775

Analysis of Score Normalization in Multilingual Speaker Recognition

Author(s): Pavel Matějka, Ondřej Novotný, Oldřich Plchot, Lukáš Burget, Mireia Diez Sánchez, Jan Černocký
Published in: Interspeech 2017, 2017, Page(s) 1567-1571
DOI: 10.21437/Interspeech.2017-803

Bayesian phonotactic Language Model for Acoustic Unit Discovery

Author(s): Lucas Ondel, Lukas Burget, Jan Cernocky, Santosh Kesiraju
Published in: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017, Page(s) 5750-5754
DOI: 10.1109/ICASSP.2017.7953258

Analysis and Description of ABC Submission to NIST SRE 2016

Author(s): Oldřich Plchot, Pavel Matějka, Anna Silnova, Ondřej Novotný, Mireia Diez Sánchez, Johan Rohdin, Ondřej Glembek, Niko Brümmer, Albert Swart, Jesús Jorrín-Prieto, Paola García, Luis Buera, Patrick Kenny, Jahangir Alam, Gautam Bhattacharya
Published in: Interspeech 2017, 2017, Page(s) 1348-1352
DOI: 10.21437/Interspeech.2017-1498

Alternative Approaches to Neural Network Based Speaker Verification

Author(s): Anna Silnova, Lukáš Burget, Jan Černocký
Published in: Interspeech 2017, 2017, Page(s) 1572-1575
DOI: 10.21437/Interspeech.2017-1062

MGB-3 BUT system: Low-resource ASR on Egyptian YouTube data

Author(s): Karel Vesely, Baskar Karthick Murali, Mireia Diez, Karel Benes
Published in: 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2017, Page(s) 368-373
DOI: 10.1109/ASRU.2017.8268959

Semi-Supervised DNN Training with Word Selection for ASR

Author(s): Karel Veselý, Lukáš Burget, Jan Černocký
Published in: Interspeech 2017, 2017, Page(s) 3687-3691
DOI: 10.21437/Interspeech.2017-1385

ABC NIST SRE 2016 SYSTEM DESCRIPTION

Author(s): BRUMMER Niko, SWART Albert du Preez, PRIETO Jesús J., GARCIA Perera Leibny Paola, MATĚJKA Pavel, PLCHOT Oldřich, DIEZ Sánchez Mireia, SILNOVA Anna, JIANG Xiaowei, NOVOTNÝ Ondřej, ROHDIN Johan A., GLEMBEK Ondřej, GRÉZL František, BURGET Lukáš, ONDEL Lucas, PEŠÁN Jan, ČERNOCKÝ Jan, KENNY Patrick, ALAM Jahangir, BHATTACHARYA Gautam and ZEINALI Hossein et al.
Published in: Proceedings of the NIST SRE Workshop, 2016

Sequence Summarizing Neural Networks for Spoken Language Recognition

Author(s): Jan Pešán, Lukáš Burget, Jan Černocký
Published in: Interspeech 2016, 2016, Page(s) 3285-3288
DOI: 10.21437/Interspeech.2016-764

Analysis of DNN approaches to speaker identification

Author(s): Pavel Matejka, Ondrej Glembek, Ondrej Novotny, Oldrich Plchot, Frantisek Grezl, Lukas Burget, Jan Honza Cernocky
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, Page(s) 5100-5104
DOI: 10.1109/ICASSP.2016.7472649

Improving i-Vector and PLDA Based Speaker Clustering with Long-Term Features

Author(s): Abraham Woubie, Jordi Luque, Javier Hernando
Published in: Interspeech 2016, 2016, Page(s) 372-376
DOI: 10.21437/Interspeech.2016-339

Audio enhancing with DNN autoencoder for speaker recognition

Author(s): Oldrich Plchot, Lukas Burget, Hagai Aronowitz, Pavel Matejka
Published in: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, Page(s) 5090-5094
DOI: 10.1109/ICASSP.2016.7472647

Analysis of the DNN-Based SRE Systems in Multi-language Conditions

Author(s): NOVOTNÝ Ondřej, MATĚJKA Pavel, GLEMBEK Ondřej, PLCHOT Oldřich, GRÉZL František, BURGET Lukáš and ČERNOCKÝ Jan
Published in: Proceedings of the 2016 IEEE Workshop on Spoken Language Technology (SLT 2016), 2016, Page(s) 199-204

Short- and Long-Term Speech Features for Hybrid HMM-i-Vector based Speaker Diarization System

Author(s): Abraham Woubie Zewoudie, Jordi Luque, Javier Hernando
Published in: Odyssey 2016, 2016, Page(s) 400-406
DOI: 10.21437/Odyssey.2016-58

HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification

Author(s): Hossein Zeinali, Hossein Sameti, Lukas Burget
Published in: IEEE/ACM Transactions on Audio, Speech, and Language Processing, Issue 25/7, 2017, Page(s) 1421-1435, ISSN 2329-9290
DOI: 10.1109/TASLP.2017.2694708

Text-dependent speaker verification based on i-vectors, Neural Networks and Hidden Markov Models

Author(s): Hossein Zeinali, Hossein Sameti, Lukáš Burget, Jan “Honza” Černocký
Published in: Computer Speech & Language, Issue 46, 2017, Page(s) 53-71, ISSN 0885-2308
DOI: 10.1016/j.csl.2017.04.005

Variational Inference for Acoustic Unit Discovery

Author(s): Lucas Ondel, Lukáš Burget, Jan Černocký
Published in: Procedia Computer Science, Issue 81, 2016, Page(s) 80-86, ISSN 1877-0509
DOI: 10.1016/j.procs.2016.04.033

Study of Large Data Resources for Multilingual Training and System Porting

Author(s): František Grézl, Ekaterina Egorova, Martin Karafiát
Published in: Procedia Computer Science, Issue 81, 2016, Page(s) 15-22, ISSN 1877-0509
DOI: 10.1016/j.procs.2016.04.024

Bottle-Neck Feature Extraction Structures for Multilingual Training and Porting

Author(s): František Grézl, Martin Karafiát
Published in: Procedia Computer Science, Issue 81, 2016, Page(s) 144-151, ISSN 1877-0509
DOI: 10.1016/j.procs.2016.04.042

Semi-Supervised Training of Language Model on Spanish Conversational Telephone Speech Data

Author(s): Ekaterina Egorova, Jordi Luque Serrano
Published in: Procedia Computer Science, Issue 81, 2016, Page(s) 114-120, ISSN 1877-0509
DOI: 10.1016/j.procs.2016.04.038

Automatic Speech Feature Learning for Continuous Prediction of Customer Satisfaction in Contact Center Phone Calls

Author(s): Carlos Segura, Daniel Balcells, Martí Umbert, Javier Arias, Jordi Luque
Published in: Advances in Speech and Language Technologies for Iberian Languages, 2016, Page(s) 255-265
DOI: 10.1007/978-3-319-49169-1_25

Privacy Through Anonymisation in Large-Scale Socio-Technical Systems: Multi-lingual Contact Centres Across the EU

Author(s): Claudia Cevenini, Enrico Denti, Andrea Omicini, Italo Cerno
Published in: Internet Science, 2016, Page(s) 291-305
DOI: 10.1007/978-3-319-45982-0_25