Community Research and Development Information Service - CORDIS

BISON Report Summary

Project ID: 645323
Funded under: H2020-EU.2.1.1.4.

Periodic Reporting for period 2 - BISON (BIg Speech data analytics for cONtact centres)

Reporting period: 2016-01-01 to 2016-12-31

Summary of the context and overall objectives of the project

According to the latest European Contact Center Benchmark data, the European contact center (CC) industry comprises more than 35,500 contact centers with 3.8 million jobs in 30 countries (the median reported size is 81 positions). The sector keeps growing structurally at an annual pace of 3.6% in employment. A usual CC operation generates a wealth of spoken data. A typical contact center with 1,000 agents, each handling 40 calls a day with an average call duration of 3 minutes, generates 2,000 hours of audio every 24 hours. This data is the core of the CC’s business; however, its current exploitation is rather limited.
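The audio volume quoted above follows from simple arithmetic. The short Python sketch below reproduces it; the function name and parameters are illustrative assumptions and not part of any BISON software.

```python
def daily_audio_hours(agents: int, calls_per_agent: int, avg_call_minutes: float) -> float:
    """Return the hours of call audio produced per day.

    Hypothetical helper for illustration only: it multiplies the number of
    agents by calls per agent and average call length, then converts
    minutes to hours.
    """
    total_minutes = agents * calls_per_agent * avg_call_minutes
    return total_minutes / 60


# Figures from the text: 1,000 agents, 40 calls a day, 3 minutes per call.
hours = daily_audio_hours(1000, 40, 3)
print(hours)  # 2000.0
```

In other words, one such contact center records roughly 83 days of continuous audio every day, which is why manual review can cover only a small share of calls.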
The objective of BISON is to create a multi-lingual, modular and highly versatile software system for big speech data analytics in contact centers targeting:
basic speech data mining technologies
transforming the basic data into information valuable for business strategies
real deployment of the systems by real CCs
An important aspect of BISON is the relation between technology and law – privacy and legal aspects are treated as a solid grounding for what the technology should and should not be allowed to do. The technology is also seen as a valuable tool for checking and enforcing data privacy in a sensitive CC environment.

Work performed from the beginning of the project to the end of the period covered by the report and main results achieved so far

On the level of basic speech data mining technologies, we have summarized the state of the art in laboratory and production techniques for speech data mining (including transcription, keyword spotting, speaker recognition and language identification) and prepared a clear scheme for involving and motivating contact center (CC) users to help improve these technologies. This scheme is already paying off in BISON - the automatic speech recognition (ASR) engines, trained on a mix of commercially available and collected CC data, are reaching good performance on a set of languages in the target CC domain; some of them have reached production grade and are either integrated or ready for integration. The consortium has also prepared plans for integrating versatile speaker identification technology, useful for verifying client identity, identifying returning clients, and helping to prevent fraud. R&D in all speech data mining modalities is progressing and is gradually being adopted for production and integration.

For transforming the basic data into information valuable for business strategies, we have gathered user requirements, prioritized them and determined their level of technical and legal feasibility. At the implementation level, we have determined sets of keywords pertaining to different scenarios and, for some, produced versions in multiple relevant languages. We have also developed the first version of tracking of the CC call flow and adherence to script, both of which are very important quality indicators. Modules for call taxonomy detection and user satisfaction were developed and tested. To support business outcome mining, CC partners have collected sets of recordings (in addition to recordings aimed at speech technology development) in order to examine opportunities emerging in calls and to find patterns leading to a successful relationship with the customer. This work also includes the design of the re-configurable dashboard that will serve as the interface between the CC technology and its users.

On the level of real deployment of the systems by real CCs, we have designed the architecture of a simple CC call data mining solution, integrated with the CC call handling infrastructure, and defined the necessary APIs. At the end of the first project year, the first demonstrator - smallBison - was set up, and during 2016 it was extensively updated and tested by the CC partners. The consortium is now analysing their feedback and preparing for the final integration round in 2017, resulting in the bigBison demonstrator, which will include real-time call handling capabilities, allowing not only for post-analysis but also for actions while the call is in progress.
Data is crucial for BISON development - over the project lifetime, the BISON consortium has collected, anonymized and manually transcribed almost 250 hours of real CC data, which have been made available to the consortium. In addition, non-transcribed data converted into anonymized features, business outcome data and demonstration “fake” data (which, unlike real customer data, can be demoed and passed to third parties) have been defined, and for the most part also collected and processed.

Legal and ethical matters are not on the sidelines but at the center of project activities - we have continuously studied the legal and ethical context of CC personal data processing, with a focus on speech data. This work has already borne fruit in BISON: it helped to define guidelines for BISON data collection and processing, and its recommendations (for example, anonymization of calls) have been implemented in the project demonstrators. Intellectual property and contractual issues within and outside the BISON consortium are being resolved continuously, hand in hand with the development of business strategies. Finally, work has started on legal and ethical recommendations for the exploitation of BISON results, also reflecting the changes in legislation (GDPR). In this way, the technical development can proceed in a fully law-abiding manner.

Progress beyond the state of the art and expected potential impact (including the socio-economic impact and the wider societal implications of the project so far)

In 2015, the most important advances beyond the state of the art were:
starting direct involvement of users in the functionality of recognition systems, by providing target annotated data and running selected adaptation steps.
proposing a solution for analysis of 100% of recorded calls, in contrast to the current scheme in which supervisors assess only part of the CC calls.
starting to assess the call flow, which is directly related to agent quality and productivity.
developing a demonstrator showcasing the integration of speech data mining with a CC infrastructure.
laying solid legal and ethical grounds for data processing, data transfers and technology development within the CCs, also in consideration of the upcoming framework set by the EU General Data Protection Regulation.

In 2016, the most significant advances beyond the state of the art were:
Integration and deployment of the first project demonstrator - smallBison - on real data and collection of user feedback.
Integration of data collected in BISON into the training of speech transcription engines, directly showing the benefit of CC data collection.
Preparation of a framework for further data collection - anonymized features, business outcome data, public data and acted data ready for demonstration.
Integration of speech transcription and preparation for real-time technologies.
Work on business outcome mining, including customer satisfaction analysis and call taxonomy.
Proposal of a re-configurable dashboard tool for a comprehensive overview of speech data mining results.
Study of the impact of new European personal data legislation (GDPR) on CC speech data mining technologies and drawing up of recommendations for further development in BISON.
