The use of virtual assistants in the workplace is growing. Market research firm Gartner predicts that, by 2021, 25 % of workers will use a virtual assistant on a daily basis. The technology can help companies of all sizes – from start-ups to small and medium-sized enterprises to larger businesses – take care of repetitive and time-consuming processes in meetings.
Taking human-machine communication to the next level
Intelligent virtual assistants are powered by natural language processing, a form of artificial intelligence that seeks to enable computers to understand human language. “Our mission is to revolutionise the interaction between humans and machines, making communication more natural and continuous,” notes Marta Casar, engineer at Verbio and coordinator of the EU-funded PAY-ME-ATTENTION project. “Verbio’s virtual assistant software is one of the most advanced systems on the market enabling continuous voice processing and nearly error-free real-time transcriptions.” Unlike other systems that often fail in real-world environments, Verbio’s system can instantly interpret the natural flow of conversation and distinguish user data from background noise and music. “Based on the latest artificial intelligence advances, we created innovative algorithms that can help build a robust speech recognition experience. We truly believe our algorithms will power every virtual assistant or speech recognition system worldwide, making them more accurate and less prone to security threats,” notes Casar.
Cutting-edge software modules
Verbio’s virtual assistant software comprises several modules that can either work together or be integrated individually into voice recognition software. Speech input is captured by a circular array of six microphones that sends the audio stream to a storage server. A speech enhancement module improves speech quality using several techniques: direction-of-arrival estimation, beamforming, echo cancellation, noise removal and automatic gain control. Two further modules improve speech processing: one detects the presence or absence of human speech, and a blind source separation module separates the individual source signals from the mixed signals picked up by the six microphones.
Meanwhile, a speaker diarisation system splits the input audio stream into segments and groups them according to speaker identity. The output of this module is processed by a voice biometric system that verifies the speakers’ identities by matching each diarised segment to its respective speaker. Conversation transcription is carried out by a continuous speech recognition system.
Verbio’s software also uses a natural language processing technique that automatically extracts meaning from the transcribed texts by identifying recurrent topics. “Our topic classification technique gives an instantaneous overview of all the topics discussed in the meeting. Furthermore, we have devised a technique to sort the topics by priority: the higher the number of words assigned to each topic, the higher its priority,” explains Casar. Finally, the summarisation module encodes and clusters sentences with similar meaning to produce a shorter text.
Virtual assistants in conference rooms will be the next big thing in the workplace. “Our integrated system can work in any meeting room environment. Sometime soon, virtual assistants could also improve productivity in virtual video conferences, and even do more than this: they could actually be running them,” concludes Casar.
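The topic prioritisation rule Casar describes (more words assigned to a topic means higher priority) can be illustrated with a short sketch. This is not Verbio’s implementation: the article does not say how words are assigned to topics, so the example assumes that step has already produced (word, topic) pairs. The function name `rank_topics` and the sample data are hypothetical.

```python
from collections import Counter


def rank_topics(word_topics):
    """Rank topics by the number of words assigned to each.

    `word_topics` is an iterable of (word, topic) pairs. How words get
    assigned to topics is assumed to happen upstream and is not shown.
    """
    counts = Counter(topic for _word, topic in word_topics)
    # most_common() orders by count, highest first; topics with equal
    # counts stay in the order they were first seen.
    return counts.most_common()


# Hypothetical assignments taken from a meeting transcript.
assignments = [
    ("budget", "finance"),
    ("invoice", "finance"),
    ("deadline", "planning"),
    ("forecast", "finance"),
    ("milestone", "planning"),
    ("hiring", "staffing"),
]

for topic, n_words in rank_topics(assignments):
    print(topic, n_words)
```

A real system would probably normalise for stop words or topic vocabulary size before ranking; the sketch keeps only the raw word count, which is the single criterion mentioned in the article.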
Keywords: PAY-ME-ATTENTION, virtual assistant, Verbio, natural language processing, business meeting, artificial intelligence, conversation transcription, speaker diarisation