
Automatic Sentiment Estimation in the Wild

Periodic Reporting for period 3 - SEWA (Automatic Sentiment Estimation in the Wild)

Reporting period: 2017-08-01 to 2018-07-31

The overall aim of the SEWA project is to enable computational models for machine analysis of facial, vocal, and verbal behaviour in the wild.
The SEWA project has already delivered multiple benefits, and its impact is expected to keep growing: technologies that can robustly and accurately analyse human facial, vocal and verbal behaviour and interactions in the wild, as observed by webcams in digital devices, would have a profound impact on both basic sciences and the industrial sector.
• They could open up tremendous potential to measure behaviour indicators that have so far resisted measurement because they were too subtle or fleeting to be captured by the human eye and ear. SEWA technology was used in a semi-automatic manner to track the temporal dynamics of smiles in elderly people and flag this behavioural biomarker of depression (Channel 4 “Old People’s Home for 4 Year Old”, 2017, 2018), which then became a basis for the UK EPSRC grant proposal AIDD: “AI-empowered identification of Depression and Dementia”, in which three SEWA partners are involved (Prof. Pantic’s group, Prof. Schuller’s group, and RealEyes).
• Such technologies would also effectively lead to the development of the next generation of efficient, seamless and user-centric human-computer interaction (affective multimodal interfaces, interactive multi-party games, and online services). This was recognised by Samsung Electronics, who visited the SEWA Coordinator in January 2018 and decided to open a new Samsung AI Research Centre (SAIC) in Cambridge with the main aim of developing human-centric AI. The SEWA Coordinator became the Research Director of SAIC in Cambridge in May 2018. SAIC Cambridge is now in negotiation with three SEWA beneficiaries (Imperial College London, University of Augsburg, and RealEyes) to invest in common projects.
• Such technologies would have a profound impact on business. For example, automatic market research analysis would become possible, which has been successfully showcased by SEWA partner RealEyes.
• Such technologies could also enable next-generation healthcare technologies (e.g. remote monitoring of conditions like pain, anxiety and depression). As noted above, SEWA technology became a basis for the UK EPSRC grant proposal AIDD: “AI-empowered identification of Depression and Dementia”, in which three SEWA partners are involved.
WP1 – SEWA DB collection, annotation and release
Annotated valence and arousal in all recordings of the subjects watching the 4th stimulus clip and in all full video-chat recordings. Released the SEWA database version 1.0 publicly, in accordance with the data management plan.
WP2 - Low-level Feature Extraction
Application of state-of-the-art linguistic features employed in text retrieval to sentiment analysis. Investigation of acoustic landmarks as robust linguistic features for emotion recognition. Implementation of an incremental in-the-wild face alignment method for automatic facial landmark localisation.
WP3 – Mid-level feature extraction
Development of a copula-based model for the intensity estimation of action units (AUs). Annotation of 100 sequences from the SEWA data for AU detection. Implementation of a deep convolutional model for mid-level feature extraction.
WP4 – Continuous Affect and Sentiment Sensing in the Wild
Investigation of confidence measure-based Semi-Supervised Learning (SSL) for multimodal emotion recognition. Deep Neural Network (DNN)-based Multi-Task Learning using the uncertainty of the labels (disagreement between annotators) as an additional target. Enhancement of the Automatic Speech Recognition (ASR) module for German and English.
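As an illustration of how annotator disagreement can serve as an additional learning target, the sketch below derives two per-frame targets from several annotation traces: a consensus label and an uncertainty label. The choice of the mean as consensus and the population standard deviation as the disagreement measure is an assumption made here for illustration; the report does not state which measure the project used, and the function name `multitask_targets` is hypothetical.

```python
from statistics import mean, pstdev


def multitask_targets(annotations):
    """Build per-frame targets from several annotator traces.

    annotations: list of traces, one per annotator, each a list of floats
    of equal length (e.g. valence per frame).
    Returns (consensus, uncertainty), two lists with one value per frame.
    """
    if not annotations:
        raise ValueError("need at least one annotator trace")
    n_frames = len(annotations[0])
    if any(len(trace) != n_frames for trace in annotations):
        raise ValueError("all traces must have the same length")

    # Regroup the data so that each tuple holds all ratings of one frame.
    frames = list(zip(*annotations))
    consensus = [mean(frame) for frame in frames]      # main target (assumed: mean)
    uncertainty = [pstdev(frame) for frame in frames]  # auxiliary target (assumed: std dev)
    return consensus, uncertainty


# Three hypothetical annotators rating three frames.
traces = [
    [0.0, 0.2, 0.4],
    [0.0, 0.4, 0.0],
    [0.0, 0.3, 0.2],
]
consensus, uncertainty = multitask_targets(traces)
```

A multi-task model would then be trained to predict both lists jointly, so that frames on which annotators disagree are explicitly modelled rather than treated as equally reliable.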
WP5 – Behaviour Similarity in the Wild
Development, implementation, and evaluation of a novel methodology for unsupervised temporal segmentation of behaviour based on multimodal data. Development of a novel framework for dynamic behaviour modelling, analysis, and prediction.
WP6 – Temporal Behaviour-Patterning and Interpersonal Sentiment in the Wild
An audiovisual fusion method based on cross-prediction of each modality has been modified. An approach based on low-order linear dynamical systems has been developed. Experiments have been conducted on the SEWA database for (i) behaviour prediction of valence, arousal and liking in-the-wild, (ii) facial, vocal and audio-visual behaviour similarity estimation, and (iii) (semi)-unsupervised behaviour understanding in-the-wild.
WP7 – Integration, Applications and Evaluation
SEWA Audio and Video tools were tested and integrated into the processing pipeline to provide more behavioural input for the emotional profiles. Evaluation of the SEWA tools. Exploration of commercial opportunities through multiple meetings with a variety of potential partners.
WP8 – Dissemination, Ethics, Communication and Exploitation
Described in Part B.
WP9 – Project management
Overall strategic and operational management and steering of the project, ensuring the accuracy, quality and timeliness of deliverables. Management of liaison with the European Commission; management of the public face of the project and networking with other related projects. Co-ordination of coherence of all developments between WPs.
WP1: We released the SEWA database (SEWA DB), a multilingual dataset of annotated facial, vocal and verbal behaviour recordings made in-the-wild. SEWA DB will be used for a number of challenges and benchmarking efforts and will have more than 200 active users worldwide by the end of the project. The SEWA DB can be accessed online at
WP2: Development of a hybrid system combining BoAW (Bag-of-Audio-Words, acoustic features), BoW (Bag-of-Words, linguistic features) and BoVW (Bag-of-Visual-Words) with different feature fusion schemes. The toolbox openXBOW has been released and has already been used on different tasks.
WP3: We have presented the robust mid-level visual feature detection component developed for the SEWA project.
WP4: Realisation of a fully automatic predictor of continuously-valued sentiment and affect dimensions from audio-visual data recorded in the wild, which achieves performance competitive with or better than other state-of-the-art approaches. Definition of meaningful confidence measures for regression problems.
WP5: Experiments have been conducted on the SEWA database for (i) behaviour prediction of valence, arousal and liking in-the-wild as well as (ii) facial, vocal and audio-visual behaviour similarity estimation for behaviour template discovery and, in general, (semi)-unsupervised behaviour understanding in-the-wild. Experiments on the naturalistic data of the SEWA database demonstrate the robustness and the effectiveness of the proposed framework. The predictive framework is shown to be capable of predicting future labels of behaviour even when given only a small number of past observations, characterised by (possibly) corrupted and/or non-informative annotations. The similarity estimation framework is shown to be capable of discovering representative templates of affective behaviour.
WP6: We have also developed a facial, vocal and audio-visual behaviour similarity measurement framework which can be used (i) to find typical templates of affective behaviour and (ii) to classify never-before-seen sequences as being similar or dissimilar to the identified behaviour templates.
WP7: Described in Part B.
WP8: SEWA partners have increased the interest of the general public and of industry in the field.
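The bag-of-words idea behind the WP2 result can be sketched as follows: frame-level acoustic feature vectors are assigned to their nearest codebook entry and counted into a histogram (bag-of-audio-words), word counts form a second histogram (bag-of-words), and the two are concatenated. This is a minimal illustration and not the openXBOW implementation; the function names, the toy codebook and the use of simple concatenation (early fusion) as the fusion scheme are all assumptions made for this example.

```python
from collections import Counter


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to vector (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def bag_of_audio_words(frames, codebook):
    """Histogram of codeword assignments, normalised by the number of frames."""
    counts = Counter(nearest_codeword(frame, codebook) for frame in frames)
    total = len(frames) or 1  # avoid division by zero for empty input
    return [counts[i] / total for i in range(len(codebook))]


def bag_of_words(tokens, vocabulary):
    """Raw term counts over a fixed vocabulary; out-of-vocabulary tokens are ignored."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def early_fusion(*feature_vectors):
    """Concatenate per-modality vectors into one feature vector (assumed scheme)."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


# Toy data: a 2-entry codebook, four acoustic frames and a short transcript.
codebook = [[0.0, 0.0], [1.0, 1.0]]
frames = [[0.1, 0.0], [0.9, 1.1], [1.0, 0.8], [0.2, 0.1]]
vocabulary = ["good", "bad", "film"]
tokens = "a good film a really good film".split()

audio_hist = bag_of_audio_words(frames, codebook)
text_hist = bag_of_words(tokens, vocabulary)
fused = early_fusion(audio_hist, text_hist)
```

In practice the codebook would be learned from training data (for example by clustering or random sampling of frames), and a bag-of-visual-words histogram could be appended in the same way.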