Periodic Reporting for period 1 - HA-HA (LAUGHTER IN CONVERSATION: ENHANCING THE NATURALNESS OF DIALOGUE SYSTEMS)
Reporting period: 2018-11-01 to 2020-10-31
The second part of the project saw work done on the investigation of features for discriminating laughter from speech and on the development of an automatic method for laughter detection. As previous laughter studies have shown that humans rely on rhythm information for the perception of laughter, we examined two rhythm representations based on the modulation of the speech signal. The analysis revealed that the two representations, encoding the variation of the envelope of the signal and of its temporal fine structure, respectively, may discriminate between laughter (laughs and speech-laughs) and speech. We then used this information to develop an automated method for laughter detection which takes the speech signal as input, computes its modulation spectrum and, based on this representation, determines the time intervals where laughter may occur. This automated method was then integrated into a semi-automatic laughter annotation procedure, by limiting the manual annotation to the time intervals returned by the automatic tool. An evaluation of the proposed semi-automatic procedure showed that it decreased, on average, the time required for annotation by 30%, while maintaining an inter-rater reliability similar to that of the fully manual procedure, at a cost of 10% missed laughter events.
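The detection pipeline described above can be illustrated with a simplified sketch. The envelope extraction, frame lengths, modulation band and threshold below are illustrative assumptions for demonstration purposes, not the project's actual parameters: the idea is simply that frames whose amplitude-modulation energy concentrates in a laughter-typical syllable-rate band are flagged as candidate laughter intervals for manual annotation.

```python
# Hedged sketch of modulation-based laughter candidate detection.
# All parameter values (band limits, threshold, frame sizes) are
# illustrative assumptions, not the project's published settings.
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt


def candidate_laughter_intervals(signal, sr, frame_s=0.5, hop_s=0.25,
                                 mod_band=(4.0, 8.0), threshold=0.3):
    """Return (start, end) times in seconds of frames whose modulation
    power inside `mod_band` (Hz) exceeds `threshold` of the total."""
    # Amplitude envelope via the Hilbert transform, low-passed at 32 Hz.
    envelope = np.abs(hilbert(signal))
    sos = butter(4, 32.0 / (sr / 2), btype="low", output="sos")
    envelope = sosfiltfilt(sos, envelope)

    frame = int(frame_s * sr)
    hop = int(hop_s * sr)
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, 1.0 / sr)
    in_band = (freqs >= mod_band[0]) & (freqs <= mod_band[1])

    intervals = []
    for start in range(0, len(envelope) - frame + 1, hop):
        seg = envelope[start:start + frame]
        power = np.abs(np.fft.rfft((seg - seg.mean()) * window)) ** 2
        if power[in_band].sum() / (power.sum() + 1e-12) > threshold:
            t0, t1 = start / sr, (start + frame) / sr
            # Merge overlapping flagged frames into one interval.
            if intervals and t0 <= intervals[-1][1]:
                intervals[-1][1] = t1
            else:
                intervals.append([t0, t1])
    return [(a, b) for a, b in intervals]
```

On a synthetic signal whose amplitude is modulated at 1 Hz in the first half and at 5 Hz (a plausible laughter syllable rate) in the second half, only the second half is flagged, showing how such a tool narrows down the regions a human annotator must inspect.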
An online perceptual experiment was designed for the third part of the project, in which the participants watched a conversation between a real-estate agent and a client visiting an apartment. The speech of the dialogue system-based agent was synthesized using a state-of-the-art system, and we manipulated the presence of social laughter in the real-estate agent's voice. The participants were asked to judge the interaction of the agent with the client, as well as to evaluate the agent on several dimensions, such as professionalism or pleasantness. We also included a control condition with a human agent, in order to test whether the participants show the same differences between the no-laughter and the laughter-enhanced conditions for a virtual agent as for a human one.
The results of the project were disseminated in the form of five proceedings articles published at highly relevant conferences and workshops, one accepted conference abstract, one submitted journal article, and one article in preparation. Moreover, we organized the sixth edition of the Workshop on Laughter and Other Non-Verbal Vocalisations, thus further increasing the visibility of the project in the scientific community. The scope and the activities of the project have also been communicated through non-scientific actions, such as interviews for the university's blog and the MSCA Fellow of the Week programme. In terms of exploitation of results, we agreed on a collaboration with colleagues from the private sector to test our measures of speech entrainment in a business setting, with the possible development of a tool for the automatic analysis of business relationships. Finally, the semi-automatic laughter annotation tool developed in the project was made available online to any interested parties.