A mood-indexed, multimodal database is changing the game for interactive music systems
Introducing the MUSICAL-MOODS project, principal investigator Fabio Paolizzo and project supervisor Giovanni Costantini affirm: “With the support of the Marie Skłodowska-Curie programme, the project aimed at enabling capacity to classify and recognise emotions and mental states from multimedia data in interactive and intelligent music systems.” Such systems can serve a range of applications: user and database profiling for the creative and media industries, improved access to musical heritage for citizens and researchers, audio-on-demand services, education and training, music therapy and music making. The project envisions a future in which such systems can draw analogies to solve complex tasks, and in which computational music creativity supports a better understanding of ourselves. Working from this vision, the project developed a multimodal database of audio, video, motion capture and language data, recorded with dancers engaging in dance improvisation, interactive music generation and interview sessions.
MUSICAL-MOODS database
“The MUSICAL-MOODS database was created with 12 professional dancers in a green screen environment equipped with a 30-camera Vicon motion capture system and the VIVO interactive music system,” confirms Paolizzo. He adds: “We focused the investigation on electroacoustic music that could induce anxiety in the dancers, along with other moods. We created more than 100 multimedia clips and 300 minutes of recordings for each mode of acquisition, totalling over 1 TB of audio, video and motion capture data.” To do so, the project adopted multidisciplinary tools and methods from the sciences (cognitive sciences, human-computer interaction, machine learning, natural language processing and signal processing), the arts (music, dance, motion capture and 3D animation) and the humanities (musicology, history of music and philosophy). MUSICAL-MOODS has used the database in numerous artistic projects that adopt interactive and intelligent multimedia systems and algorithms for audio and video signal processing. “From these experiences, we derived a model for mood classification of music and associated data, also leveraging domain experts. We achieved solid results for mean classification accuracy (88 %) and strong improvements in root mean square error compared to the state of the art,” highlights Costantini. A multimodal game with a purpose (M-GWAP) for internet users was deployed to design and improve the classification algorithms. The M-GWAP targets emotions and mental states that can be induced in users and expressed through music and associated multimedia data. The game leverages a wisdom-of-the-crowd approach to generate annotations through fun, cost-effective user interactions.
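To illustrate the kind of evaluation described above, the following is a minimal sketch, not the project’s actual pipeline: it cross-validates a generic classifier on placeholder multimodal feature vectors and reports the two metrics mentioned in the article, mean classification accuracy and root mean square error. The feature and label arrays, the choice of classifier and the use of scikit-learn are all illustrative assumptions.

```python
# Sketch only: cross-validated mood classification on hypothetical multimodal features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import accuracy_score, mean_squared_error

rng = np.random.default_rng(0)

# Placeholder data: one 64-dimensional feature vector per clip (e.g. audio, video
# and motion-capture descriptors concatenated) and an integer mood label.
X = rng.normal(size=(300, 64))        # 300 clips, 64 features each (assumed)
y = rng.integers(0, 4, size=300)      # 4 hypothetical mood classes

clf = RandomForestClassifier(n_estimators=200, random_state=0)
accs, rmses = [], []
for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    clf.fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    accs.append(accuracy_score(y[test_idx], pred))
    # RMSE over class indices, as a rough ordinal-error measure
    rmses.append(mean_squared_error(y[test_idx], pred) ** 0.5)

print(f"mean accuracy: {np.mean(accs):.2%}, mean RMSE: {np.mean(rmses):.2f}")
```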
Next steps
“M-GWAP will be used for modelling language data deriving from the dancers’ interviews that were realised as part of the project,” reports Paolizzo. Language modelling will help assess the dancers’ mental states and emotions in the period immediately following the performance acquisitions. Synchronisation work already under way on the dataset will make it possible to investigate how emotions and mental states can be induced or expressed across audio, video and motion capture. This will lead to a better understanding of how the different media influence the induction and expression of mood, and of their temporal cross-references and valence. For example, a melody could induce us to feel a certain way, while a subsequent lyric in the same piece might radically change the meaning of that mood.
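The cross-modal analysis mentioned above depends on putting the differently sampled streams onto a common timeline. The snippet below is a minimal sketch, not the project’s actual tooling: it linearly interpolates per-modality feature curves (audio, video, motion capture) onto a shared time axis so mood-related events can be cross-referenced over time. Frame rates, feature shapes and the interpolation strategy are assumptions.

```python
# Sketch only: resampling hypothetical audio, video and mocap feature streams
# onto one shared timeline for cross-modal comparison.
import numpy as np

def resample_to_timeline(values: np.ndarray, src_rate: float, timeline: np.ndarray) -> np.ndarray:
    """Linearly interpolate a 1-D feature stream onto a shared timeline (in seconds)."""
    src_times = np.arange(len(values)) / src_rate
    return np.interp(timeline, src_times, values)

# Placeholder streams for a 10-second clip: an audio loudness curve at 100 Hz,
# a video motion-energy curve at 25 fps and a mocap joint-velocity curve at 120 Hz.
audio = np.random.rand(1000)   # 100 Hz
video = np.random.rand(250)    # 25 fps
mocap = np.random.rand(1200)   # 120 Hz

timeline = np.arange(0.0, 10.0, 0.04)   # common 25 Hz timeline
aligned = np.stack([
    resample_to_timeline(audio, 100.0, timeline),
    resample_to_timeline(video, 25.0, timeline),
    resample_to_timeline(mocap, 120.0, timeline),
])
print(aligned.shape)   # (3, 250): three modalities on one shared time axis
```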
Keywords
MUSICAL-MOODS, interactive, emotions, mental states, multimedia data, intelligent music system