Getting at the Heart of Things: Towards Expressivity-aware Computer Systems in Music

Periodic Reporting for period 3 - Con Espressione (Getting at the Heart of Things: Towards Expressivity-aware Computer Systems in Music)

Reporting period: 2019-01-01 to 2020-06-30

What makes music so important, what can make a musical performance or concert so special and stirring? It is the things the music expresses, the emotions it induces, the associations it evokes, the drama and characters it portrays. The sources of this expressivity are manifold: the music itself, its structure, orchestration, personal associations, social settings, but also -- and very importantly -- the act of performance, the interpretation and expressive intentions made explicit by the musicians through nuances in timing, dynamics, etc.

Thanks to research in fields like Music Information Research (MIR), computers can do many useful things with music, from beat and rhythm detection to song identification and tracking. However, they are still far from grasping the essence of music: they cannot tell whether a performance expresses playfulness or ennui, solemnity or gaiety, determination or uncertainty; they cannot produce music with a desired expressive quality; they cannot interact with human musicians in a truly musical way, recognising and responding to the expressive intentions implied in their playing.

The project is about developing machines that are aware of certain dimensions of expressivity, specifically in the domain of (classical) music, where expressivity is both essential and -- at least as far as it relates to the act of performance -- traceable to well-defined, measurable parametric dimensions (such as timing, dynamics, articulation). We will develop systems that can recognise and characterise expressive qualities in music, search for music by expressive aspects, and generate, modify, and react to such qualities. To do so, we will (1) bring together the fields of AI, Machine Learning, Music Information Retrieval (MIR), and Music Performance Research; (2) integrate theories from Musicology to build more well-founded models of music understanding; (3) support model learning and validation with massive musical corpora of a size and quality unprecedented in computational music research.

The resulting computer technologies will be useful for a wide variety of purposes: more refined music search and recommendation systems; tools for the automatic generation and adaptation of expressive music in multimedia domains; new musically 'sensitive' computer systems for interactive music making. A specific demonstrator we hope to present at the end of the project is the "Compassionate Music Companion" -- a computer system that can accompany and interact with a human soloist in a musically natural and expressive way, recognising and anticipating the soloist's expressive playing intentions and adapting its own playing style to match, making for a natural musical interaction and experience.

Generally, with this research, we hope to contribute to a new generation of computer systems that can support musical services and interactions at a new level of quality, and to inspire expressivity-centred research in other domains of the arts and human-computer interaction (HCI).
For a computer to "understand" anything about what music may "mean" to us, and what it may express, it is fundamentally important for it to have an understanding of how music is structured, and how that structure is perceived by humans. Consequently, we have performed substantial research on computational algorithms for structure recognition (e.g. rhythm, harmony, recurring themes) in scores and in audio, and on temporal process models of music (e.g. the build-up and release of perceived harmonic tension). For this, the latest methods from the field of ('deep') machine learning were used. Some of the resulting algorithms were demonstrated to be the best worldwide, achieving top results in international scientific competitions; some have also already been shown experimentally to improve computer models of expressive music performance (see below).
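To give a flavour of what a temporal process model of harmonic tension might look like, here is a deliberately crude toy sketch (our own simplification for illustration, not the project's actual model; the function name and the feature choice are hypothetical). It scores each chord by its circle-of-fifths distance from the tonic, producing a curve that rises and falls over a chord progression:

```python
# Toy "harmonic tension" curve: distance of each chord root from the
# tonic on the circle of fifths. Purely illustrative -- real models of
# perceived tension are far richer than this.

# Circle-of-fifths position for each of the 12 pitch classes (C = 0).
FIFTHS_POS = {pc: (pc * 7) % 12 for pc in range(12)}

def tension_curve(chord_roots, tonic=0):
    """Distance of each chord root from the tonic along the circle of
    fifths, folded so the maximum possible distance is 6 steps."""
    curve = []
    for root in chord_roots:
        steps = (FIFTHS_POS[root] - FIFTHS_POS[tonic]) % 12
        curve.append(min(steps, 12 - steps))
    return curve

# Roots C, F, G, A (roughly I - IV - V - vi in C major):
print(tension_curve([0, 5, 7, 9]))
```

Even this toy version shows the basic idea: tension is lowest on the tonic and grows as the harmony moves away from it.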

A second central line of research is concerned with computational models of expressive music performance: computer programs that learn to associate expressive playing patterns (relating to tempo changes, timing, dynamics changes, articulation) in human performances with specific patterns found in the score (the sheet music) of the piece. These programs can then predict how a given musical passage should most likely be played expressively. A central result is the so-called "Basis Function Model", a comprehensive formal model of expressive performance based on the latest methods from machine learning (deep neural networks). One version of this model was reported to have passed a "musical Turing test" [E. Schubert et al., Journal of New Music Research 46(2), 2017], producing a piano performance that was judged, by a large listening panel, to be at least as "human" as the performance of a professional concert pianist.
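The core idea behind basis-function modelling can be sketched in a few lines. The minimal linear version below is our own simplification (the actual model uses far richer features and deep neural networks, and all feature values here are made up): each note is described by a vector of score-derived "basis functions", and an expressive parameter is fitted as a learned weighted sum of them.

```python
import numpy as np

# Hypothetical basis functions per note:
# [pitch / 127, metrical beat strength, 1.0 if under a crescendo mark].
X = np.array([
    [0.5, 1.0, 0.0],
    [0.6, 0.5, 0.0],
    [0.7, 1.0, 1.0],
])
# Loudness of the same notes as measured in a human performance (0..1).
y = np.array([0.40, 0.35, 0.60])

# Fit the weights by least squares; predicting a new note's loudness is
# then just a dot product with its basis-function vector.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(X @ w, 2))
```

The same scheme extends to tempo, timing, and articulation by swapping the target variable, which is what makes the formulation attractive as a common framework.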

Finally, we presented a first prototype of an interactive accompaniment system that follows and accompanies a human soloist, trying to adapt to the soloist's expressive playing style while combining this with its own expressive performance strategies.
In its first half, the project has led to progress beyond the scientific state of the art at several levels, and along several fronts. For example, we have advanced the state of the art in musical structure recognition algorithms (e.g. harmony: chords, keys) by combining deep learning with probabilistic modelling and statistical language models, achieving recognition rates better than any existing method. Our Basis Function Model of Expressive Music Performance is probably the best computational model of its kind worldwide (based, e.g., on human ratings collected in the above-mentioned musical "Turing test"). We have also shown how to improve expressive performance models by devising and integrating models of certain aspects of music listening (e.g. via concepts such as musical tension/relaxation, and the formation of expectations in the listener during a piece). And finally, we have presented the first (though still very preliminary and limited) prototype of an interactive accompaniment system that reacts to expressive playing style and combines this with its own concept of expressivity in performance.
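A standard way to combine a deep classifier with a statistical language model -- shown here as an illustrative sketch, not the project's actual code -- is Viterbi decoding: a network outputs chord probabilities per audio frame, and a chord-to-chord transition model smooths them into a coherent sequence.

```python
import numpy as np

def viterbi(frame_probs, transition):
    """frame_probs: (T, K) per-frame chord probabilities from a classifier.
    transition: (K, K) chord-to-chord transition probabilities.
    Returns the most likely chord index for each frame."""
    T, K = frame_probs.shape
    log_f = np.log(frame_probs + 1e-12)
    log_a = np.log(transition + 1e-12)
    delta = np.zeros((T, K))          # best log-score ending in chord k
    back = np.zeros((T, K), dtype=int)
    delta[0] = log_f[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_a   # (K, K): prev -> next
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_f[t]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Two chords; the classifier flickers at frame 2, but a "sticky"
# transition model smooths the flicker away.
frames = np.array([[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1]])
sticky = np.array([[0.95, 0.05], [0.05, 0.95]])
print(viterbi(frames, sticky))  # → [0, 0, 0, 0]
```

With a uniform transition matrix the decoder degenerates to per-frame argmax and the flicker survives, which is precisely why the language-model component matters.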

In the second part of the project period, these research lines -- machine learning models of structure perception in music, and machine learning models of expressive music performance -- will be combined and integrated with an entirely new approach to soloist following that we are currently developing (based on so-called "reinforcement learning"). This will make possible our final envisioned demonstrator system: the "Compassionate Music Companion", which will accompany and interact with a human soloist in a musically natural and expressive way, recognising and anticipating the soloist's expressive intentions and seamlessly adapting its playing style to match the expressive quality of the music. Our ambition is to demonstrate this in a public live concert before the end of the project.
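To illustrate the reinforcement-learning idea behind soloist following in the simplest possible terms, here is a toy Q-learning sketch (entirely our own construction; the project's actual formulation, state space, and rewards are not described here). An accompanist observes whether it is behind, in sync with, or ahead of the soloist's tempo, chooses a tempo adjustment, and is rewarded for minimising the gap:

```python
import random

random.seed(1)
ACTIONS = (-2.0, 0.0, 2.0)            # tempo change in BPM

def state(gap):
    """Discretise the tempo gap (soloist BPM minus accompanist BPM)."""
    if gap > 1.0:
        return -1                     # accompanist is behind (too slow)
    if gap < -1.0:
        return 1                      # accompanist is ahead (too fast)
    return 0                          # roughly in sync

q = {(s, a): 0.0 for s in (-1, 0, 1) for a in ACTIONS}
soloist, accomp = 104.0, 96.0         # hidden "true" tempo vs. start

for step in range(2000):
    s = state(soloist - accomp)
    # Epsilon-greedy action selection.
    if random.random() < 0.2:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: q[(s, x)])
    accomp = min(max(accomp + a, 60.0), 180.0)
    reward = -abs(soloist - accomp)   # stay close to the soloist
    s2 = state(soloist - accomp)
    best_next = max(q[(s2, x)] for x in ACTIONS)
    q[(s, a)] += 0.1 * (reward + 0.9 * best_next - q[(s, a)])

# Learned greedy policy: speed up when behind, slow down when ahead.
print(max(ACTIONS, key=lambda x: q[(-1, x)]),
      max(ACTIONS, key=lambda x: q[(1, x)]))
```

The appeal of the reinforcement-learning framing is exactly this: the follower is never told the "correct" tempo, it discovers a corrective policy purely from a synchrony-based reward.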
Man-Machine Collaboration in Expressive Performance: The Con Espressione! Exhibit
Autonomous Expressive Accompaniment: The ACCompanion