Exploring Musical Possibilities via Machine Simulation

Periodic Reporting for period 1 - Whither Music? (Exploring Musical Possibilities via Machine Simulation)

Reporting period: 2022-01-01 to 2023-06-30

"Whither Music?" was the motto of Leonard Bernstein's 1973 Norton Lectures at Harvard ("The Unanswered Question"), where he analysed the musical developments that led to what he called the 20th Century Crisis of Music: the gradual decline of tonality, driven by a takeover of tonal ambiguity in the late 19th and early 20th centuries, eventually leading to complete abandonment of tonality in Schönberg's dodecaphony - a historical process that Bernstein portrays as equally inevitable and problematic.

WHITHER MUSIC? is a project that aims to establish model-based computer simulation (via methods of AI, (deep) Machine Learning, and probabilistic modeling) as a viable methodology for asking questions about musical processes, developments, possibilities, and alternatives - for music research, for didactic purposes, and for creative music exploration scenarios. Computer simulation here means the design of predictive or generative computational models of music (of certain styles), learned from large corpora, and their purposeful and skilful application to, for example, answer "what if" questions, make testable predictions, or generate musical material for further musicological or aesthetic analysis. We believe that this would open new possibilities for music research, education, and creative engagement with music, some of which will be further explored in the project.

This vision of purposeful application of computational models dictates the central methodological principles for our research: veridical modeling and simulation require stylistically faithful, tightly controllable, transparent, and explainable models. These requirements, in turn, motivate us to develop and pursue a musically informed approach to computational modeling, as an alternative to the currently prevailing trend of end-to-end learning with huge, opaque neural networks. The cornerstones of our approach will be structured modeling (rather than end-to-end learning), multi-level and multi-scale modeling and structural projection (rather than note-by-note prediction), and exploiting musical knowledge (rather than purely data-driven inductive learning) at all levels - including the design of appropriately informed model architectures and loss functions.

In terms of modeling domains, we will be concerned with three types of computational models: models of music generation, of expressive performance, and of musical expectancy, mirroring the three major components in the system of music: the composer, the performer, and the listener. In addition to developing fundamental machine learning and modeling methods, we will explore concrete simulation and application scenarios for our computer models, in the form of musicological studies, creative and didactic tools and exhibits, and public educational events, in cooperation with musicologists, music educators, and institutions from the creative arts and sciences sector.

At a fundamental level, the goal of this project is thus twofold: beyond developing the technology for, and demonstrating, controlled musical simulation for serious purposes, we wish to develop and propagate an alternative approach to AI-based music modeling, hoping to contribute to a re-orientation of the field of Music Information Research (MIR) towards more musically informed modeling - a mission we already started in our previous ERC project Con Espressione.

Project Period 1 (Jan 2022 - June 2023):
In the first project period, we have already made progress along all major fronts described above. In terms of *fundamental new technologies and methods* of a general nature, we developed new methods for obtaining musically relevant and interpretable explanations from deep learning models, along with strategies for critically testing the plausibility of such explanations; we investigated new ways of learning effective general representations for music and audio tasks from data, and combined these with complexity reduction methods; and we are developing a new approach to musical source separation based on so-called "differentiable dictionaries".
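
To illustrate the general idea behind "differentiable dictionaries", here is a minimal, hypothetical sketch in Python/PyTorch; it is not the project's published method, and all names and modelling choices are illustrative assumptions. A magnitude spectrogram is factorised into a nonnegative dictionary of spectral atoms and their activations, with both parts optimised by plain gradient descent, so that the decomposition stays differentiable throughout.

    # Hypothetical sketch: factorise a magnitude spectrogram X (freq x time)
    # into a dictionary W (freq x atoms) and activations H (atoms x time),
    # optimised end-to-end by gradient descent (a differentiable NMF-style model).
    import torch
    import torch.nn.functional as F

    def differentiable_dictionary(X, n_atoms=32, n_steps=500, lr=1e-2):
        n_freq, n_time = X.shape
        # Parametrise W and H through softplus to keep them nonnegative
        # while remaining differentiable.
        W_raw = torch.randn(n_freq, n_atoms, requires_grad=True)
        H_raw = torch.randn(n_atoms, n_time, requires_grad=True)
        opt = torch.optim.Adam([W_raw, H_raw], lr=lr)
        for _ in range(n_steps):
            W, H = F.softplus(W_raw), F.softplus(H_raw)
            loss = torch.mean((X - W @ H) ** 2)  # reconstruction error
            opt.zero_grad()
            loss.backward()
            opt.step()
        return F.softplus(W_raw).detach(), F.softplus(H_raw).detach()

    # Usage: X = torch.rand(513, 200)   # e.g. an STFT magnitude
    #        W, H = differentiable_dictionary(X)

Because every step is differentiable, such a decomposition can in principle be trained jointly with neural components, which is what distinguishes this formulation from classical multiplicative NMF updates.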

In terms of musical *modeling domains*, we started addressing all three domains mentioned above: (1) music perception and prediction ("the listener"), (2) music performance and interaction ("the performer"), and (3) music generation ("the composer"). Regarding (1), we developed a general way of modeling musical pieces as graphs, which makes it possible to address all sorts of music perception and structure recognition tasks with graph neural networks while offering a very natural structured representation for music and scores; our first published demonstrations and results concern tasks such as musical voice separation, cadence detection, and harmonic structure analysis. We also took first steps towards probabilistic musical expectancy models based on differentiable short-term models. Regarding (2), our central research and demonstration object is the ACCompanion, an autonomous piano accompaniment and expressive co-performance system; a paper describing this system and analysing its core components was presented at the IJCAI 2023 conference. Regarding (3), we extended the notion of deep diffusion models to discrete representations, opening up new avenues for controllable music generation at the symbolic level (this work, too, was presented at IJCAI 2023).
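
To make the score-as-graph idea concrete, the following simplified, hypothetical sketch (Python/PyTorch; the graph construction in our publications is more elaborate than this) turns a list of notes into node features and edges connecting simultaneous and directly succeeding notes:

    # Hypothetical illustration: a score fragment as a graph for a GNN.
    # Notes become nodes (pitch, onset, duration); edges connect notes that
    # start together ("vertical") or follow each other directly ("horizontal").
    import torch

    notes = [  # (onset_beats, duration_beats, midi_pitch)
        (0.0, 1.0, 60), (0.0, 1.0, 64), (1.0, 1.0, 62), (1.0, 1.0, 65),
    ]

    x = torch.tensor([[p / 127.0, o, d] for (o, d, p) in notes])  # node features

    edges = []
    for i, (oi, di, _) in enumerate(notes):
        for j, (oj, dj, _) in enumerate(notes):
            if i != j and oi == oj:          # simultaneous onsets
                edges.append((i, j))
            elif i != j and oi + di == oj:   # direct succession
                edges.append((i, j))

    edge_index = torch.tensor(edges, dtype=torch.long).t()  # 2 x num_edges
    # x and edge_index follow the convention used by libraries such as
    # PyTorch Geometric and can be fed to standard GNN layers.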

Moreover, the project develops and openly distributes *research resources*, in the form of open-source datasets and software tools, for the research community. In the first reporting period, these included a note-level-aligned version of the large ASAP piano performance dataset; a first version of the Partitura software library for handling musical scores and alignments; and the general match file format for note-level alignments between performances and musical scores.
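
As a brief, hedged usage sketch of these resources (the API details below are assumed from recent Partitura versions and may differ; please consult the library documentation for exact signatures):

    # Hedged usage sketch of the Partitura library; API details are assumptions.
    import partitura as pt

    # Load a musical score (e.g. MusicXML) and inspect it as a structured note array.
    score = pt.load_score("example.musicxml")      # placeholder path
    na = score.note_array()                        # onsets, durations, pitches, ...
    print(na[["onset_beat", "duration_beat", "pitch"]][:5])

    # Load a match file: a note-level alignment between a performance and a score.
    performance, alignment = pt.load_match("example.match")
    # `alignment` is a list of dicts linking performed notes to score notes,
    # e.g. {'label': 'match', 'score_id': 'n1', 'performance_id': 'n1'}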

A detailed list of project publications, structured along these lines, can be found at https://www.jku.at/en/institute-of-computational-perception/research/projects/whither-music/publications/ .

We continually engage in dissemination activities addressing both scientific audiences (e.g. keynotes at the IJCAI-ECAI 2022 conference) and more general audiences (e.g. a public presentation ("Matinee") organised by the Heidelberg Laureate Forum Foundation). In June 2023, we started our own video channel on YouTube, which offers a weekly live video stream focusing on specific experiments, directly from our music research lab (https://www.youtube.com/@paowcpjku).

The first project period (Jan 2022 - June 2023) has already produced a substantial number of new technical results, concerning new structured representations for musical models (e.g. graph-based), new ways of obtaining musically relevant explanations from deep models, and new machine-learning-based models of music recognition, performance, and generation processes. That these results go beyond the state of the art is documented by corresponding publications in some of the top scientific venues of our fields, such as the IJCAI conferences or the Transactions of the International Society for Music Information Retrieval (TISMIR). We are confident that by continuing along these lines and taking advantage of new developments in fields like AI and machine learning, we will be able to contribute to a re-orientation of the field of Music Information Research (MIR) towards more structured and efficient modeling of music and musical processes.

(Image: Measuring and tracking hand movements during piano performance/practice)