Rule-algebraic Simple Rewriting

Periodic Reporting for period 1 - RaSiR (Rule-algebraic Simple Rewriting)

Reporting period: 2017-10-01 to 2019-09-30

One of the most successful modern approaches to the study of complex biochemical reaction systems is rule-based modeling (via the Kappa or BioNetGen frameworks). These approaches rely intrinsically upon a sophisticated concept from theoretical computer science known as rewriting theory. Abstracting molecules to so-called agents and reactions to so-called rewriting rules not only makes it possible to encode empirical information on biochemical reaction systems efficiently, but also to implement high-performance simulation algorithms as a source of “in silico” empirical data.

The central aim of the RaSiR project has been to improve upon the existing theory of rewriting systems in order to permit the development of fundamentally new approaches to algorithm design in bioinformatics. Crucially, the simulations provided by the existing rule-based modeling platforms do not by themselves provide sufficient information to understand the dynamical and functional behaviors of biological systems, since these behaviors originate in pathways and their interactions rather than in individual realizations of the systems. Despite the long history of rewriting theory, with over 40 years of development, we identified certain key aspects of the theory that had not previously been considered or understood. In particular, through a close analogy to the theory of combinatorics, re-focusing the analysis of rule-based systems upon sequential compositions of rules (rather than on sequential rewriting steps) revealed a fruitful new type of mathematical structure: sequential compositions of rules satisfy an associativity property, which makes it possible to encode the non-determinism in rule compositions within the mathematical structure of so-called rule algebras.

At a fundamental level, we addressed the question of how to extrapolate from customized formulations of rewriting theories to a universal framework that is also accessible to practitioners outside the bioinformatics communities. Developing this general framework allowed us to discover interesting novel application areas of rewriting theories, including the stochastic dynamics of social networks, random graph models and pattern counting problems in combinatorics. Our work was motivated by the possibility of achieving a deeper understanding of the origin of the functional behavior of biological systems and of their pathway dynamics, with possible applications including the discovery of potential drug targets, as well as the discovery of a variant of statistical mechanics tailor-made for the study of stochastic network models and random graphs.
At the start of the project, while a prototype of a rule algebra for the special case of graph-rewriting rules had been developed (N. Behr et al. 2016), the conceptual distance to the sophisticated versions of rewriting theories utilized in bioinformatics posed a considerable obstacle to further progress. In joint work with J. Krivine at IRIF (Université Paris Diderot), we conducted a program of fundamental research to overcome this obstacle. Taking inspiration from the experience and contributions of the core team of Kappa developers (P. Boutillier, V. Danos, J. Feret, J. Krivine), we managed to identify a concrete set of requirements for our general theory to implement. These requirements include the possibility to specify constraints both on the underlying data structures (e.g. prohibiting double bonds between two reactive sites) and on rewriting rules (controlling their applicability). For example, the moniker of the RaSiR project was chosen to reflect one of the typical scenarios, i.e. the rewriting of simple graphs (graphs without multiple edges), and many practical application scenarios of rewriting theories, such as the Kappa approach, are equipped with their own specific sets of constraints that are motivated by the properties of the underlying systems.
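As a purely illustrative toy (not the project's formalism and not Kappa's API), the following Python sketch shows how a structural constraint can control where a rewriting rule applies on a simple graph. The function names `link_matches` and `apply_link`, and the representation of a graph as a set of unordered vertex pairs, are assumptions made only for this example.

```python
from itertools import combinations


def link_matches(vertices, edges):
    """List the vertex pairs at which a toy 'add one edge' rule may be applied.

    The simple-graph constraint forbids a second edge between the same two
    vertices, so pairs that are already linked are not valid matches.
    """
    return [
        frozenset(pair)
        for pair in combinations(sorted(vertices), 2)
        if frozenset(pair) not in edges
    ]


def apply_link(edges, match):
    """Apply the rule at one match and return the new edge set."""
    if match in edges:
        raise ValueError("constraint violated: edge already present")
    return edges | {match}


# A three-vertex simple graph with a single edge a-b.
vertices = {"a", "b", "c"}
edges = {frozenset({"a", "b"})}

# Only the unlinked pairs remain as admissible matches of the rule.
matches = link_matches(vertices, edges)
new_edges = apply_link(edges, matches[0])
```

In this sketch the constraint is checked when matches are enumerated, so the rule can never produce a graph with multiple edges between the same two vertices.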
In an intricate technical development spanning the course of the entire project, we were able to meet all of the requirements for such “simple” rule-algebraic rewriting theories in the form of a universal approach based upon so-called adhesive categories. The mathematical concept of adhesivity, originally introduced in the work of Lack and Sobociński in 2003, provides a minimal prerequisite for the formulation of rewriting theories, in that it encodes properties that ensure the existence of subobject relations. Endowing this setup with the theory of constraints from traditional rewriting theory, we were able to develop the mathematical structure of rule algebras and their representation theory for these “simple” rewriting theories.

Currently, extending beyond the end of the funding period, we continue our joint work with an extended team in order to address the problem of implementing algorithms based upon the tracelet framework. We aim to extend the open-source Kappa platform for biochemical modeling, as well as the MØD platform for the analysis of organo-chemical reaction systems, with modules that make it possible to perform tracelet-based pathway dynamics analysis tasks and to statically generate pathway data directly from the specification of the reaction systems.