Genetically Evolving Models of Science

Our society depends on scientific progress, which is reliant on the development of scientific theories and models. However, the development of scientific models suffers from two related problems: the ever-growing number of experimental results and scientists’ cognitive limitations (including cognitive biases). This multidisciplinary project (psychology, computer modelling, computer science and cognitive neuroscience) addresses these problems by developing a novel methodology for generating scientific models automatically. The methodology is not specific to any particular discipline and can be applied to any science where experimental data are available.

The method treats models as computer programs and evolves a population of models using genetic programming. The extent to which the models fit the empirical data is used as a fitness function. The best models – potentially modified by cross-over and mutation – are selected for the next generation. Pilot simulations have established the validity of the methodology with simple experiments.
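The loop described above can be illustrated with a minimal sketch. This is not the GEMS software: the primitive set (`OPS`, `TERMINALS`), the tree encoding, the mean-squared-error fitness, truncation selection with one elite, and all parameter values are assumptions chosen only to make the example self-contained.

```python
import operator
import random

# Hypothetical primitive set (an assumption for illustration, not the GEMS operators).
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth=3):
    """Grow a random program: a terminal, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    return (rng.choice(list(OPS)),
            random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def run(tree, x):
    """Evaluate a program on input x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](run(left, x), run(right, x))
    return x if tree == "x" else tree

def fitness(tree, data):
    """Mean squared error between model output and data (lower is better)."""
    return sum((run(tree, x) - y) * (run(tree, x) - y) for x, y in data) / len(data)

def depth(tree):
    return 1 + max(depth(tree[1]), depth(tree[2])) if isinstance(tree, tuple) else 0

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the program."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Replace a random subtree of a with a random subtree of b."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(rng, tree):
    """Replace a random subtree with a freshly grown one."""
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, depth=2))

def evolve(data, pop_size=60, generations=30, seed=0, max_depth=6):
    rng = random.Random(seed)
    pop = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, data))
        history.append(fitness(pop[0], data))
        parents = pop[: pop_size // 4]   # truncation selection (assumed scheme)
        children = [pop[0]]              # keep the best model unchanged
        while len(children) < pop_size:
            child = crossover(rng, rng.choice(parents), rng.choice(parents))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            if depth(child) <= max_depth:  # reject oversized programs
                children.append(child)
        pop = children
    best = min(pop, key=lambda t: fitness(t, data))
    return best, history + [fitness(best, data)]

if __name__ == "__main__":
    # Toy "empirical data" generated from y = x^2 + x + 1.
    data = [(float(x), float(x * x + x + 1)) for x in range(-3, 4)]
    best, history = evolve(data)
    print(best, history[-1])
```

Because the best model of each generation is carried over unchanged, the best fitness recorded in `history` never gets worse from one generation to the next.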

To demonstrate that the methodology is sound, can be used with complex experiments and can be generalised across sciences, four related strands of research are planned. First, ‘Building New Tools’ refines the methodology and creates techniques to understand and compare the evolved models. Second, ‘Explaining Human Data’ uses the methodology to explain a wide range of data on human cognition. This will be done in two steps: (a) data without learning (attention and working memory); and (b) data with learning (categorisation, implicit learning and explicit learning). Third, ‘Explaining Animal Data’ develops models to account for various aspects of animal behaviour, focusing on conditioning and categorisation. Finally, ‘Explaining Neuroscience Data’ extends the methodology to account for data combining information about cognitive and brain processes.

In Strand 1 of the project (Building new tools: Programming of the GEMS software and related tools), good progress has been made with developing algorithms that simplify the generated models and with developing the programming environment.

Simplification is important, as it makes models both more understandable and more amenable to further manipulation by algorithms. We have written a review of the literature on simplification with genetic programming (GP) from an explainability point of view, focusing on how simplification can make GP models more explainable by reducing their sizes. In addition, we have developed two techniques for simplifying the generated models. In the first technique, a less fit parent individual is replaced with a child of better fitness, at the generation level. The second technique uses the same idea, but at the individual level.

We have created a “phased evolution” method, in which the components of the fitness function are introduced incrementally. We have also developed several post-processing techniques: dead-code removal, which traces the operation of each evolved program and records the parts of the program that are never executed; removal of time-only code, in which operators that do not affect performance are replaced by a WAIT operator; a measure of pairwise similarity between programs, which allows clustering and graphical depiction; and (semi-)automatic generation of pseudo-code, which makes models more readable.
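Dead-code removal by tracing can be sketched as follows. The program encoding (`act`, `seq`, `if` nodes), the trial dictionaries and the `noop` placeholder are hypothetical and chosen for illustration; the GEMS implementation may represent programs and remove code differently.

```python
# Placeholder that stands in for a removed, never-executed subtree (assumption).
NOOP = ("noop",)

def execute(node, env, out, visited, path=()):
    """Run a program node, appending actions to out and recording executed paths."""
    visited.add(path)
    kind = node[0]
    if kind == "act":
        out.append(node[1])
    elif kind == "seq":
        for i, child in enumerate(node[1:], start=1):
            execute(child, env, out, visited, path + (i,))
    elif kind == "if":
        _, key, then_branch, else_branch = node
        if env[key]:
            execute(then_branch, env, out, visited, path + (2,))
        else:
            execute(else_branch, env, out, visited, path + (3,))
    # A "noop" node does nothing.

def prune(node, visited, path=()):
    """Replace every subtree that was never executed with NOOP."""
    if path not in visited:
        return NOOP
    kind = node[0]
    if kind == "seq":
        return ("seq",) + tuple(
            prune(child, visited, path + (i,))
            for i, child in enumerate(node[1:], start=1))
    if kind == "if":
        return ("if", node[1],
                prune(node[2], visited, path + (2,)),
                prune(node[3], visited, path + (3,)))
    return node

def remove_dead_code(program, trials):
    """Trace the program on every trial, then prune what was never reached."""
    visited = set()
    for env in trials:
        execute(program, env, [], visited)
    return prune(program, visited)

def actions(program, env):
    """Convenience wrapper: the list of actions a program produces on one trial."""
    out = []
    execute(program, env, out, set())
    return out

if __name__ == "__main__":
    program = ("seq",
               ("act", "look"),
               ("if", "cue_left",
                ("act", "press_left"),
                ("if", "impossible", ("act", "never"), ("act", "press_right"))))
    trials = [{"cue_left": True, "impossible": False},
              {"cue_left": False, "impossible": False}]
    print(remove_dead_code(program, trials))
```

The pruned program behaves identically on the traced trials. It may behave differently on inputs that were not traced, so the set of trials should cover the conditions of the experiment being modelled.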

In Strand 2 (Explaining human data), the research has focussed on simulating attention experiments, and the GEMS software has been used to develop and interpret models in the delayed-match-to-sample, spatial cueing and selective attention tasks. Given the expertise of one of the postdoctoral fellows in value-based decision making, the GEMS methodology has also been used to develop models in this experimental domain. We have also worked on simulating experiments from the working-memory and verbal learning literatures; the latter introduces explicit learning to the GEMS environment. Finally, we have written a review of computational scientific discovery in psychology and its implications, and started working on the ethical implications of the GEMS project.

Strand 3 (Explaining animal data) began at the end of 2021, with the PhD student starting his research.

The project develops algorithms that semi-automatically generate scientific models in psychology and neuroscience, which goes beyond the state of the art. Progress has been made in developing the programming environment used to generate models, developing algorithms that simplify the generated models, and applying these tools to specific experiments in attention, working memory and decision making. New models have been generated for these experiments. As noted in the proposal, the results expected by the end of the project are to (a) develop further algorithms (e.g. to search the parameter space of the evolved models); (b) generate new models in other parts of experimental psychology (e.g. learning and concept formation); (c) generate models in animal experimental psychology; and (d) generate models for neuroscience data.

Periodic Reporting for period 2 - GEMS (Genetically Evolving Models of Science)
