
An integrated approach to understanding the impact of de novo mutations on the mammalian genome

Periodic Reporting for period 2 - DENOVOMUT (An integrated approach to understanding the impact of de novo mutations on the mammalian genome)

Reporting period: 2018-07-01 to 2019-12-31

The ERC Advanced Grant project addresses two broad issues:

1. What is the influence of new mutations on reproductive traits, other quantitative traits and genomic variation in mammals?
2. What explains variation in the amount of nucleotide diversity across the genome?

This is potentially important for society because:

1. Understanding the nature of genetic variation is important for understanding the genetic basis of human complex disease.
2. It has been argued that the fitness of human populations is at long-term risk from the accumulation of deleterious mutations.
3. The input of variation from mutations has implications for understanding the continued, long-term response to artificial selection in farm animals and crops.

The overall objectives of the project are:

1. To determine the rate at which reproductive ability and other quantitative traits (such as growth rate) change as a consequence of the accumulation of new mutations in the house mouse.
2. To determine the properties of new mutations affecting the mouse genome, including the rate at which mutations appear per generation, and the extent to which the properties of mutations vary among different strains.
3. Based on our results from mice, to predict the rate of change of traits related to reproductive ability in humans, and the impact of new mutations on the response to artificial selection in farm animals.
4. To explain why nucleotide diversity varies across the mammalian genome and to determine the relative contributions of mutations in coding and noncoding DNA to fitness change.
5. To infer the distribution of fitness effects of new mutations.

1. New genetic variation arising from spontaneous mutations using the house mouse as a model mammalian species. We have set up a highly replicated mutation accumulation (MA) experiment in the house mouse. Using four inbred strains, we started the experiment with a single mating pair of each strain, and have built up numbers each generation to produce 75 MA lines maintained by brother-sister mating. Quantitative traits, including growth and viability, are being recorded for each mouse in the experiment. We have frozen embryos as a control that will allow us to determine in future whether the mean values of quantitative traits have changed. We have sequenced the founding individuals of the four sets of MA lines using Illumina technology and are in the process of characterizing the genetic variation present in these lines.

2. Distinguishing low from high frequency variant sites in the genome. One way to detect adaptive evolution occurring in the genome is to determine whether there are sites with alleles segregating at high frequencies that are in the process of being brought to fixation by positive selection. For a collection of sites, the distribution of the number of sites across allele frequencies is known as the unfolded site frequency spectrum (uSFS). We have developed a statistical approach to estimate the uSFS, based on polymorphism data from a sample of individuals from a population, that can be used to estimate the frequency of adaptive evolution in the genome. This new method has been released to the scientific community as a software package.

3. Understanding the causes of variation in nucleotide diversity across the genome. In mammalian species, including mice and humans, the amount of genetic variation varies across the genome. In particular, there are troughs in average genetic diversity close to protein-coding genes and gene regulatory elements. Two processes are believed to be capable of generating these troughs in diversity: selection against deleterious variants (causing background selection) and selection in favour of advantageous variants (causing selective sweeps), both of which reduce variation at linked sites. We have been attempting to quantify the relative contributions of background selection and selective sweeps to average diversity dips around protein-coding genes and gene regulatory elements in the mouse genome based on the frequency distribution of alleles in the affected regions. We have concluded that selection in favour of strongly advantageous mutations has been important in shaping patterns of nucleotide diversity across the genome.

4. The distribution of fitness effects (DFE) for new mutations. The relative frequencies of mutations with different effect sizes are of fundamental importance for many questions in evolutionary biology, including the nature of variation for quantitative traits. We have been attempting to characterize properties of the DFE in the single-celled green alga Chlamydomonas reinhardtii by crossing lines carrying known complements of spontaneous mutations with their unmutated ancestral strains. Crossing causes the mutations to segregate, and thereby generates recombinant lines carrying the mutations in many different combinations, whose fitness can be measured. We have developed a new statistical approach that jointly analyses the complement of mutations carried by each line and the lines' fitnesses to infer properties of the distribution of fitness effects of individual mutations. The analysis suggests that the distribution is L-shaped, and that a surprisingly high proportion of mutations increase fitness in the laboratory environment.
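
The logic of the crossing design in item 4 can be illustrated with a minimal sketch. This is not the statistical approach developed in the project, which infers properties of the DFE as a whole. It is a simplified contrast that assumes strictly additive fitness effects, no measurement error, and a balanced set of recombinant lines. The function name `marginal_effects` and all numerical values are hypothetical and chosen only for illustration.

```python
from itertools import product
from statistics import mean


def marginal_effects(genotypes, fitnesses):
    """Estimate each mutation's effect as the mean fitness of lines that
    carry it minus the mean fitness of lines that do not.

    genotypes: list of tuples of 0/1 flags, one tuple per recombinant line.
    fitnesses: list of fitness measurements, one per line.
    Raises statistics.StatisticsError if a mutation is carried by all
    lines or by none, because one of the two groups is then empty.
    """
    n_mutations = len(genotypes[0])
    effects = []
    for j in range(n_mutations):
        carriers = [w for g, w in zip(genotypes, fitnesses) if g[j] == 1]
        others = [w for g, w in zip(genotypes, fitnesses) if g[j] == 0]
        effects.append(mean(carriers) - mean(others))
    return effects


# Hypothetical additive effects of three mutations (made-up values).
true_effects = [-0.05, 0.02, -0.01]

# Balanced design: every combination of the three mutations occurs once.
genotypes = list(product([0, 1], repeat=3))

# Noise-free additive fitness, with the unmutated ancestor set to 1.0.
fitnesses = [
    1.0 + sum(effect * flag for effect, flag in zip(true_effects, geno))
    for geno in genotypes
]

estimates = marginal_effects(genotypes, fitnesses)
print(estimates)
```

In a balanced design the other mutations are equally represented among carriers and non-carriers, so their contributions cancel and each contrast recovers the effect assigned to that mutation. Real recombinant lines are unbalanced and their fitness is measured with error, which is one reason a joint statistical analysis is needed.
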
1. We are making the first attempt to carry out a highly replicated mutation accumulation experiment with a cryopreserved control that will allow us to determine the rate of change of fitness per generation in a model mammalian species. We will also determine the rate of increase of genetic variation from mutation for quantitative traits.

2. We will sequence a large cohort of individuals from MA lines of four different strains, determine the extent of variation in the mutation rate between different inbred mouse strains, and determine the factors that influence variation in the mutation rate across the genome.

3. We will use long-range sequencing technology to infer the rate of large-scale rearrangements and transposition events in the genome.

4. Our study on the causes of diversity variation across the mouse genome attempts to estimate the relative contributions of mutations in coding and noncoding elements of the genome to fitness change.

5. We will use changes in the frequency of mutations segregating in large populations of Chlamydomonas to estimate the distribution of fitness effects of mutations under natural selection.
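
As a rough illustration of how a change in frequency carries information about fitness, the sketch below estimates a selection coefficient for a single mutation. It assumes a standard deterministic model of selection in a haploid population, in which the odds p/(1-p) of the mutant allele are multiplied by (1+s) each generation, and it ignores genetic drift and sampling error. It is not the method planned in the project. The function names, the starting frequency and the value of s are hypothetical.

```python
import math


def logit(p):
    """Log odds of an allele frequency p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def simulate_frequencies(p0, s, generations):
    """Deterministic haploid selection: the odds of the mutant allele
    are multiplied by (1 + s) in every generation."""
    frequencies = []
    for t in generations:
        odds = p0 / (1.0 - p0) * (1.0 + s) ** t
        frequencies.append(odds / (1.0 + odds))
    return frequencies


def estimate_selection_coefficient(generations, frequencies):
    """Fit a least-squares line to logit(frequency) against generation.
    Under the model above the slope equals ln(1 + s)."""
    ys = [logit(p) for p in frequencies]
    n = len(generations)
    t_bar = sum(generations) / n
    y_bar = sum(ys) / n
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(generations, ys))
    sxx = sum((t - t_bar) ** 2 for t in generations)
    slope = sxy / sxx
    return math.exp(slope) - 1.0


# Hypothetical sampling scheme and parameter values (made up).
gens = [0, 10, 20, 30, 40]
freqs = simulate_frequencies(p0=0.01, s=0.05, generations=gens)
s_hat = estimate_selection_coefficient(gens, freqs)
print(s_hat)
```

Applying an estimate of this kind to many mutations would give a set of selection coefficients whose distribution approximates the DFE, although real data would require a model of sampling noise and of linkage between mutations.
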