CORDIS - EU research results
Content archived on 2024-06-18

Developing a global understanding of the PRC and NuRD complexes in stem cell differentiation, in health and disease

Final Report Summary - 4DCELLFATE (Developing a global understanding of the PRC and NuRD complexes in stem cell differentiation, in health and disease)

Executive Summary:
How does an undifferentiated cell "decide" which type of cell it will become? We are addressing a portion of this question by elucidating the role in cell fate played by two families of epigenetic regulators: the Nucleosome Remodelling and Deacetylation (NuRD) complex and the Polycomb Repressive Complexes (PRCs). These complexes have previously been shown to be critical in cell fate decisions made by embryonic stem (ES) cells. The basic mechanism of action for these complexes is well established: they modulate the accessibility of the nucleosome by interacting directly with the histone proteins. This in turn dictates not only the level of chromatin compaction, but also the further interactions of transcription factors and other regulators with the DNA sequences. However, a highly interesting aspect of these complexes that is just now becoming clear is that multiple combinations of paralogous subunits generate considerable functional diversity among these complexes. This combinatorial assembly is further complicated by their dynamic composition. Several publications (including some from this project) have shown that this dynamic composition can determine which gene networks these complexes regulate, and the type of regulation that they induce.

Interestingly, a "reverse" type of differentiation can occur in cancer cells, in which a differentiated cell reverts to a "cancer" stem cell, as frequently occurs in leukemia. We therefore also intended to determine to what extent NuRD and PRC1/2 play a role in this reversion, and whether they could be manipulated to force such cancer stem cells to differentiate. The biochemical and structural work was also considered critical for the molecular biologists in deciding how to approach the analysis of PRC and NuRD functions and mechanisms in vivo and in cell culture. In Subproject 1 (WP1–3, Figure 1) we examined the roles of the specific subtypes of complexes, to distinguish between their potential roles in cell differentiation (using ES cells). We also studied whether distinct post-translational modifications (PTMs) modulate these decisions, for example by changing the preferences of the complexes in their protein-protein, protein-nucleosome, or protein-nucleic acid interactions. Such interactions could also be regulatory; for instance, interactions with long non-coding RNAs could regulate the complexes, and we investigated this aspect.
In Subproject 2 (WP4–6, Figure 1), we analysed the NuRD and PRC complexes by mass spectrometry to determine their exact composition. This allowed us to begin mass-producing the complexes in vitro to carry out detailed structural analyses, using X-ray crystallography, NMR, and electron microscopy. In parallel, we used mass spectrometry to determine the network of post-translational modifications (PTMs) that each component of the complexes is likely to undergo.
In Subproject 3 (WP8, Figure 1), data integration tools have been developed that are available to the consortium. Further, the different types of data generated by the project were analysed by a systems biology approach to elucidate how the expression, structure, interactions, and activity of the NuRD and PRC complexes vary with time during stem cell differentiation.
Finally, we were working on determining how alterations in PRC/NuRD complex activities contribute to disease and, importantly, whether it is possible to manipulate the function through specific small molecule inhibitors (Subproject 4, WP8–9, Figure 1). We aimed not to block the activity of all possible NuRD and PRC1/2 complex variations, but to interfere specifically with those that have been proven to be responsible for the cancer stem cell phenotype, or to modulate those important for a specific differentiation lineage. Such manipulation opens a plethora of opportunities for cancer treatments.

Project Context and Objectives:
Polycomb repressive complex & NuRD Interactome
The aim of Sub-project 1 (SP1) was to understand the complex networks in which the NuRD/PRC proteins are involved, the way they interact with the genome, and the way they control changes in gene expression as ES cells differentiate. We used proteomics experiments to understand the complexes formed by NuRD/PRC proteins (WP1), and chromatin-immunoprecipitation experiments linked to high-throughput DNA sequencing (ChIP-seq) to understand the way they interact with the genome (WP2). This analysis was then correlated with genome-wide RNA-seq experiments to understand how these complexes control gene expression (WP2). Finally, we initiated several high-throughput genetic screens to identify new components and/or regulators of these complexes (WP3). The large body of results generated during these 5 years indicates that variations in the assembly of both Polycomb and NuRD complexes are finely regulated to properly balance pluripotency and stem cell differentiation. Moreover, the chemical compounds we have identified will be of great use not only for further studies but also as potential drugs to modulate Polycomb and NuRD activities in vivo.

Polycomb repressive complex & NuRD Structural Biology
The objective of Sub-project 2 (SP2) was to generate structural knowledge of the different complexes and their components. This has been a daunting challenge due to the flexibility, size, and heterogeneity of the complexes involved. A highly collaborative and integrative approach, combining cross-linking/Mass Spectrometry (MS), Electron Microscopy (EM), Small Angle X-Ray Scattering (SAXS), X-ray crystallography, and other biochemical/biophysical approaches, has been applied to generate a comprehensive 3D understanding of how PRC/NuRD complexes assemble. Our studies in SP2 have been greatly facilitated by the very fruitful collaboration with colleagues working in SP1 (mainly WP1 and WP2) to study both PRC and NuRD proteomics and PRC- and NuRD-genome interactions using ChIP-seq experiments. The structures we have determined, and are in the process of generating, will allow us to understand how PRC and NuRD complexes function to alter chromatin structure. They are also providing the basis for the design of small molecules that can interfere with NuRD function.

Polycomb repressive complex & NuRD Data Integration
Global changes in gene expression heterogeneity are associated with cell fate potential and particular functional gene classes tend to be enriched for high or low noise levels. However, each gene’s underlying promoter architecture, local chromatin state and its broader regulatory environment are likely to be major determinants of these differences in the mode of transcription. The aim of this sub-project was to combine diverse data types generated in this project to provide an integrated view of how the PRC and NuRD complexes function to regulate cellular differentiation. The ultimate objective was to construct a predictive model that relates PRC and NuRD diversity, activity, and localisation to changes in gene expression and the differentiation process. We combined recently published single-cell RNA-seq data from mouse ESCs and cortex with matched publicly available chromatin state datasets to obtain an unbiased overview of features involved in the control of transcriptional consistency. Our results suggest that histone modifications directly influence transcriptional variability within tissues. However, an important caveat is the potential existence of cell-to-cell variability in chromatin status, which is invisible in data from the population-based ChIP-seq assays used in this analysis. This model was evaluated and refined in an iterative process of integration, prediction, and experimental validation.
Further, as part of the subproject the data generated by the consortium were made available to all consortium members and a user-friendly software tool was developed to facilitate data integration and analysis by wet-lab biologists. Generated datasets and software tools were made available to the wider scientific community.

Polycomb repressive complex & NuRD In Disease
The aim of this subproject was to better understand the human NuRD and PRC complexes, and to use the information gleaned from the rest of the project to understand and manipulate NuRD- or PRC-dependent processes in human cells. Specifically, we sought to use our knowledge of PRC complexes to identify small molecules that would impact the properties of tumour cells, and to capitalise on our knowledge of NuRD biology to improve differentiation protocols for human pluripotent cells. The work in SP4 has benefited greatly from advances made in the other subprojects. SP1 provided important information about complex biochemistry and genome-wide localisations, SP2 helped us to understand the structure of these complexes, and SP3 was important in integrating all of the data into a coherent picture. The inhibitors and protocols discovered through the work carried out in this subproject provide important groundwork which will ultimately lead to improved therapeutic applications.

Project Results:
1.1. SUBPROJECT 1: Polycomb repressive complex & NuRD Interactome

Proteomics over time
BAC-GFP constructs have been obtained for a number of NuRD and PRC complex subunits:
For Polycomb: Asxl2, BAP1, EED, Ezh2, CBX8, CBX7, Ring1, C17orf96 (new PRC2 interactor)
For NuRD: MBD3, MBD2, CDKAP1 (DOC-1), MTA1, MTA2, MTA3, Zmynd8, Znf296, Znf219 (new NuRD interactors)
Many of these constructs have been transfected into mouse embryonic stem cells and stable cell lines have been made. GFP-based single-affinity purifications combined with label-free quantitative mass spectrometry-based proteomics (AP-MS) are performed to characterize the protein complexes for all the tagged subunits, including their stoichiometry.
To study the dynamics of Polycomb and NuRD protein complexes during stem cell differentiation, we make use of established protocols to differentiate mouse ES cells into neuronal precursor cells (NPCs) and finally into astrocytes or neurons. During the differentiation process, the BAC-GFP cell lines maintain nearly endogenous expression of the tagged Polycomb or NuRD complex subunits. Nuclear extracts are generated from these different cell types and AP-MS experiments are then performed to characterize the protein complexes. These experiments are yielding very interesting results. In the previous reporting period, we presented the purification and characterization of PRC1, PRC2 and NuRD in ES cells and NPCs, which revealed that the composition of these complexes is highly dynamic during cellular differentiation. During this past year, we also purified the PR-DUB complex from ESCs and NPCs (Figure 2).
This complex also shows a few dynamic subunits/interactors during differentiation, including Mbd5, which has a higher abundance in the NPC complex. However, the core of the PR-DUB complex remains stable during cellular differentiation.

We also performed ChIP-seq analyses for PRC1, PRC2, and NuRD to determine the differences in genome-wide binding sites for these complexes in mouse ES cells and neural progenitor cells (see below).
We have also generated tagged lines for both known and novel NuRD subunits/interactors. A transgenic BAC mouse ES line was created for the novel NuRD-interacting protein Zfp219. We also have a mouse ES line containing stably integrated GFP-Zfp296, another novel NuRD interacting protein. Label-free GFP pull-downs of these cell lines confirmed their interaction with NuRD. Interestingly, Zfp296 turns out to be a shared subunit between the NuRD and Sin3/Hdac complex in mouse ES cells (Figure 3). In parallel we also profile genome-wide binding targets for NuRD in ESCs and NPCs using ChIP-sequencing (see below).

The Bartke lab (YIP3) has generated recombinant nucleosomes marked with H3K27me3. Together with the Bartke lab, we have performed pulldowns utilizing these recombinant nucleosomes together with lysates from mouse ES and NP cells.
As a positive control, we observed that the PRC2 complex binds to H3K27me3 in both the mESC and NPC pulldowns, specifically the Suz12 and Ezh2 subunits. Interestingly, subunits from many chromatin modifying complexes were repelled by the H3K27me3 modification in mESCs. These include SWI/SNF family (Smarcc1, Smarcd2), MLL family (Ash2l, Kmt2d), HBO1 (Kat7), and PRC1 (Mga).
We also observed Top2a, a DNA topoisomerase, as a protein repelled by H3K27me3 in both mESCs and NPCs. In the NPC pulldowns, we identified the lysine demethylase Kdm5b as a prominent reader of the H3K27me3 modification (Figure 4).

Integration of proteomics with genome-wide analysis and single molecule fluorescence microscopy
We have analysed Polycomb subunits in proliferating and differentiating ES cells. Among those, we have characterized the genome-wide occupancy of Cbx2, Cbx6, PHC1, and Pcgf2/Mel18, which can now be integrated with our previous analyses of Suz12, Ring1b, Pcgf4/Bmi1, Rybp, Cbx2, Cbx4, Cbx7, and Cbx8 (Figure 5). Additionally, we have performed ChIP-seq analysis for several NuRD components, including MBD3, Mta1, Mta2, and Chd4.

We have also determined that Mel18 interacts with Ring1B in early cardiac-mesoderm precursor cells (MES) and measured the expression levels of all cPRC1 and ncPRC1 subunits (Figure 6). Cbx7 and Phc1 were strongly downregulated in MES cells, with Cbx2 and Phc2 being the main Cbx and Phc family members expressed in these cells. Mel18 and Pcgf6 were the most strongly expressed Pcgf proteins in MES cells. Strikingly, while global H2AK119ub levels were constant in ESCs as compared to MES cells, H3K27me3 levels were reduced, concomitant with an accumulation of H3K27ac. Concordant with the effects of Mel18 depletion in ESCs, the levels of H2AK119ub and H3K27me3 also remained unaltered in shMel18 MES cells. Western blot analysis of several PRC1 subunits confirmed that Cbx2 and Phc2 were strongly upregulated, and that Mel18 was marginally downregulated, in MES cells (Figure 6).

These data suggested that a new cPRC1 complex containing Cbx2 and Phc2 was assembled in MES cells, and that Mel18 was required to stabilize cPRC1 but not (as in ESCs) ncPRC1.

Furthermore, we have identified C17orf96/EPOP as a new interactor of the PRC2 complex. This complex also contains Elongin B (EloB, or TCEB2) and Elongin C (EloC, or TCEB1).

The genome-wide analysis of EPOP was reported in our previous annual report. This year we have fully analyzed the EloBC dimer in ES cells. ChIP-seq of endogenous EloB identified around 1000 significant peaks with highly enriched binding of EloB in WT mES cells (Figure 7).

The vast majority (752 out of 872 genes, 86%) of the EloB targets were also EPOP targets. Strikingly, the EloB ChIP signal at the EPOP-PRC2–occupied genes was drastically reduced in cells depleted of EPOP (Figure 5), in line with the biochemical observation that EPOP is required for efficient chromatin association of EloB.
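The overlap reported above is a set intersection between two target lists. The following is a minimal sketch in Python; the gene identifiers are made up for illustration (the actual target lists are not given in this summary), and only the final arithmetic check uses the reported counts.

```python
def overlap_fraction(targets_a, targets_b):
    """Return (shared count, fraction of targets_a also present in targets_b)."""
    a, b = set(targets_a), set(targets_b)
    if not a:
        raise ValueError("targets_a must not be empty")
    shared = a & b
    return len(shared), len(shared) / len(a)


# Hypothetical gene lists, for illustration only.
elob_targets = ["GeneA", "GeneB", "GeneC", "GeneD"]
epop_targets = ["GeneB", "GeneC", "GeneD", "GeneE"]
n_shared, frac = overlap_fraction(elob_targets, epop_targets)

# Arithmetic check of the figure reported in the text: 752 of 872 genes.
reported_percent = round(100 * 752 / 872)
```

The fraction is taken relative to the first list, which matches the way the text expresses the overlap as a share of the EloB targets.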

To understand the mechanistic basis for the requirement of EPOP and Elongin BC for low expression of PRC2 targets, we tested the effects of EPOP depletion on PRC2 binding to chromatin. In contrast to the function observed for other PRC2-associated factors like JARID2, EPOP is not required for PRC2 binding to PcG targets. Rather, we observed an increase in ChIP signal for both SUZ12 and the H3K27me3 mark on PRC2 targets after EPOP depletion (Figure 8).

Because EPOP and JARID2 do not associate simultaneously with PRC2 yet occupy largely overlapping target sites on the genome (Figure 9), we also determined the ChIP signal of JARID2 after depletion of EPOP.

We also reported that Elongin BC regulates PRC2 binding to chromatin. Depleting Elongin BC by siRNA for 48 hr increased SUZ12 levels by more than 1.5-fold at 4542 targets, largely phenocopying the effect of EPOP depletion (Figure 10). This shows that EPOP and Elongin BC impede excessive PRC2 binding and thereby shape the transcriptional output.
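A fold-change criterion of this kind can be sketched as a simple filter over per-target ChIP signal. The example below is illustrative only: the signal values, the target names, and the pseudocount used to avoid division by zero are assumptions, not details of the analysis pipeline used in the project.

```python
def increased_targets(control, treated, min_fold=1.5, pseudocount=1.0):
    """Return targets whose treated/control signal ratio exceeds min_fold.

    control and treated map target name -> normalised ChIP signal.
    The pseudocount is an assumption added here to stabilise low-signal ratios.
    """
    hits = []
    for target, ctrl_signal in control.items():
        ratio = (treated[target] + pseudocount) / (ctrl_signal + pseudocount)
        if ratio > min_fold:
            hits.append(target)
    return sorted(hits)


# Hypothetical SUZ12 signal before and after depletion.
control = {"t1": 10.0, "t2": 20.0, "t3": 5.0}
treated = {"t1": 30.0, "t2": 22.0, "t3": 9.0}
hits = increased_targets(control, treated)
```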

On the other hand, depletion of Elongin BC resulted in roughly equal numbers of genes up- or downregulated, indicating that the Elongin BC heterodimer has diverse roles in regulating transcriptional output.

Combination of expression and ChIP-seq results revealed that these genes exhibited a strongly decreased ChIP signal of EloB after EPOP depletion, whereas the 92 genes with an increased expression were independent of EloB (Figure 11). This indicates that the function of EPOP in bringing Elongin BC to PcG targets is indeed linked to the transcriptional status of these genes.

Single molecule fluorescence microscopy
Our original aim was to develop two-colour single-molecule co-localisation imaging for the study of protein complexes. However, during the course of this project we have realised that, because many proteins co-localise to chromatin at sites of transcription, co-localisation does not necessarily help us understand the assembly and function of chromatin complexes. We have therefore found it more informative to carry out live-cell tracking of single proteins in the presence/absence of other components in order to understand the role of protein interactions within a complex. In this project, we have thus developed both fixed and live cell 2D and 3D single-molecule imaging (PALM and STORM) in mouse ES cells. This has allowed us to study either: 1) co-localisation and clustering of proteins within the nucleus; or 2) the role of protein interactions within a complex by tracking single proteins in the presence/absence of other components.

Development of fixed and live cell 2D and 3D single-molecule imaging (PALM and STORM)
For 2D single-molecule imaging, UCAM (Laue) in collaboration with Prof David Klenerman (Cambridge, UK) has built an nTIRF microscope. It incorporates 4 laser lines (405, 488, 561, 640 nm) and a dual view camera for multiple fluorophore imaging in either PALM (using mEos3.2) or STORM mode (using AF647). We have imaged a range of nuclear proteins such as the centromeric histone H3 variant CENP-A, the heterochromatin proteins HP1β and HP1γ, a nuclear periphery marker (lamin B2), and NuRD components (MBD3 and CHD4) (Figure 12). To determine co-localisation in ES cells on a 2D microscope, it is critical to ensure that localised molecules are within the detection plane (i.e. localised together) and not above or below this plane. We therefore developed and published an optical sectioning technique. To study single ES cells, we also developed and published a novel microfluidic device capable of trapping cells for super-resolution imaging before releasing them for further processing. To achieve 3D super-resolution imaging, the Klenerman group built a microscope with double-helix point spread function (DH-PSF) detection and adapted it for mammalian imaging (Carr et al, submitted). We have imaged single mEos3.2 and HaloTag®-JF549 dye-labelled CENP-A and CHD4 proteins with high x, y and z resolution over a 4 µm range in z (Figure 13). We have used these microscopes to study the following:

Clustering of NuRD proteins within the nucleus
Taking advantage of the expertise of UCAM (Hendrich), knock-in cell lines were generated in which MBD3 and CHD4 were tagged at the C-terminus with mEos3 and the HaloTag®. Super-resolution imaging of the NuRD components MBD3 and CHD4 was demonstrated and published (Figure 14). CHD4 and MBD3 molecules form distinct foci in live ES cells suggesting that the NuRD complex forms locally enriched clusters. Analysis of these protein clusters has revealed that CHD4 binds to a considerably higher number of chromatin sites in the nucleus than MBD3, consistent with reports that it has significant non-NuRD functions (Stevens et al, in press).

Single-particle tracking in the presence/absence of other components to understand the role of protein interactions within a complex
If NuRD components co-localise/interact, the dynamics of specific proteins within the complex will be affected by the absence of other components. We investigated whether CHD4 dynamics are affected by the presence/absence of MBD3 using the knock-in cell lines described above and published this work (Figure 12). We measured CHD4 dynamics using PALM of mEos3.2-tagged CHD4 at 10 ms time resolution in both wild-type and MBD3-null cells. CHD4 exhibits two major diffusion coefficients (0.100 ± 0.008 and 0.65 ± 0.04 µm² s⁻¹). In MBD3-null cells, CHD4 still exhibits the slowly diffusing fraction (0.100 ± 0.008 µm² s⁻¹), but the fast fraction now moves significantly more quickly (0.80 ± 0.08 µm² s⁻¹; p < 0.004). Since our published paper, we have carried out a similar analysis using our 3D microscope. Longer trajectories were achieved by combining the reduced photobleaching of HaloTag® dye-labelled CHD4 (compared to mEos3.2) and 3D tracking to prevent molecules diffusing out of the detection plane (Figure 13) (Carr et al, submitted). This study not only confirmed our previous 2D tracking results but also allowed mean squared displacement analysis of the trajectories. The slowly diffusing fraction exhibits constrained diffusion over time and so appears to be stably bound to chromatin, whereas the fast fraction (Figure 13D) exhibits Brownian diffusion. Overall, the results suggest that a fraction of CHD4 associates with chromatin in either the presence or absence of MBD3, whereas the freely diffusing fraction of CHD4 molecules is significantly affected by the loss of MBD3, consistent with a lack of interaction with the NuRD complex. This demonstrates the feasibility of studying interactions within the NuRD complex using a combination of PALM and knockout experiments.
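Mean squared displacement (MSD) analysis of the kind mentioned above can be sketched as follows. This is a generic textbook calculation, not the project's analysis code: it assumes free 2D Brownian motion, for which MSD(τ) = 4Dτ, and fits a line through the origin over the first few lag times. The function names, the simulated track, and the choice of four lags are illustrative assumptions; the simulation simply borrows the reported fast-fraction value and 10 ms frame time as inputs.

```python
import math
import random


def mean_squared_displacement(track, lag):
    """Time-averaged MSD of one track (list of (x, y) in µm) at a lag in frames."""
    n = len(track) - lag
    if lag < 1 or n < 1:
        raise ValueError("lag must be >= 1 and shorter than the track")
    total = 0.0
    for i in range(n):
        dx = track[i + lag][0] - track[i][0]
        dy = track[i + lag][1] - track[i][1]
        total += dx * dx + dy * dy
    return total / n


def diffusion_coefficient_2d(track, frame_time_s, max_lag=4):
    """Least-squares slope through the origin of MSD versus lag time, divided by 4."""
    lags = range(1, max_lag + 1)
    taus = [lag * frame_time_s for lag in lags]
    msds = [mean_squared_displacement(track, lag) for lag in lags]
    slope = sum(t * m for t, m in zip(taus, msds)) / sum(t * t for t in taus)
    return slope / 4.0


def simulate_brownian_track(d_um2_per_s, frame_time_s, n_steps, seed=0):
    """Simulated 2D random walk; per-axis step s.d. is sqrt(2 * D * dt)."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * d_um2_per_s * frame_time_s)
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0.0, sigma)
        y += rng.gauss(0.0, sigma)
        track.append((x, y))
    return track


# Simulated fast-fraction track: D = 0.65 µm² s⁻¹, 10 ms frames.
track = simulate_brownian_track(0.65, 0.010, 5000)
estimated_d = diffusion_coefficient_2d(track, 0.010)
```

For a chromatin-bound (constrained) molecule the MSD curve flattens at longer lags rather than growing linearly, so a straight-line fit is only appropriate for the freely diffusing fraction.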

In collaboration with Prof David Klenerman (Cambridge, UK) and Anthony Carr (Sussex, UK), we have also used and published a live-cell motion blurring approach that can detect proteins bound to chromatin in yeast. This approach uses long exposures to blur proteins that are diffusing so as to only detect chromatin bound proteins. We are now applying this to the detection of NuRD proteins in mouse ES cells to ask whether NuRD is required for stable binding of CHD4 (residence time changes) or simply for CHD4 association with chromatin (the number of non-specific binding events before a stable interaction will change) in a similar manner to which Sox2 has been shown to facilitate Oct4-DNA interactions.

Figure 12. 2D single molecule tracking of single mEos3-tagged CHD4 molecules in live mouse ES cells. Representative images of the same cell are shown (left) using low power 488 nm excitation (green form of mEos3) and 405/561nm excitation (photo-activated red form of mEos3). A small number of the individual tracks from this cell are also shown indicating the fast and slow diffusing fractions of CHD4. The exact diffusion coefficients extracted from the data and the differences observed in the diffusion of a sub-population of CHD4 molecules in the presence (WT) and absence of MBD3 (MBD3-null) are shown in a box-and-whisker plot (right) for the two cell lines.

Fixed cell two-colour single-molecule co-localisation imaging approaches. We have also developed multicolour super-resolution methods that allow us to co-localise different proteins and RNA molecules within one cell. A proof-of-principle experiment was carried out in S. pombe cells to demonstrate the feasibility of two-colour imaging. We labelled one protein with mEos2 and another with the Atto655 dye using a HaloTag® fusion and demonstrated co-localisation of mEos2-tagged CENP-A and Mis6-HaloTag®-Atto655. We have also carried out two-colour co-localisation using mEos3.2-tagged proteins in combination with Alexa Fluor 647-tagged antibodies. Finally, we have explored labelling proteins and RNA molecules using different activator-emitter dye combinations, in which illumination of an activator dye is used to stochastically activate an emitter dye.

High-throughput genetic screening for components involved in PRC and NuRD- dependent transcription

To identify new proteins that regulate PRC-mediated transcription, we have collaborated with Jeroen Krijgsveld’s functional proteomics team at the EMBL in Heidelberg, using a novel mass spectrometry-based approach called “SICAP”, with the aim of unravelling the in vivo chromatin environment and composition of PRC2-bound CpG islands (CGIs) in mouse Embryonic Stem (ES) cells and differentiated cells.
The SICAP method combines cross-linked Chromatin Immuno-Precipitation (ChIP) with protein analysis by mass spectrometry. We combined SICAP analysis of Suz12 targets in mouse ES cells with Stable Isotope Labeling by Amino acids in Cell culture (SILAC). This allowed us to quantitatively compare SICAP analyses in Suz12 Wild-Type versus Suz12 Knock-Out mouse ES cells and, in this way, to specifically enrich for PRC2-associated proteins over ChIP background. With this approach we identified all PRC2 core components and known PRC2-associated factors among the most enriched and abundant proteins. In addition, the experiment identified a recently described PRC2-associated factor (Gm340), whose function remains unknown, and another uncharacterized protein, NP45104.
We are currently developing molecular tools to investigate the putative role of these two factors in PcG function in mouse ES cells: antibodies and CRISPR-mediated knockout cell lines are under development for both Gm340 and NP45104. This will allow us to investigate whether they have a role in regulating PRC recruitment and catalytic activity.
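The wild-type versus knockout comparison described above amounts to ranking proteins by the ratio of their signal in the two SILAC channels. The sketch below illustrates that idea only; the protein names, intensities, the log2 threshold of 1.0, and the decision to skip proteins lacking signal in either channel are all assumptions made for the example, not parameters of the SICAP analysis.

```python
import math


def silac_enriched(intensities, min_log2_ratio=1.0):
    """Return {protein: log2(WT/KO)} for proteins at or above the threshold.

    intensities maps protein -> (wild-type channel, knockout channel).
    """
    enriched = {}
    for protein, (wt, ko) in intensities.items():
        if wt <= 0 or ko <= 0:
            # Simplification: the ratio is undefined without signal in both channels.
            continue
        log2_ratio = math.log2(wt / ko)
        if log2_ratio >= min_log2_ratio:
            enriched[protein] = log2_ratio
    return enriched


# Hypothetical intensities, for illustration only.
example = {
    "core_subunit": (800.0, 100.0),
    "associated_factor": (400.0, 100.0),
    "background_protein": (110.0, 100.0),
}
result = silac_enriched(example)
```

In practice a protein that is completely absent from the knockout channel is a strong candidate rather than one to discard, so a real analysis would need an explicit rule for such cases.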

1.2. SUBPROJECT 2: Polycomb repressive complex & NuRD Structural Biology

3.2.1 Structures of NuRD components and complexes

Structure of CHD4 using SAXS and chemical cross-linking/MS
We expressed a large number of constructs of the CHD4 component of NuRD (containing different combinations of domains) in insect cells, and purified these proteins. We used SAXS, nucleosome binding, ATPase and remodelling assays, limited proteolysis, and cross-linking/MS, to generate a three-dimensional structural model describing the overall shape and domain interactions of CHD4.
Structure of RbAp48/Histone H3-H4 complex using SAXS and chemical cross-linking/MS
SAXS and chemical cross-linking/MS were also employed to generate a three-dimensional structural model of RbAp48 in complex with Histones H3-H4 (Figure 15). The identified cross-links were used as distance restraints for rigid body docking of the 3D coordinates of the individual subunits. Each potential model of the complex was ranked based on its fit to the experimentally derived SAXS profiles of the complexes and components. We refined the method by calculating the centres of mass of the SAXS envelopes generated for each component and generating additional distance restraints between these points for use in rigid body docking. We carried out non-denaturing MS studies with the Sobott group, and FRET experiments to validate our model of the RbAp48+H3+H4+ASF1+HAT1 complex.
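The use of cross-links as distance restraints can be illustrated with a small scoring function that counts how many cross-linked residue pairs lie within a maximum distance in each candidate model. This is a sketch under stated assumptions: the 30 Å cut-off, the residue labels, and the coordinates are hypothetical, and the SAXS-based ranking described in the text is not reproduced here.

```python
import math


def satisfied_fraction(coords, crosslinks, max_distance=30.0):
    """Fraction of cross-linked residue pairs closer than max_distance (in Å).

    coords maps residue label -> (x, y, z); crosslinks is a list of label pairs.
    """
    ok = sum(
        1 for a, b in crosslinks
        if math.dist(coords[a], coords[b]) <= max_distance
    )
    return ok / len(crosslinks)


def rank_models(models, crosslinks, max_distance=30.0):
    """Sort model names from most to fewest satisfied restraints."""
    return sorted(
        models,
        key=lambda name: satisfied_fraction(models[name], crosslinks, max_distance),
        reverse=True,
    )


# Hypothetical docking poses and cross-links.
models = {
    "pose_1": {"A10": (0, 0, 0), "B55": (10, 0, 0), "B90": (0, 50, 0)},
    "pose_2": {"A10": (0, 0, 0), "B55": (10, 0, 0), "B90": (0, 20, 0)},
}
crosslinks = [("A10", "B55"), ("A10", "B90")]
ranking = rank_models(models, crosslinks)
```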

Structural studies of the RbAp48/MTA-1 complex
In collaboration with Joel Mackay (Univ. of Sydney) we identified a C-terminal motif in MTA1 (residues 656–686), which is sufficient to bind full-length RbAp48, and we subsequently solved the three-dimensional structures of three different complexes of RbAp48 bound to this region of MTA1 (Figure 2). Competition experiments using immobilized GST-H4(1-48) to pull down RbAp48 in the absence and presence of full-length MTA1 confirmed that MTA1 and H4 recognize the same site on RbAp48. These data support the conclusion that histone H4 and MTA1 contact overlapping surfaces on RbAp48 and indicate that their assembly into the NuRD complex modulates RbAp46/48 interactions with histones (Alqarni et al., 2014). Subsequently we showed that MTA1 contains a second, similar RbAp46/48 binding site near the centre of the protein, supporting the finding from proteomics experiments that each MTA1 subunit binds two RbAp46/48 molecules (Figure 16).

Models of intact NuRD complexes
We have pursued two strategies to obtain structures of intact NuRD complexes. First, we analysed endogenous NuRD complexes affinity purified via GFP-tagged MBP-subunits from human and Drosophila cells. Second, by systematically producing recombinant combinations of the different subunits, we identified different NuRD sub-complexes and were able to reconstitute a close to complete Drosophila NuRD complex. The assembly pathway established here for intact NuRD complexes is likely to be conserved in humans and can thus be applied to generate homogeneous human NuRD complexes with a defined subunit composition.
The endogenous Drosophila NuRD complex comprised the subunits p55, MTA-like, MBD-like and Rpd3 (PMMR), and contained only sub-stoichiometric amounts of Mi-2, Simjang and CG18292. Thus, PMMR is the core Drosophila NuRD complex. Proteomics analysis carried out in collaboration with the Vermeulen group suggested that the stoichiometry of the core NuRD (PMMR) complex is 4:2:1:2 with a single MBD-like subunit, which we confirmed using Multi-Angle Light Scattering of the recombinant complex. When we produced the recombinant NuRD complexes in insect cells, we found that stable sub-complexes comprising p55 and MTA-like (PM), MTA-like and Rpd3 (MR), as well as PMR exist. MBD-like appears to be added last to produce PMMR, in a process that likely mirrors the in-vivo assembly pathway. We were able to show that PMR, PMMR and endogenous NuRD all exist in both the cytoplasm and nucleus and have very similar deacetylase activities and specificity. This has important implications for understanding NuRD complex function.
We calculated reference-free initial models for all three sub-complexes, which we subsequently refined (Figure 17). The PMMR and PMR 3D reconstructions resemble the endogenous Drosophila NuRD core volume in size and overall shape. PMR shows a structure that is clearly divided in two halves. When the two halves are superimposed, they show a correlation of 95% (as determined by the software Chimera). This is consistent with PMR being composed of two copies each of MTA-like and Rpd3, together with four copies of p55. The PM reconstruction can be fitted in either half of the PMR reconstruction with a correlation coefficient of 90%. The remaining density is attributed to Rpd3.

The recombinant PMMR was further analyzed by cryo-EM. Initial analysis of the dataset revealed 2D class averages showing high-resolution features/secondary structure elements that clearly relate to the crystal structure of the MTA-HDAC/Rpd3 dimer (Figure 18) solved by the Schwabe group. However, many other 2D class averages showed only low-resolution features indicating that the dataset is still heterogeneous. We are currently attempting to overcome the heterogeneity of the NuRD complexes by biochemical stabilization strategies.

The other subunits of the complex – CHD4/Mi-2, GATA/Simjang and DOC1/CG-18292 – appear to loosely interact with the core PMMR complex. A sub-complex comprising CHD4 and GATA can be formed, defining two distinct functional entities – CG and core PMMR. We managed to reconstitute a close to complete NuRD complex by first binding GATA and then CHD4 to the PMMR sub-complex. These complexes are currently being analysed in detail by negative stain EM.

Studies of NuRD complex-Nucleosome interactions
We have reconstituted mono- and di-nucleosomes with 601 DNA and DNAs from endogenous NuRD binding sites identified in ChIP-seq experiments by the Hendrich group. We performed electrophoretic mobility shift assays (EMSA) to verify the interaction of NuRD with nucleosomes. Mono-nucleosomes interact with CHD4, as shown by the shift of the bands to a slower migration, but do not interact with PMR and PMMR (Figure 19A). We further investigated the binding preferences of CHD4 for different nucleosomes using pull-down assays (Figure 19B). We repeatedly see that CHD4 binds better to di-nucleosomes with a larger gap, such as the Ppp2r2c enhancer sequence, than it does to conventional nucleosomes reconstituted onto a Widom 601 DNA sequence.

Currently we are purifying di-nucleosomes from cultured cells to perform an in vitro pull-down assay with assembled holo-NuRD complex, with the aim of identifying which specific genes NuRD binds to with the highest affinity. We then plan to reconstitute recombinant nucleosomes and make NuRD-Nucleosome complexes for both cryo-EM structural studies and single-molecule FRET assays to study remodelling function.

Screening NuRD complexes for small molecule inhibitors
The objective of this part of the study was to assess the feasibility of identifying chemical lead molecules able to bind and perturb the function of isolated intact NuRD complexes.
We focussed on screening the core PMMR NuRD complex (which contains the histone deacetylase) and components of the NuRD complex that our structural studies had suggested might provide suitable druggable binding sites, in particular CHD4 (the chromatin remodeller) and RbAp48 (a histone chaperone). Two different CHD4 constructs (comprising the his-PP-CC and his-PP-CC-AH-D domains), and three different variants of RbAp48 (wild-type, histone H3-binding site mutant, and histone H4-binding site mutant) were produced. The Chung group at GSK used their Encoded Library Tag (ELT) technology and libraries to screen all these constructs for lead molecules. We have identified hit molecules that bind in repeated experiments, and our aim is now to further study selected hits using either fluorescence polarisation or HTRF/TR-FRET assays to confirm binding.
For CHD4 we have found that the affinity of the PHD domains for an N-terminal histone H3 peptide (1-20) increases in the presence of ATP (Watson et al., 2012). One possible explanation is that upon the binding of ATP to the ATP-binding site in the distal AH domain, conformational changes release the tandem PHD and chromo-domains, resulting in a higher binding affinity for H3. We therefore intend to screen the his-PP-CC-AH-D construct in the presence of a non-hydrolysable nucleotide (AMP-PNP) and/or the histone H3 binding peptide to favour identification of lead molecules binding to different conformational states of the protein.
Likewise, the RbAp48 histone chaperone has been crystallized both alone and with different binding peptides (FOG1, histones H3 and H4, MTA1) that result in conformational changes to the protein's overall structure. We noticed conformational changes in RbAp48 in which a flexible 25-residue loop in proximity to the N-terminal alpha helix becomes structured on binding histone H3. This suggests that ELT screening of RbAp48 in complex with the H3 binding peptide might in the future help to identify novel molecules.

3.2.2 Structures of Polycomb group complex components and Polycomb group complexes

Model of the Drosophila Polycomb Repressing Complex (PRC)
The most complete structural description has been achieved for the Drosophila melanogaster Pho-repressive complex (PhoRC), where joint efforts of the Jürg Müller (MPI Munich) and the Christoph Müller (EMBL Heidelberg) groups have led to a model of how PhoRC serves as a bridging complex between Drosophila Polycomb Response Elements (PREs) and PRC1 (Figure 20).

Figure 20 (adapted from Frey et al.): Molecular model of the interactions through which PhoRC tethers canonical PRC1 to PREs. The Pho spacer region (purple) binds to the 4MBT domain of Sfmbt (orange) to form PhoRC (PDB ID 4C5I). The Sfmbt SAM domain binds to the ML surface of the Scm SAM domain (PDB ID 5J8Y). The EH surface of Scm SAM in turn binds to the ML surface of the Ph SAM domain (PDB ID 1PK1). Other previously determined structures, such as the ZnF domain of the Pho orthologue YY1 in complex with DNA (PDB ID 1UBD) and the 2MBT domain of Scm (PDB ID 2R57), are also shown.

Model of the PRC2 complex
For PRC2, efforts of the Müller group have been focusing on the co-expression in insect cells and crystallization of a ternary human subcomplex comprising EZH2, the C-terminal moiety of SUZ12 and EED with different regulatory peptides. Crystals of this ternary complex were obtained that currently diffract to about 4.0 Å (Figure 21).

In spring 2016 two competing groups published crystal structures of a ternary PRC2 complex containing very similar constructs of EZH2, the C-terminal moiety of SUZ12 and EED as identified by our group. The crystal structure of the ternary PRC2 complex is depicted in Figure 22. The ternary structure can also be fitted into a negative-stain electron microscopy reconstruction of a pentameric PRC2 complex that comprises EZH2, SUZ12, EED, RbAp48 and AEBP2. Pentameric PRC2 complexes were analyzed by native MS (Frank Sobott). Both monomeric (303,351.4 Da) and dimeric PRC2 are detected side by side. Collisional activation inside the mass spectrometer is used to confirm this assignment, with RbAp48 dissociating first, indicating that it binds least tightly to the other subunits.
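The mass assignment in a native mass spectrum rests on a simple relation between the neutral mass of a complex and the m/z values of its charge-state series. The following is a minimal sketch of that arithmetic, assuming positive-mode ions charged by proton attachment. The charge states used (40+ and 41+) are hypothetical and chosen only for illustration, and the function names are not taken from any software used in the project.

```python
PROTON_MASS = 1.007276  # Da, mass of one charge-carrying proton


def mz(neutral_mass, charge):
    """m/z of an ion with the given neutral mass (Da) and number of protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_from_adjacent_peaks(mz_z, mz_z_plus_1):
    """Charge z of the peak at mz_z, given its neighbour (charge z + 1)
    at the lower m/z value mz_z_plus_1 in the same charge-state series."""
    return round((mz_z_plus_1 - PROTON_MASS) / (mz_z - mz_z_plus_1))


def neutral_mass_from_peak(mz_value, charge):
    """Recover the neutral mass (Da) from one peak of known charge."""
    return charge * (mz_value - PROTON_MASS)


# Monomer mass quoted in the text; the charge states are hypothetical.
monomer_mass = 303351.4
peak_a = mz(monomer_mass, 40)
peak_b = mz(monomer_mass, 41)

z = charge_from_adjacent_peaks(peak_a, peak_b)
print(z, round(neutral_mass_from_peak(peak_a, z), 1))
```

Note that, in this idealised arithmetic, a dimer carrying exactly twice the charge of the monomer falls at the same m/z as the monomer, so m/z values alone may not always separate the two series.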

In parallel, the Müller group is also reconstituting PRC2 complexes bound to modified nucleosomes. Native MS in the Sobott group confirms that mono- and dimeric PRC2 pentameric complexes can bind to intact nucleosomes (1:1 complex: ca. 828 kDa). To preserve the integrity of the PRC2-mononucleosome complex and increase its stability, the complex was cross-linked using the GraFix protocol. Subsequently, a single particle tomography dataset was collected on a Polara microscope and 2D classification was performed (Figure 23, left). The selected 2D class averages were used for 3D model classification. The generated models appeared to have larger dimensions than PRC2 or the nucleosome alone (Figure 23, right), indicative of PRC2-mononucleosome complex formation. These models form the basis for the current efforts of the Müller group to obtain a high-resolution structure of the PRC2-mononucleosome complex using cryo-EM.

In addition, the Jürg Müller group (MPI Munich) determined the crystal structure of a fragment of the PRC2 associated factor Polycomblike (Pcl) from Drosophila (Figure 24). Pcl, and its mammalian orthologues Pcl1/PCL1, Pcl2/PCL2 and Pcl3/PCL3 associate with PRC2 and are required for PRC2 anchoring at Polycomb target genes in the genome. The Jürg Müller lab showed that this part of the protein binds to DNA in a sequence non-specific manner with mid-micromolar affinity. Structure-guided mutations were then used to investigate how point mutations of residues that are critical for DNA binding by Pcl affect the nucleosome-binding activity and histone methyltransferase activity of Pcl-PRC2 assemblies in vitro. These studies established that the DNA-binding activity by Pcl/PHF1 protein plays a critical role for efficient H3K27 tri-methylation.

Model of the PR-DUB complex
The Müller group has continued to attempt to obtain a crystal structure of PR-DUB. Construct design was informed by biochemical characterisation of human PR-DUB constructs and a 3D structural model of human PR-DUB bound to ubiquitin based on the structure of Bap1-related protein UCH-L5 in complex with activating protein Rpn13 and with ubiquitin bound (Figure 25).
PR-DUB/nucleosome complexes have been visualized by negative stain EM. However, problems in processing the data arose due to the presence of nucleosomes without PR-DUB bound in the sample. An improved protocol has been developed for the purification of PR-DUB in complex with ubiquitinated nucleosomes to produce a homogeneous solution of two PR-DUB molecules bound to a ubiquitinated nucleosome (Figure 25).

Native MS and ion mobility of intact nucleosomes
Native MS was used to assess the homogeneity of reconstituted nucleosomes (from the Bartke and Christoph Müller groups), and to characterize their folding state using ion mobility. A typical mass spectrum of nucleosomes (222 kDa; measured at 10 µM protein in 100 mM aqueous ammonium acetate buffer) shows complete assembly with a small excess of free 113 kDa DNA visible. The approach is suited to monitoring the assembly of nucleosomes, including partly sequence-modified nucleosomes (with defined PTM state or PTM mimetics), and provides a platform for rapid screening of possible interaction partners (Kd range low µM to nM).

Preparative mass spectrometry
Native MS is able to separate low-abundance ion species (e.g. partly assembled or heterogeneous complexes) in a narrow m/z window (Figure 26). When MS is combined with ion mobility separation, it also provides separation based on shape, resolving conformational or topological heterogeneity. Separated species can then be deposited onto a surface by preparative soft-landing, without affecting their composition. The Sobott group have coupled soft-landing MS with negative-stain EM, confirming the structural integrity of protein assemblies after deposition onto a target. This indicates the potential of the approach for high-resolution structure determination of heterogeneous assemblies. It has now also been shown that shape/size separation is possible using a fast-switching timed gate immediately after the ion mobility cell, by modifying a commercial instrument.
A global size difference of as little as 1.5-2% (by collision cross section) can be detected. We tested, amongst others, shape/size selection of highly dynamic SMC dimers (Structural Maintenance of Chromosomes protein, B. subtilis). Corresponding EM and AFM data were collected in collaboration (Grenoble/Schaffitzel group, Bristol and Madrid) to define the corresponding structural variation of the protein. Coupling preparative MS with downstream cryo-EM analysis will offer an elegant solution to unravelling complex biological heterogeneity.

1.3. SUBPROJECT 3: Polycomb repressive complex & NuRD Data Integration

3.3.1 Integrated overview of PRC and NuRD complex function during cellular differentiation.

Using a compendium of genome-wide expression data from differentiating mouse embryonic stem (ES) cells under various experimental perturbations (from P1a, Di Croce lab), we expanded this goal to encompass a global view of protein complex diversity and temporal dynamics. Results from our approach, which uses a staging technique based on principal component analysis (PCA) to efficiently merge expression data from heterogeneous genetic backgrounds, (i) confirmed the peculiar dynamics of PRC1 subunits during development and (ii) yielded a list of candidate complexes with unusually high levels of subunit switching for potential follow-up investigation (Figure 27).
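The idea of PCA-based staging can be illustrated with a small sketch: samples from different genetic backgrounds are placed on a common axis by projecting them onto the first principal component of the expression matrix and ordering them by that score. The code below is an illustration only and not the consortium's pipeline; the sample names, the three-gene toy matrix, the helper functions and the use of power iteration are all assumptions made for this example.

```python
import math


def center_columns(matrix):
    """Subtract each gene's mean across samples (rows = samples, columns = genes)."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[value - m for value, m in zip(row, means)] for row in matrix]


def first_pc_scores(matrix, iterations=200):
    """Score every sample on the first principal component (power iteration)."""
    x = center_columns(matrix)
    n_genes = len(x[0])
    # Deterministic start: the centred sample with the largest norm.
    v = list(max(x, key=lambda row: sum(a * a for a in row)))
    for _ in range(iterations):
        scores = [sum(a * b for a, b in zip(row, v)) for row in x]  # X v
        w = [sum(s * row[j] for s, row in zip(scores, x))           # X^T (X v)
             for j in range(n_genes)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm == 0.0:
            raise ValueError("expression matrix has no variance")
        v = [a / norm for a in w]
    return [sum(a * b for a, b in zip(row, v)) for row in x]


def stage_samples(names, matrix, earliest):
    """Order samples along PC1, oriented so that `earliest` lies at the start."""
    scores = first_pc_scores(matrix)
    if scores[names.index(earliest)] > 0:
        scores = [-s for s in scores]
    return [name for _, name in sorted(zip(scores, names))]


# Hypothetical toy data: columns are a pluripotency gene, a lineage gene
# and a housekeeping gene; "wt" and "kd" stand for two genetic backgrounds.
names = ["wt_d0", "wt_d2", "wt_d4", "kd_d0", "kd_d4"]
matrix = [
    [10, 1, 5],
    [6, 4, 5],
    [2, 9, 5],
    [9, 2, 5],
    [5, 5, 5],
]
order = stage_samples(names, matrix, earliest="wt_d0")
print(order)
```

In this toy case the day-4 sample of the hypothetical "kd" background is staged between day 2 and day 4 of the "wt" background, which is the kind of cross-background alignment a staging axis is meant to provide.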

To identify and understand the influences on expression variation in mouse embryonic stem (mES) cells we integrated diverse sequence and functional genomic datasets. In our analysis we combined previously published RNA-seq data from single mouse embryonic stem cells (ESCs) with definitions of promoter types, chromatin states and chromatin domains from the matched cell type.

Our aim was to determine whether sets of genes with similar properties tend to have unusually high or low expression noise at the transcriptional level. This revealed that the core promoter architecture of a gene is an important influence on the stability of expression across individual cells, with CpG island promoters and broad transcription initiation associated with low noise and a TATA box and sharp initiation associated with increased noise. However, the expression variability of each gene is also strongly linked to its chromatin architecture. In general, genes with high levels of activation-associated modifications have low expression variability across cells, with H3K36me3 in particular associated with low noise. Conversely, the presence of repressive modifications on expressed genes is associated with increased noise, suggesting a model where “conflicting” chromatin architectures – the absence of active modifications or the presence of repressive ones – increase noise (Figure 28).
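As a rough illustration of how expression noise can be compared between gene sets, the sketch below computes the squared coefficient of variation (CV²) of each gene across single cells and summarises it per promoter class. CV² is used here as an assumed, commonly used noise measure; the report does not state which statistic was applied, and a real analysis would typically also correct for the dependence of noise on mean expression. All gene names and counts are invented.

```python
from statistics import mean, median, pstdev


def cv_squared(counts):
    """Squared coefficient of variation of one gene across single cells."""
    m = mean(counts)
    if m == 0:
        return float("nan")  # noise is undefined for a gene that is never detected
    return (pstdev(counts) / m) ** 2


def median_noise_by_class(genes):
    """Median CV^2 per promoter class.

    genes maps a gene name to (promoter_class, per-cell expression values).
    """
    by_class = {}
    for promoter_class, counts in genes.values():
        by_class.setdefault(promoter_class, []).append(cv_squared(counts))
    return {cls: median(values) for cls, values in by_class.items()}


# Hypothetical per-cell expression values for four invented genes.
genes = {
    "geneA": ("CpG island", [10, 11, 9, 10, 10]),
    "geneB": ("CpG island", [20, 22, 18, 20, 20]),
    "geneC": ("TATA box", [0, 25, 3, 40, 2]),
    "geneD": ("TATA box", [1, 30, 0, 15, 4]),
}
noise = median_noise_by_class(genes)
for promoter_class, value in noise.items():
    print(promoter_class, round(value, 3))
```

The same grouping function could be applied to any other annotation, for example the presence or absence of a given histone modification, by changing the class label stored with each gene.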

In contrast, broad domains of active enhancer chromatin – super-enhancers – actually increase variability by sensitizing expression to fluctuations in the pluripotency network. Even in 2i conditions fluctuations in pluripotency can be detected in the expression of a large number of genes.

Finally, we identified unusually high mRNA stability of X-linked genes as a mechanism that compensates for the high noise associated with expression from the single copy of this chromosome. Our results provide an integrated view of how core promoters, chromatin and post-transcriptional regulation tune the stability of gene expression across individual stem cells.

3.3.2 Open access to all datasets generated by the consortium and an analysis tool to facilitate the analysis and integration of these datasets by wet-lab biologists

To date, 87 publications have resulted from the 4DCellFate project and 50 of these publications are open access. In addition, all of the datasets generated by the consortium and submitted for publication have been deposited in the appropriate open access databases.

Based on the CLC bio Genomics Workbench (GxWB), we implemented extensions for integrative visual analytics of genomic track-data. The software integrates genomic datasets from both external sources and data generated within the consortium. With the new track-tools, computation and visualisation are handled in a seamless graphical framework that does not require any coding from the user.
Through discussions with consortium partners we recognised the identification of functional elements as peaks in genomic data as the central algorithmic challenge. This is addressed by a novel shape-based peak-detection algorithm that works in a robust manner while requiring only a minimum of parameterisation. The resulting combination of an interactive graphical approach with highly automated signal-detection algorithms makes the software ideally suited for data analysis by non-bioinformaticians. All results can be shared and exchanged between consortium members in the form of CLC-file objects.

A flexible back-end has been implemented for the CLC Genomics server that enables integration of systems-biology datasets and other bio-molecular networks. This way network data is made accessible for querying and interactive visualisation through the graphical user-interface (Figure 29).

1.4. SUBPROJECT 4: Polycomb repressive complex & NuRD In Disease

Characterisation of human NuRD and PRC complexes

Genome engineering was used to modify one MBD3 allele in human iPS cells to facilitate detection of the NuRD complex in human cells. Using CRISPR/Cas9, we modified one MBD3 allele such that it now encodes an MBD3 protein with a two-fold epitope tag at its C-terminus (Figure 30). After verification, this cell line was used to define components of the NuRD complex in human pluripotent cells. The epitope tag was used to purify native NuRD complexes and interacting proteins, which were then subjected to quantitative mass spectrometry. This allowed us both to calculate the stoichiometry of NuRD in human cells and to identify potential interacting proteins.

PRC complexes have been well characterised in leukemic cells, so our focus was to identify key interacting proteins which can direct PRC action to specific sites. One such interactor is PHF19, identified by Partner 1a. Chromatin binding sites for PHF19 were determined in human leukemic cells (Figure 31), and this was compared with transcriptomics data to learn that PHF19 occupies promoters of both active and inactive genes in human leukemic cells, consistent with it playing a role in directing PRC2 activity in this cell type.

Expression patterns of interacting proteins
The expression patterns of two different isoforms of the PHF19 protein were determined in several human leukemic cell lines. Depletion of the long isoform triggered spontaneous cell differentiation. Subsequent RNAseq analysis in two different leukemic cell lines revealed widespread changes in gene expression, consistent with PHF19 being a subunit of the Polycomb repressor complex 2. Based on these results, we initiated a screening for the identification of small chemical compounds to inhibit PHF19. This screen is ongoing.

Human reporter lines.
In addition to the MBD3-3xFLAG human iPS cell line described above, Partner 9 also created a tagged CHD4 allele in the same iPS cell parent line. This strategy combined a FLAG tag with fluorescent proteins to enable its use in microscopy experiments (Figure 32). This line was constructed and delivered to Partner 2b, where it was independently verified. This resource will be of use in downstream applications.

Small molecules to interfere with PRC function
We investigated the action of two different inhibitors of the PRC2 complex: GSK343 (selective for EZH2) and UNC1999 (targeting both EZH1 and EZH2). The inhibitors have been tested on human APL cell lines (NB4 and UF1, the latter resistant to RA-induced differentiation), murine APL leukaemia, and Lin- cells from wild-type and PML-RAR knock-in mice (pre-leukemic phase of APL).
At the beginning of the treatment both molecules showed a strong inhibitory effect on cellular growth and colony formation, which correlated with an induction of differentiation that was further potentiated by retinoic acid co-treatment. Importantly, UNC1999 had a stronger effect than GSK343. Both molecules had a milder phenotype on Lin- cells derived from wild-type mice in comparison to the PML-RAR expressing cells, suggesting a possible therapeutic window.
At later time points (after several passages in methylcellulose) we observed the emergence of resistant clones, in particular from cells expressing PML-RAR at all the stages of the disease (pre-leukaemia and fully established leukaemia), suggesting that continuous inhibition of EZH1/EZH2 may select for cells with the ability to sustain in vitro growth and therefore may have the potential of a leukemia initiating cell (LIC). This was verified using animal experiments. These observations may explain the rapid relapse of APL mice upon treatment discontinuation. These preclinical studies suggest that EZH2 inhibition improves survival but that prolonged treatment enriches leukemia initiating cells (Figure 33).

Previous results on the role of CBX7 in AML maintenance suggested that this PRC1 component is fundamental for maintenance of leukemic cells. We therefore sought to further validate CBX7 as a target in AML, and then to identify novel small molecules able to interfere with its function. Using two different AML models, knockdown of Cbx7 reduced both cell growth and clonogenicity, suggesting that CBX7 is indispensable for AML cell survival/proliferation. To further characterize the CBX7 domain required for the observed phenotype, we generated a CBX7 mutant in the chromodomain (CBX7K31A/W32A, named CBX7AA) which is unable to bind H3. MLL-AF9 cells expressing the mutant CBX7AA showed a decreased ability to form colonies compared to cells expressing wild type CBX7. Importantly, colonies recovered 7 days after plating in methylcellulose were negative for the expression of CBX7AA, while wild type CBX7 was still expressed. These results suggest that CBX7AA acts in a dominant negative fashion and that CBX7 activity depends on its binding to histones through the chromodomain. To investigate the requirement of CBX7 in leukaemia maintenance, Cbx7 was knocked down in human primary AML cells, which were then transplanted into NSG mice. Leukaemia progression was significantly delayed in mice transplanted with Cbx7 knockdown leukemic cells (Figure 34). Importantly, in this experimental setting 3 mice out of 8 transplanted with leukemic cells knocked down for Cbx7 failed to develop leukaemia.
The results obtained so far identify CBX7 as a relevant target which, through its interaction with H3K27Me3 (mediated by its chromodomain), is required for leukaemia maintenance. In order to select molecules able to interfere with the association between Cbx7 and H3K27Me3, in collaboration with several groups (including Partner 1a) we exploited the new NanoBret technology to develop cell-based assays to identify small molecules able to interfere with the binding in cells, and performed virtual screening to identify in vitro inhibitors of the interaction of the isolated Cbx7 chromodomain with a peptide derived from H3. Overall, our small molecule activities have led to the identification of at least a dozen compounds (obtained through different approaches) that are at different stages of characterization in the drug discovery chain of activities (hit confirmation to hit-to-lead). Since there are essentially no validated small molecules able to interfere with the chromodomain so far, our studies represent a clear advance in demonstrating the druggability of this domain, which we also show to be critical for the oncogenic activity of CBX7 in AMLs.

Improving differentiation protocols of human pluripotent cells
We identified SALL4 as a component of the NuRD complex in human pluripotent cells. Using our knowledge about SALL4 function in mouse cells, we designed a strategy to create an improved neural differentiation protocol in human pluripotent cells. By knocking down SALL4 transiently, we find we can prime cells to enter the neural lineage (Figure 35). This will provide a more efficient neural differentiation protocol which should be amenable to the application of small molecule inhibitor screens on the way towards creating a more robust and generally applicable differentiation protocol.

Potential Impact:
1.1. Socio-economic impact and exploitation of results

Technical development:

Single molecule fluorescence microscopy
Exploiting two-colour PALM and STORM methods to study PRC/NuRD co-localisation in mouse ES cells. Our original aim was to develop two-colour single-molecule co-localisation imaging for the study of protein complexes. However, during the course of this project we have realised that, because many proteins co-localise to chromatin at sites of transcription, co-localisation does not necessarily help us understand the assembly and function of chromatin complexes. We have therefore found it more informative to carry out live-cell tracking of single proteins in the presence/absence of other components in order to understand the role of protein interactions within a complex. In this project, we have thus developed both fixed and live cell 2D and 3D single-molecule imaging (PALM and STORM) in mouse ES cells. This has allowed us to study either: 1) co-localisation and clustering of proteins within the nucleus; or 2) the role of protein interactions within a complex by tracking single proteins in the presence/absence of other components.

Development of fixed and live cell 2D and 3D single-molecule imaging (PALM and STORM)
For 2D single-molecule imaging, UCAM (Laue) in collaboration with Prof David Klenerman (Cambridge, UK) has built an nTIRF microscope. It incorporates 4 laser lines (405, 488, 561, 640 nm) and a dual view camera for multiple fluorophore imaging in either PALM (using mEos3.2) or STORM mode (using AF647). We have imaged a range of nuclear proteins such as the centromeric histone H3 variant CENP-A, the heterochromatin proteins HP1β and HP1γ, a nuclear periphery marker (lamin B2), and NuRD components (MBD3 and CHD4). To determine co-localisation in ES cells on a 2D microscope, it is critical to ensure that localised molecules are within the detection plane (i.e. localised together) and not above or below this plane. We therefore developed and published an optical sectioning technique (Palayret et al, 2015). To study single ES cells, we also developed and published a novel microfluidic device capable of trapping cells for super-resolution imaging before releasing them for further processing (Zhou et al, 2016). To achieve 3D super-resolution imaging, the Klenerman group built a microscope with double-helix point spread function (DH-PSF) detection and adapted it for mammalian imaging (Carr et al, submitted). We have imaged single mEos3.2 and HaloTag®-JF549 dye-labelled CENP-A and CHD4 proteins with high x, y and z resolution over a 4 µm range in z. We have used these microscopes to study the following:

Clustering of NuRD proteins within the nucleus. Taking advantage of the expertise of UCAM (Hendrich), knock-in cell lines were generated in which MBD3 and CHD4 were tagged at the C-terminus with mEos3 and the HaloTag®. Super-resolution imaging of the NuRD components MBD3 and CHD4 was demonstrated and published. CHD4 and MBD3 molecules form distinct foci in live ES cells suggesting that the NuRD complex forms locally enriched clusters. Analysis of these protein clusters has revealed that CHD4 binds to a considerably higher number of chromatin sites in the nucleus than MBD3, consistent with reports that it has significant non-NuRD functions (Stevens et al, in press).

Single-particle tracking in the presence/absence of other components to understand the role of protein interactions within a complex. If NuRD components co-localise/interact, the dynamics of specific proteins within the complex will be affected by the absence of other components. We investigated whether CHD4 dynamics is affected by the presence/absence of MBD3 using the knock-in cell lines described above and published this work (Zhang et al, 2016). We measured CHD4 dynamics using PALM of mEos3.2-tagged CHD4 at 10 ms time resolution in both wild-type and MBD3-null cells. CHD4 exhibits two major diffusion coefficients (0.100 ± 0.008 and 0.65 ± 0.04 µm² s⁻¹). In MBD3-null cells, CHD4 still exhibits the slowly diffusing fraction (0.100 ± 0.008 µm² s⁻¹), but the fast fraction now moves significantly more quickly (0.80 ± 0.08 µm² s⁻¹) (p < 0.004). Since our published paper, we have carried out a similar analysis using our 3D microscope. Longer trajectories were achieved by combining the reduced photobleaching of HaloTag® dye-labelled CHD4 (compared to mEos3.2) and 3D tracking to prevent molecules diffusing out of the detection plane (Carr et al, submitted). This study not only confirmed our previous 2D tracking results but allowed mean squared displacement analysis of the trajectories. The slowly diffusing fraction exhibits constrained diffusion over time and so appears to be stably bound to chromatin, whereas the fast fraction exhibits Brownian diffusion. Overall, the results suggest that a fraction of CHD4 associates with chromatin in either the presence or absence of MBD3, whereas the freely diffusing fraction of CHD4 molecules is significantly affected by the loss of MBD3 consistent with a lack of interaction with the NuRD complex. This demonstrates the feasibility of studying interactions within the NuRD complex using a combination of PALM and knockout experiments.
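The distinction between constrained and Brownian diffusion in a mean squared displacement (MSD) analysis can be sketched as follows. For Brownian motion the MSD grows linearly with lag time (in 2D, MSD = 4Dt), so the slope of log(MSD) against log(lag time) is close to 1, whereas for a confined or bound molecule the MSD flattens and the slope approaches 0. The code below is a simplified 2D illustration on simulated tracks, not the analysis used in the project: the simulation functions, the jitter value and the number of frames are assumptions, while the diffusion coefficient and frame time are taken from the values quoted above.

```python
import math
import random


def msd(track, lag):
    """Time-averaged mean squared displacement at an integer lag (in frames)."""
    squared = [(x2 - x1) ** 2 + (y2 - y1) ** 2
               for (x1, y1), (x2, y2) in zip(track, track[lag:])]
    return sum(squared) / len(squared)


def msd_exponent(track, frame_time, max_lag=10):
    """Least-squares slope of log(MSD) versus log(lag time)."""
    xs = [math.log(lag * frame_time) for lag in range(1, max_lag + 1)]
    ys = [math.log(msd(track, lag)) for lag in range(1, max_lag + 1)]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator


def diffusion_coefficient(track, frame_time):
    """2D estimate from the first lag, using MSD = 4 D t."""
    return msd(track, 1) / (4 * frame_time)


def simulate_free(d_coeff, frame_time, n_steps, rng):
    """Simulated 2D Brownian track (positions in µm)."""
    sigma = math.sqrt(2 * d_coeff * frame_time)  # per-axis step size
    x = y = 0.0
    track = [(x, y)]
    for _ in range(n_steps):
        x += rng.gauss(0, sigma)
        y += rng.gauss(0, sigma)
        track.append((x, y))
    return track


def simulate_bound(jitter, n_steps, rng):
    """Simulated immobile molecule that only shows positional jitter (µm)."""
    return [(rng.gauss(0, jitter), rng.gauss(0, jitter)) for _ in range(n_steps)]


rng = random.Random(0)   # fixed seed so the example is repeatable
frame_time = 0.010       # s, the 10 ms time resolution quoted in the text
free_track = simulate_free(0.65, frame_time, 5000, rng)  # D in µm² s⁻¹
bound_track = simulate_bound(0.03, 5000, rng)            # assumed 30 nm jitter

print("free:  alpha =", round(msd_exponent(free_track, frame_time), 2),
      " D =", round(diffusion_coefficient(free_track, frame_time), 2))
print("bound: alpha =", round(msd_exponent(bound_track, frame_time), 2))
```

Real trajectories are much shorter than the simulated ones used here, which is one reason why longer tracks, as obtained with the dye-labelled protein and 3D tracking, make this kind of analysis more reliable.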

In collaboration with Prof David Klenerman (Cambridge, UK) and Anthony Carr (Sussex, UK), we have also used and published a live-cell motion blurring approach that can detect proteins bound to chromatin in yeast (Etheridge et al, 2014). This approach uses long exposures to blur proteins that are diffusing so as to only detect chromatin bound proteins. We are now applying this to the detection of NuRD proteins in mouse ES cells to ask whether NuRD is required for stable binding of CHD4 (residence time changes) or simply for CHD4 association with chromatin (the number of non-specific binding events before a stable interaction will change) in a similar manner to which Sox2 has been shown to facilitate Oct4-DNA interactions (Chen et al, 2014).

Development of (new) Screening Platforms
During this project, we developed two different screening platforms to select small molecules able to interfere with CBX7-H3K27Me3 interaction.
1) Fluorescence polarization assay. This is an in vitro assay described in Simhadri et al., (J Med Chem. 2014 Apr 10;57(7):2874-83) based on fluorescence polarization. The assay is carried out using a synthetic complex composed of either a recombinant truncated form of CBX7, containing the chromodomain (amino acids 8-62), or the full-length Cbx7, together with a modified peptide derived from H3 (5(6)-carboxyfluorescein-QLATKAAR-Lys(Me3)-SAATG).
2) Nano BRET Assay. This is a commercially available assay developed by Promega. The Nano BRET Assay is a protein-protein interaction assay that uses Nano-Luc Luciferase as the BRET energy donor and HaloTag Protein labeled with the HaloTag Nano BRET 618 Fluorophore as the energy acceptor to measure the interaction of two specific proteins. Importantly, Nano BRET Assays allow the measurement of protein interactions in live cells, leading to the selection of molecules which are membrane permeable and able to interact with the full-length Cbx7 protein embedded in the whole physiological PRC1 complex.
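For the fluorescence polarization assay in item 1, the readout is derived from the emission intensities measured parallel and perpendicular to the polarized excitation light. The sketch below shows the standard definitions of polarization (in millipolarization units, mP) and anisotropy, together with one simple way of expressing how far a compound displaces the labelled H3 peptide from the chromodomain. The displacement formula, the function names and all numerical values are assumptions for illustration; they are not taken from the cited protocol.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in mP; g_factor is the instrument correction."""
    corrected = g_factor * i_perpendicular
    return 1000.0 * (i_parallel - corrected) / (i_parallel + corrected)


def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy r computed from the same two intensities."""
    corrected = g_factor * i_perpendicular
    return (i_parallel - corrected) / (i_parallel + 2.0 * corrected)


def percent_displacement(mp_sample, mp_free, mp_bound):
    """Percentage of the bound-to-free polarization window lost in a sample.

    mp_free:  labelled peptide alone (low polarization)
    mp_bound: labelled peptide with protein, no compound (high polarization)
    """
    return 100.0 * (mp_bound - mp_sample) / (mp_bound - mp_free)


# Hypothetical plate-reader values.
print(polarization_mp(300.0, 100.0))
print(anisotropy(300.0, 100.0))
print(percent_displacement(mp_sample=150.0, mp_free=50.0, mp_bound=250.0))
```

A compound that fully displaces the peptide would bring the signal down to the free-peptide value, while an inactive compound would leave it at the bound value.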

1.2. Wider societal implications of the project

Overall, the results from our small-molecule screening for compounds inhibiting the CBX7-H3K27Me3 interaction represent a clear advance in demonstrating the druggability of the chromodomain, which we show to be critical for the oncogenic activity of CBX7 in AMLs. Therefore, our studies mark the beginning of drug discovery activities against a novel class of targets in cancer.

Our studies using combinations of epigenetic drugs (LSD1, EZH2, HDAC inhibitors, retinoids) have demonstrated at the preclinical level that targeting both bulk leukemic cells and the leukaemia initiating cell compartment is required for disease eradication, suggesting new clinical approaches for the treatment of AMLs.

1.3. Main dissemination activities

It was a central goal of 4DCellFate to promote dissemination activities within and outside the consortium, to scientists, to the industrial sector and to the general public, in order to raise awareness of the project, of its results and its future impact on society. In order to obtain support in dissemination actions from experts in science communication, the project created the 4DCellFate Communication Network, a network of all the Communication Offices at the partner institutions which actively participated in specific actions.

Dissemination within 4DCellFate
The Project Coordinator and the project management team ensured that all partners shared their data and results through e-mails, the website, annual meetings, phone and Skype conferences as well as specific internal workshops. An intranet in the form of Dropbox Professional was set up and maintained by the project manager to improve communication among consortium members and included a structured and searchable document repository (collecting official documents, presentations, relevant literature, etc.).

Dissemination to the scientific community
All partners were committed to communicating their (sub)project results to the scientific community and to sharing resources, protocols and reagents with other laboratories outside the project. Results were published in peer-reviewed journals and communicated at international and/or national meetings and conferences. To date, each of the four subprojects has generated a large amount of novel data, which has led to numerous publications in top impact journals (such as Nature, Nature Structural and Molecular Biology, Oncogene, etc.): 85 articles were published with support of the 4DCellFate project, of which seven were collaborative within the project (Figure 36). These publications had an approximate average impact factor of 12.5 and were cited 1503 times in total (1389 times excluding self-citations). Eleven of the publications appeared in open access journals, and in total 50 publications are open access (by either the green or the gold route).

4DCellFate worked actively to join forces with other European initiatives. The organization of joint meetings or workshops was explored to exchange results and technologies and to foster communication with those initiatives. One example is the co-organization of a workshop with the 4DGenome initiative in Barcelona. Through the 4DCellFate Communication Network, joint press releases were distributed to local media in the partner countries, to the international press and to the EC press office (together with proper acknowledgement of the funding bodies). A public website was set up in the first few months of the 4DCellFate project (www.4dcellfate.eu) offering different types of information for different potential readers and users (scientists, funding bodies, companies, general public, etc). It was designed following the EC communication guidelines and with the support of the 4DCellFate Communication Network. The website includes scientific information on project goals, organization, partners, main achievements, and common resources.

Dissemination to the general public and outreach activities
As a publicly funded project, 4DCellFate recognized both a responsibility and an interest in freely communicating its innovative research and cutting-edge science to society, and in bringing the questions, concerns and responses of the public back to the laboratory. This was achieved through different means and with the support of the 4DCellFate Communication Network. Most of the partner institutions in the project organized multiple outreach activities, such as open days, science & art activities, educational programmes for school teachers (see as examples ELLS at EMBL, and “BIO” at CRG), workshops for primary and secondary school pupils, scientific cafes, and “easy” scientific lectures. The 4DCellFate Communication Network encouraged the partners (from principal investigators to PhD students) to get involved in such initiatives and to share their experience within the 4DCellFate network. P1a hosted a series of experimental workshops for high school students at the CRG, reaching a total of 41 high school classes throughout Catalonia and more than 700 students in intensive small-group workshops with a high scientist-to-student ratio. These experimental, one-day workshops for late-stage high school students were held before the students had to decide their "baccalaureate" topics. The workshops were intended to arouse the students' interest in scientific research and biomedicine, with overviews of cancer and stem cells, at this critical time when they are deciding their future directions.

Furthermore, on 16 September 2015 P1 organised a Scientific Café dedicated to the topic “Epigenetics – beyond GATTACA”. At this public event, which drew more than 80 participants, two researchers (one of them from the P1 lab) presented their work and discussed with the general public the role of epigenetics in gene activity and its potential applications in treating diseases such as cancer.

Additionally, P1 has developed an experimental electrophoresis kit for introducing high school students to the basics of gene transcription. The kit is based on (and co-sponsored by) 4DCellFate and is designed to be used on-site in schools. These kits are highly economical (about 30 euros each) and can be distributed widely: in 2014, 50 kits were sent out to reach about 1500 high school students within Catalonia. P1 is considering starting a company to distribute them. Numerous other outreach activities also took place throughout this reporting period, as listed in Table 3; these used engaging formats to get the public enthusiastic about science and to explain what scientists do. For instance, Science Speed Dating uses the popular social format to explain scientific careers to secondary school students.

The consortium also worked with the artist Ana Cid (Barcelona) on a project entitled “The Art of Science”, consisting of artistic representations (etchings) of images taken from the 4DCellFate project (for example, see Figure 37). These images were displayed on the 4DCellFate website with descriptions of the actual experiments, and an exhibit with A. Cid is being planned for the near future.

As one final joint outreach activity, the 4DCellFate consortium produced a short movie (10 min) aimed at life science students and the interested general public. In it, 4DCellFate researchers themselves explain, through interviews and animations, the project, its activities and main results, their relevance and future challenges. The movie has been uploaded to the YouTube channel of P1 and announced on the partner institutes' websites and social media. To guarantee a high-quality movie, a professional production company with experience in the scientific sector was subcontracted.

List of Websites:
Name of coordinator: CRG P1a, Luciano Di Croce
Tel: +34 93 316 11 32
E-mail: Luciano.DiCroce@crg.eu

Project website address: http://www.4dcellfate.eu