Interrogating native CRISPR arrays to achieve scalable combinatorial screens and dissect genetic redundancy

Project Information

CRISPRcombo

Grant agreement ID: 865973

DOI

10.3030/865973

EC signature date 19 December 2019

Start date 1 June 2020

End date 31 March 2026

Funded under

EXCELLENT SCIENCE - European Research Council (ERC)

Total cost

€ 2 000 000,00

EU contribution

€ 2 000 000,00

Coordinated by

HELMHOLTZ-ZENTRUM FUR INFEKTIONSFORSCHUNG GMBH
Germany

Periodic Reporting for period 3 - CRISPRcombo (Interrogating native CRISPR arrays to achieve scalable combinatorial screens and dissect genetic redundancy)

Reporting period: 2023-06-01 to 2024-11-30

Biology is full of redundancy. Whether in development, cellular signaling, metabolism or host-pathogen interactions, many genes appear to perform similar functions, and removing one gene seems to have little effect on the associated cellular process. However, emerging evidence suggests that these genes work together, so that disrupting combinations of them can begin to reveal their contributions to cellular function. This Consolidator project aims to establish new means of unraveling biological redundancy using a less-utilized part of CRISPR technologies: CRISPR arrays. Best known for genome editing, CRISPR technologies are derived from bacterial immune systems with CRISPR arrays at their core. CRISPR arrays encode the guide RNAs that direct the CRISPR machinery to recognize different viral RNA and DNA sequences, poising the immune system to clear different potential invaders. The order of the arrays is determined by the order of infections, with guide RNAs against recent invaders on one end of the array and guide RNAs against invaders from long ago on the other end. These same arrays offer a compact means to target multiple redundant genes at once, allowing researchers to apply high-throughput combinatorial screens to identify gene sets that contribute to a cellular process or function. The challenge, though, is determining how best to implement CRISPR arrays, which begins with better understanding their biology and evolution.
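To make the idea of a compact, multi-target array concrete, here is a minimal Python sketch. It is not the project's design pipeline. It assumes a simple repeat-spacer-repeat layout, and the repeat sequence, the spacer sequences, the gene names and all function names are made up for illustration. It models the ordering described above by placing the newest spacer at one end of the list, and it enumerates every gene pair from a hypothetical candidate set to show how a pairwise combinatorial library could be written out.

```python
from itertools import combinations

# Hypothetical placeholder repeat; real repeats depend on the CRISPR-Cas system.
REPEAT = "GTTTTAGAGCTA"

# Hypothetical spacers (guide-encoding sequences), one per candidate gene.
SPACERS = {
    "geneA": "ACGTACGTACGTACGTACGT",
    "geneB": "TTGACCGGTAAGCTTCAGGA",
    "geneC": "GGCATCGATTACGCTAGCAA",
    "geneD": "CATGCCTAGGATCCGTTAAC",
}


def build_array(spacers):
    """Join spacers into a repeat-spacer-...-repeat sequence."""
    parts = [REPEAT]
    for spacer in spacers:
        parts.append(spacer)
        parts.append(REPEAT)
    return "".join(parts)


def acquire_spacer(existing, new_spacer):
    """Model acquisition: the newest spacer goes to one end (index 0 here)."""
    return [new_spacer] + list(existing)


def pairwise_library(spacer_by_gene):
    """Return one array per unordered pair of genes."""
    library = {}
    for genes in combinations(sorted(spacer_by_gene), 2):
        library[genes] = build_array([spacer_by_gene[g] for g in genes])
    return library


if __name__ == "__main__":
    for genes, array in pairwise_library(SPACERS).items():
        print("+".join(genes), len(array), array)
```

The library grows as the number of k-gene subsets of the candidate set, which is why a compact encoding of each combination matters for scaling such screens.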

The goal of this ERC Consolidator project is to elucidate and apply the properties of CRISPR arrays toward untangling redundancy in biology. The proposed work was broken into three distinct objectives: elucidating design rules for CRISPR arrays, understanding why CRISPR arrays do not undergo rearrangements despite their highly repetitive sequence, and applying these insights to interrogate one example of redundancy within gene regulation by bacterial small RNAs. In turn, this project is expected to deepen our understanding of an important yet understudied aspect of CRISPR biology and lay a foundation to interrogate redundancy in biology. Given the breadth of examples of biological redundancy, those efforts could have wide-ranging impacts, from cancer treatment to antibiotic development and new means to treat genetic disease.

The work to date has focused on the natural features of CRISPR arrays and their accompanying CRISPR biology, as well as on laying a foundation to apply CRISPR arrays to interrogating biological redundancy.

Within the natural features of CRISPR arrays, we discovered that the region upstream of CRISPR arrays contributes to the production of the encoded guide RNAs. This region, called the leader, was associated with other aspects of CRISPR biology but never guide RNA production. We showed for some CRISPR-Cas systems that this region interacted with the front end of the CRISPR array, promoting subsequent processing steps. As a result, the guide RNA targeting the invader most recently encountered by the cells is prioritized for defense, ensuring that the systems are primed against invaders that might reappear or could still be lurking in the environment.

Exploring other aspects of CRISPR biology beyond the CRISPR arrays has also proven fruitful. For instance, we discovered a set of novel CRISPR nucleases dubbed Cas12a2 that look for RNA targets and, upon finding their target, begin degrading virtually any nucleic acid they encounter. This activity extends to double-stranded DNA, the information storage material of cells and many invaders alike. We also discovered two clades of nucleases most closely related to Cas12a2, with one (Cas12a3) exhibiting RNA-triggered cleavage of tRNA tails. These nucleases represent the first examples in the CRISPR family in which the target-dependent enzymatic activity of the nuclease is directed away from the target to enact the immune response.

In a separate example from CRISPR biology, we exploited the discovery that the tracrRNA, a processing factor necessary to go from Cas9 CRISPR arrays to guide RNAs, could convert cellular RNAs into guide RNAs for use by Cas9. After engineering this process, we were able to achieve a technological first: recording selected cellular transcripts in single cells. This technology allows us to peer into a cell’s past while tying it to its present state. We also applied the concept to tracrRNA-dependent Cas12 nucleases that, upon target DNA recognition, collaterally cleave single-stranded DNA. This approach allowed us to harness these DNA-targeting nucleases for direct RNA detection, relying on collateral cleavage for signal amplification.

Laying the foundation for CRISPR array design, we developed a tool to predict targeting activity based on the guide RNA sequence. While many such tools exist, few have focused on using CRISPR to silence genes in bacteria. We applied machine learning with published datasets to devise an algorithm for predicting “good” guide sequences and “bad” guide sequences. We also explored how to make CRISPR arrays used by Cas9 more compact, finding that arrays can be shortened. In some cases, shortening the array even improved performance. We also used the arrays in other contexts, such as the first sRNA screens in bacteria using the gut microbe Bacteroides thetaiotaomicron as a model.
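As a rough illustration of how a guide sequence can be turned into input for a machine-learning model, the sketch below computes two common sequence features: a one-hot encoding of each base and the GC fraction. This is an assumption-laden example, not the project's tool. The source does not state which features or model were used, and the function names here are hypothetical.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode each base as four 0/1 values in A, C, G, T order."""
    encoded = []
    for base in sequence.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        encoded.extend(1 if base == b else 0 for b in BASES)
    return encoded


def gc_content(sequence):
    """Fraction of G and C bases; 0.0 for an empty sequence."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def featurize(sequence):
    """Feature vector: one-hot bases followed by GC content (illustrative only)."""
    return one_hot(sequence) + [gc_content(sequence)]


if __name__ == "__main__":
    guide = "ACGTACGTACGTACGTACGT"  # hypothetical 20-nt guide
    features = featurize(guide)
    print(len(features), features[-1])
```

A vector like this could be passed to any standard regression or classification model trained on measured guide activities; which model suits bacterial gene silencing is an empirical question that this sketch does not address.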

Finally, we advanced a simple system for characterizing CRISPR biology and technologies: cell-free transcription-translation (TXTL). Using TXTL, we established new approaches for characterizing CRISPR-Cas systems involving multiple components. We also found that re-optimizing TXTL preparation allowed us to begin using linear DNA. This step makes it easier to go from designed DNA sequence to experimental testing, accelerating our ability to perform experiments.

The work to date has been replete with efforts to push the state of the art, and we have plans for the next half of the Consolidator project.

The work on the leader upstream of CRISPR arrays opened new opportunities to explore how this region impacts other CRISPR-Cas systems. We expect this phenomenon to extend to more swaths of CRISPR biology while revealing new variations on the theme.

The work on Cas12a2 nucleases will next explore the natural diversity of these nucleases. Their diversity appears to be far greater than we initially reported, and we expect to reveal new biochemical properties that could expand the application space of these nucleases. These nucleases also co-occur with Cas12a nucleases next to individual CRISPR arrays, where we expect to uncover how these nucleases utilize individual arrays to combat targeted invaders. We will also begin exploring in vitro and cellular applications of these nucleases, such as their use for molecular diagnostics or programmable cell killing, where the latter became the basis of a funded ERC proof-of-concept grant.

We also will delve into the stability of CRISPR arrays. Our expectation is that we will identify cellular factors responsible for their stability, where such factors can be expressed in other organisms to boost the overall performance of CRISPR arrays.

Finally, we will continue applying TXTL to interrogate CRISPR arrays and other aspects of CRISPR biology. Its use will aid many of the experimental efforts described above. Our expectation is that TXTL will become more commonly used by the research community to accelerate the pace of scientific discovery.
