CORDIS - EU research results

Interrogating native CRISPR arrays to achieve scalable combinatorial screens and dissect genetic redundancy

Periodic Reporting for period 2 - CRISPRcombo (Interrogating native CRISPR arrays to achieve scalable combinatorial screens and dissect genetic redundancy)

Reporting period: 2021-12-01 to 2023-05-31

Biology is full of redundancy. Whether in development, cellular signaling, metabolism or host-pathogen interactions, many genes appear to perform similar functions, and removing one gene seems to have little effect on the associated cellular process. However, emerging evidence suggests that these genes work together, so that disrupting combinations of them can begin to reveal their contributions to cellular function. This Consolidator project aims to establish new means of unraveling biological redundancy using a less-utilized part of CRISPR technologies: CRISPR arrays. Best known for genome editing, CRISPR technologies are derived from bacterial immune systems with CRISPR arrays at their core. CRISPR arrays encode the guide RNAs that direct the CRISPR machinery to recognize different viral RNA and DNA sequences, poising the immune system to clear different potential invaders. The order of the guide RNAs within an array is determined by the order of infections, with guide RNAs against recent invaders on one end of the array and guide RNAs against invaders from long ago on the other end. These same arrays offer a compact means to target multiple redundant genes at once, allowing researchers to apply high-throughput combinatorial screens to identify gene sets that contribute to a cellular process or function. The challenge, though, is determining how best to implement CRISPR arrays, which begins with better understanding their biology and evolution.
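The array layout described above can be pictured with a short Python sketch. It is an illustration only, not a model used in the project. It assumes the commonly described arrangement in which a leader is followed by alternating repeats and spacers, with each new spacer added at the leader-proximal end. The `LEADER` and `REPEAT` strings, the four-letter spacers, and the function names are all invented placeholders.

```python
from typing import List

# Placeholder sequences invented for illustration; not real CRISPR sequences.
LEADER = "ATATATAT"
REPEAT = "GTTTTAGA"


def acquire_spacer(spacers: List[str], new_spacer: str) -> List[str]:
    """Return a new spacer list with the newest spacer at the leader-proximal end."""
    return [new_spacer] + spacers


def render_array(spacers: List[str], leader: str = LEADER, repeat: str = REPEAT) -> str:
    """Lay out the leader, then repeat-spacer units, closed by a final repeat."""
    units = "".join(repeat + spacer for spacer in spacers)
    return leader + units + repeat


# Three hypothetical infections, oldest first.
spacers: List[str] = []
for invader_fragment in ["AAAA", "CCCC", "GGGG"]:
    spacers = acquire_spacer(spacers, invader_fragment)

array = render_array(spacers)
print(spacers)  # newest spacer first, oldest last
print(array)
```

In this toy layout, the same `render_array` call could take spacers chosen against several redundant genes, which is the sense in which an array is a compact way to target multiple genes at once.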

The goal of this ERC Consolidator project is to elucidate and apply the properties of CRISPR arrays toward untangling redundancy in biology. The proposed work was broken into three distinct objectives: elucidating design rules for CRISPR arrays, understanding why CRISPR arrays do not undergo rearrangements despite their highly repetitive nature, and applying these insights to interrogate one example of redundancy within gene regulation by bacterial small RNAs. This project is expected to deepen our understanding of an important yet understudied aspect of CRISPR biology and lay a foundation to interrogate redundancy in biology. Given the breadth of examples of biological redundancy, those efforts could in turn have wide-ranging impacts, from cancer treatment to antibiotic development and new means of treating genetic disease.

The work to date has focused on delving into the natural features of CRISPR arrays and their accompanying CRISPR biology as well as laying a foundation to apply CRISPR arrays to interrogating biological redundancy.

Within the natural features of CRISPR arrays, we discovered that the region upstream of CRISPR arrays contributes to the production of the encoded guide RNAs. This region, called the leader, had been associated with other aspects of CRISPR biology but never with guide RNA production. We showed for some CRISPR-Cas systems that this region interacts with the front end of the CRISPR array, promoting subsequent processing steps. As a result, the guide RNA targeting the invader most recently encountered by the cells is prioritized for defense, ensuring that the systems are primed against invaders that might reappear or could still be lurking in the environment.

Exploring other aspects of CRISPR biology beyond the CRISPR arrays has also proven fruitful. For instance, we discovered a set of novel CRISPR nucleases unlike any other known nucleases. These nucleases, which we have dubbed Cas12a2, look for RNA targets and, upon finding their target, begin degrading virtually any nucleic acid they encounter. This activity extends to double-stranded DNA, the information storage material of cells and many invaders alike. This process shuts down the infected cell, preventing the invader from spreading to other cells in the population.

In a separate example from CRISPR biology, we exploited the discovery that the tracrRNA, a processing factor necessary to go from Cas9 CRISPR arrays to guide RNAs, could convert cellular RNAs into guide RNAs for use by Cas9. After engineering this process, we were able to achieve a technological first: recording selected cellular transcripts in single cells. This technology allows us to peer into a cell’s past while tying it to its present state.

Laying the foundation for CRISPR array design, we have been developing a tool to predict targeting activity based on the guide RNA sequence. While many such tools exist, few have focused on using CRISPR to silence genes in bacteria. We applied machine learning with published datasets to devise an algorithm for predicting “good” guide sequences and “bad” guide sequences. We also explored how to make CRISPR arrays used by Cas9 more compact, finding that arrays can be shortened. In some cases, shortening the array even improved performance.
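The summary does not describe the algorithm itself, so the following is only a generic sketch of what classifying guides as "good" or "bad" from sequence can look like. It one-hot encodes each position of a guide and fits a plain logistic regression by stochastic gradient descent. The four-letter guides and their labels in `TOY_DATA` are invented for illustration and are not taken from any published dataset. The choice of logistic regression is likewise an assumption, not the project's method.

```python
import math
from typing import List, Tuple

BASES = "ACGT"


def one_hot(guide: str) -> List[float]:
    """Encode each position as four 0/1 features (A, C, G, T)."""
    return [1.0 if base == b else 0.0 for base in guide for b in BASES]


def predict(weights: List[float], bias: float, guide: str) -> float:
    """Return a score between 0 and 1; higher means predicted 'good'."""
    z = bias + sum(w * x for w, x in zip(weights, one_hot(guide)))
    return 1.0 / (1.0 + math.exp(-z))


def train(data: List[Tuple[str, int]], epochs: int = 200,
          lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression with per-example gradient steps."""
    weights = [0.0] * (4 * len(data[0][0]))
    bias = 0.0
    for _ in range(epochs):
        for guide, label in data:
            error = predict(weights, bias, guide) - label
            for i, x in enumerate(one_hot(guide)):
                weights[i] -= lr * error * x
            bias -= lr * error
    return weights, bias


# Invented toy guides: label 1 = "good", label 0 = "bad".
TOY_DATA = [("GGCA", 1), ("GGTA", 1), ("GCCA", 1),
            ("ATTT", 0), ("TATT", 0), ("AATA", 0)]

weights, bias = train(TOY_DATA)
for guide, label in TOY_DATA:
    print(guide, label, round(predict(weights, bias, guide), 3))
```

A real predictor would use full-length guides, measured silencing data, and held-out evaluation. The sketch only shows the shape of the sequence-to-score mapping.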

Finally, we have been advancing a simple system for characterizing CRISPR biology and technologies: cell-free transcription-translation (TXTL). TXTL can be created with specially prepared innards of bacterial cells, allowing us to go from DNA to RNA to protein without working with live cells or going through time-consuming protein and RNA purifications. Using TXTL, we were able to establish new approaches for characterizing CRISPR-Cas systems involving multiple components. We also found that re-optimizing TXTL preparation allowed us to begin using linear DNA. This step makes it easier to go from designed DNA sequence to experimental testing, accelerating our ability to perform experiments.

The work to date has been replete with efforts to push the state of the art, and we have plans for the next half of the Consolidator project.

The work on the leader upstream of CRISPR arrays opened new opportunities to explore how this region impacts other CRISPR-Cas systems. We expect this phenomenon to extend to more swaths of CRISPR biology while revealing new variations on the theme. We also expect to incorporate the leader into CRISPR array design, an element that has not previously been considered.

The work on Cas12a2 nucleases will next explore the natural diversity of these nucleases. Their diversity appears to be far greater than we initially reported, and we expect to reveal new biochemical properties that could expand the application space of these nucleases. These nucleases also co-occur with Cas12a nucleases next to individual CRISPR arrays, where we expect to uncover how these nucleases utilize individual arrays to combat targeted invaders. We will also begin exploring in vitro and cellular applications of these nucleases, such as their use for molecular diagnostics.

Building on our ability to predict “good” guides and create compact CRISPR arrays, we will continue working toward the predictive design of CRISPR arrays. Our expectation is that we will create design tools that account for not only the guide sequence but also where it appears in a CRISPR array. We will also apply these arrays to disentangle redundancy in small RNA networks in bacteria, with the expectation of identifying core sets of small RNAs that contribute to different cellular processes. This example will become a starting point for others to interrogate biological redundancy and extract fundamental principles.

We will also delve into the stability of CRISPR arrays. Our expectation is that we will identify cellular factors responsible for their stability, and that such factors can be expressed in other organisms to boost the overall performance of CRISPR arrays.

Finally, we will continue applying TXTL to interrogate CRISPR arrays and other aspects of CRISPR biology. Its use will aid many of the experimental efforts described above. Our expectation is that TXTL will become more commonly used by the research community to accelerate the pace of scientific discovery.