One of the Holy Grails of molecular biology, whether for clinical utility or as the premise for science fiction plots, has been genetic engineering: the specific alteration of selected genome nucleotide(s) in a target organism, easily and with high efficiency. There is a high level of interest in this topic among the general public (and a lot of misunderstanding, too, but that’s another article).
“But wait!” you say. “Haven’t we been cloning genes and so on for years? Isn’t this the sort of thing that undergraduates do in first-year labs now?” Well, yes it is, but that sort of genetic engineering is restricted to single-cell organisms (bacteria and yeast). While there are many simple tools for cloning something into one of these systems at a unique location with high efficiency, the choices for where to make these modifications are limited. That is, even in those systems, until recently one couldn’t just choose any specific genetic region at will to modify.
Efforts at gene therapy in humans have similarly been restricted, primarily to conditions where the underlying problem is something—usually a non-functional gene copy—whose function can be replaced in trans from some other location. Imagine, though, if there were a tool that allowed for correcting or modifying a non-functional gene in situ, meaning that all of its innate control regions remain active. That would also avoid any risk of possible insertional inactivation such as can occur with viral vectors (that is, where they insert at some unselected genetic location and by doing so disrupt a critical gene).
In this month’s Primer, we’re going to give an overview of a system, now a few years old and well developed, which allows for just this at-will selection of any unique genetic locus and its modification. This system is Clustered Regularly Interspaced Short Palindromic Repeats, or CRISPR.
The restriction enzyme
In the simple forms of genetic engineering referred to above, a major tool is the restriction enzyme. This is a bacterially derived enzyme which recognizes a fixed, usually palindromic, short (~4-8 bp) DNA sequence and makes a pair of cuts in the sugar-phosphate backbone, one in each strand. These cuts can then be used as sites to introduce (“splice in”) exogenous DNA with compatible ends, with DNA ligase used to re-form the covalent backbone bonds. Of course, based on purely statistical considerations, such short recognition sequences must occur many times in a large genome, and we are stuck with the inherent sequence specificity of the enzyme. Restriction enzymes are believed to act as a form of bacterial immune system, paired with enzymes that protect or mask the bacterium’s own DNA but leave invading viral (bacteriophage) DNA susceptible to multiple cleavage events, rendering the virus inactive.
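The statistical point is easy to put numbers on. The sketch below assumes random DNA in which all four bases are equally likely (real genomes are not, so treat the results as rough) and uses 3 billion bp as an illustrative round figure for a large genome; the function names are ours, chosen for this example.

```python
def expected_spacing(site_length: int) -> int:
    """Average distance in bp between copies of one fixed site,
    assuming random DNA with equal base frequencies."""
    return 4 ** site_length


def expected_sites(site_length: int, genome_size: int) -> float:
    """Expected number of copies of the site along one strand."""
    return genome_size / expected_spacing(site_length)


GENOME_SIZE = 3_000_000_000  # illustrative round figure, in bp

for n in (4, 6, 8):
    print(n, expected_spacing(n), round(expected_sites(n, GENOME_SIZE)))
```

Under these assumptions a 6 bp site turns up about once every 4,096 bp, which works out to several hundred thousand copies in a genome of this size. That is far too many for a restriction enzyme to single out one locus.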
The CRISPR system is also believed to act as a bacterial immune system, but with a higher degree of complexity and, indeed, an ability to adapt to new viral challenges. In this regard it’s conceptually analogous to the antibody system we have, but with the added benefit that a bacterium exposed to a new virus and developing a CRISPR-based defense passes this resistance on to its progeny.
In effect, the CRISPR system also acts like a restriction enzyme, except that the DNA target sequence to be cleaved is not determined by a fixed amino acid sequence of the enzyme. Instead, the protein component binds an interchangeable RNA known as a crRNA, which carries a ~20 base region homologous to the DNA target sequence to be cut, and a second activating RNA known as a tracrRNA. It’s this homology to target being coded for by the crRNA that allows the host bacterium to use this as an adaptive immune system; by capturing short segments of viral sequence in the genetic element which codes for crRNAs (the CRISPR array itself), new target specificities (crRNAs) are developed without having to change the amino acid sequence of the protein. Equipped with the crRNA and tracrRNA, the complex is able to selectively bind to the sequences identified by the crRNA. The protein involved is a CRISPR-associated (Cas) nuclease, most commonly an enzyme called Cas-9; strictly speaking, “CRISPR” names the array of repeats and captured viral segments rather than a protein, and it is Cas-9 itself that holds the two RNAs and makes the cuts in the DNA strands, much like a classical restriction enzyme.
If you’re a bacterium defending yourself from an infecting virus, that’s the end of the story; with the virus chopped up and inactivated, you can go back to your busy metabolic life. For our purposes in using this as a selective genetic engineering tool, we’ll want to make a few changes, and add some subsequent steps.
Single guide RNA
The first change we’ll make, for the sake of ease, is combining the crRNA and tracrRNA into a single functional molecule (called an sgRNA, or “single guide RNA”). This has been worked out, and with the bulk of the sgRNA predefined, choosing a target sequence specificity is almost as simple as adding in the ~20 base region of homology to an sgRNA template. Cloned into an expression vector, the sgRNA is now expressed. As it happens, this single molecule can both bind to an intended target via homology and directly engage Cas-9, so the crRNA and tracrRNA no longer need to be supplied as separate molecules. Thus, if we simultaneously co-express Cas-9 and sgRNA in a cell, it will generate site-specific DNA cuts at our targeted locus.
You probably noticed the phrase “almost as simple” in the preceding paragraph. Yes, it’s true, not even the CRISPR system can actually just cut any sequence with no rules; it turns out that it must absolutely have a DNA motif called a PAM (protospacer adjacent motif) directly adjacent to the target region. Fortunately, for the commonly used Cas-9, a PAM is any element of the form “NGG,” where N is any nucleotide. Each G has a 1-in-4 chance of occurring at a given position in random sequence, so on average we can expect a PAM about every 16 base pairs along a strand. So while CRISPR is not strictly allowing us to cut any DNA sequence, it’s generally rare that we can’t get within a few nucleotides of where we want to cut.
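To see what hunting for candidate sites looks like in practice, here is a minimal sketch that scans both strands of a sequence for “NGG” and reports the 20 bases lying immediately 5' of each PAM. The function names and the example sequence are made up for illustration. Real guide-design tools also weigh factors such as off-target matches, which this deliberately ignores.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_targets(seq: str, guide_len: int = 20):
    """List (strand, start, target, pam) for each NGG with room for a target.

    The target is the guide_len bases immediately 5' of the PAM. For the
    "-" strand, start is counted along the reverse complement.
    """
    targets = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        # A lookahead lets overlapping PAMs (as in "GGGG") all be found.
        for match in re.finditer(r"(?=([ACGT]GG))", s):
            pam_start = match.start()
            if pam_start >= guide_len:
                target = s[pam_start - guide_len:pam_start]
                targets.append(
                    (strand, pam_start - guide_len, target, match.group(1))
                )
    return targets


# Made-up example sequence: a 20-base target followed by the PAM "AGG".
example = "ATGCATGCATGCATGCATGCAGGTT"
for hit in find_targets(example):
    print(hit)
```

Searching both strands matters: a PAM on either strand will do, so usable sites are roughly twice as dense as the one-strand figure of 1 in 16 suggests.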
This, then, relates to the second part of using CRISPR in doing gene editing. We’ve made a couple of cuts; now how do we put this back together (and ideally, with our intended alterations)? To do this, we take advantage of the cell’s natural DNA double-strand break repair machinery, which will try to fix any double-stranded cuts of this type. One such mechanism, non-homologous end joining or NHEJ, is error-prone and useful if our intent is to shut off a gene; it will act to close up the Cas-9 cuts but will often add or remove a few bases in the process, introducing a frameshift and rendering the rejoined gene non-functional. In the second repair method, homology-directed repair or HDR, the cell looks for DNA elements homologous to the break region and in effect “copies” these into the cut. If we provide the cell with an excess of a short DNA element with both homology to the cut region and our intended changes at the cut site, the cell has a good chance of using this as its repair template, and we’ll have succeeded in making a site-specific modification of our target.
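The difference between the two repair outcomes shows up clearly at the sequence level. What follows is a toy model, not a simulation of the real repair machinery: NHEJ is reduced to deleting a few bases at the cut, HDR to copying a template between two matching flanks, and the short “gene” is invented for the example.

```python
def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete triplets from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def nhej_deletion(seq: str, cut: int, n: int) -> str:
    """Toy NHEJ: rejoin the ends after losing n bases at the cut."""
    return seq[:cut] + seq[cut + n:]


def is_frameshift(original: str, repaired: str) -> bool:
    """The reading frame shifts unless the length change is a multiple of 3."""
    return (len(repaired) - len(original)) % 3 != 0


def hdr_repair(seq: str, template: str, arm_len: int) -> str:
    """Toy HDR: find the template's two flanking arms in seq and copy
    the template, including whatever lies between its arms, into place."""
    left, right = template[:arm_len], template[-arm_len:]
    i = seq.find(left)
    j = seq.find(right, i + arm_len) if i != -1 else -1
    if i == -1 or j == -1:
        return seq  # no homology found, so this template is not used
    return seq[:i] + template + seq[j + arm_len:]


gene = "ATGGCTGAAACCTTGGGA"  # invented 6-codon example

knocked_out = nhej_deletion(gene, cut=7, n=1)
template = "ATGGCT" + "GCA" + "ACCTTG"  # arms flank a GAA -> GCA change
edited = hdr_repair(gene, template, arm_len=6)

print(codons(gene))
print(codons(knocked_out), is_frameshift(gene, knocked_out))
print(codons(edited), is_frameshift(gene, edited))
```

Losing a single base rewrites every codon downstream of the cut, which is why NHEJ is a good way to break a gene. The templated repair changes one codon and leaves the rest of the reading frame untouched.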
Note that with this HDR mechanism, the length of our “repair template” is variable, which in effect compensates for the need for a PAM sequence. While we may have had to move our cleavage site a short distance away from where we want the exact change, a repair template long enough to span both positions lets HDR write our change across the exact nucleotide(s) of interest.
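As a rough illustration of how a template can bridge the gap between the cut and the desired change, the sketch below builds one that covers both positions plus a flank of homology on either side. The function name, the coordinates and the default flank length of 40 bases are all assumptions made for the example, not recommendations.

```python
def build_repair_template(genome: str, cut_pos: int, edit_pos: int,
                          new_base: str, arm_len: int = 40):
    """Return (start, template) for a single-base change.

    The template matches the genome everywhere except edit_pos and runs
    from arm_len bases before the first of (cut, edit) to arm_len bases
    after the last of them, clipped to the ends of the sequence.
    """
    if not (0 <= cut_pos <= len(genome) and 0 <= edit_pos < len(genome)):
        raise ValueError("cut_pos or edit_pos lies outside the sequence")
    start = max(0, min(cut_pos, edit_pos) - arm_len)
    end = min(len(genome), max(cut_pos, edit_pos + 1) + arm_len)
    edited = genome[:edit_pos] + new_base + genome[edit_pos + 1:]
    return start, edited[start:end]


# Invented example: the nearest usable cut is at position 20, but the
# base we want to change (C -> T) sits 7 bases upstream, at position 13.
genome = "ACGT" * 10
start, template = build_repair_template(genome, cut_pos=20, edit_pos=13,
                                        new_base="T", arm_len=5)
print(start, template)
```

The further the cut sits from the intended change, the longer the template has to be to keep homology on both sides of each.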
To summarize, this system allows us to select (almost) any DNA target element of about 20 bp; design an sgRNA and possibly an HDR element to direct the repair; and then use these to go into our cell of interest and make efficient, highly site-specific modifications. There are a number of variations and improvements on this (such as alternatives to Cas-9 with more desirable behavior), but the basic approach is the same.
This CRISPR-Cas9 system thus provides us with a very powerful tool with which to target our genetic editing, but how do we actually apply this to a large multicellular organism such as ourselves? We obviously can’t edit all the cells in a multicellular organism, and editing one or two somewhere in isolation would just be a form of somatic mutation: of impact on the cell or few cells changed, but not of widespread impact on the whole organism. This hurdle in gene therapy is, of course, by no means unique to the CRISPR approach, and it’s for this reason that diseases impacting, for example, bone marrow are most readily addressable. A sample of cells can be taken, modified in vitro through an engineering method such as this, and then the selected reengineered cell(s) can be reintroduced into a suitably prepared host, where they can then clonally expand and provide the needed or repaired gene function. To date, CRISPR-based methods have been used successfully in some animal models, and the first human clinical trials (in contexts of sickle-cell anemia and thalassemia) are expected to begin in Europe some time this year.
While the CRISPR-Cas9 system is thus no magic wand for treating genetic diseases, it is a powerful tool in the genetic engineering kit and one which starts to bring ease of targeted manipulation of complex eukaryotic cells more in line with the tools we have for bacterial systems—those that we put in the hands of first-year undergrads. Combined with improved ways to selectively deliver these genetic tools to the critical cells of interest in a given disease, this method and its emerging derivatives will start to allow for direct treatment of an increasing spectrum of diseases. Laboratorians can expect to hear more of this system and its applications in the future.