In the decades immediately following the elucidation of the DNA structure by Watson and Crick (and Franklin) and the formulation of the Central Dogma (that genetic information in DNA is transcribed into RNA messages, which in turn are translated into polypeptides), molecular biology seemed to have divulged its secrets and, on the whole, to be elegantly simple. DNA, double stranded and antiparallel so that each strand can act as a template for semi-conservative replication, was the informational storehouse. RNA, transcribed from distinct genes within the DNA, carried messages, and the ribosome, itself built of RNA and protein, translated these mRNA messages into proteins, the structural and enzymatic building blocks which constitute or synthesize other cell constituents.
The Central Dogma... kind of
Of course, this didn’t address how gene expression was regulated. In the decades since the Central Dogma was formulated, increasingly complex mechanisms have been identified that allow individual cells to regulate their gene expression according to cell-specific developmental and environmental needs. In humans, one particularly important mechanism is the methylation of DNA. Yes, it’s true; the four nitrogenous bases (adenine, guanine, thymine, and cytosine), which we learned as the inviolate structures at the basis of Watson-Crick pairing, aren’t actually inviolate. Under some conditions, methyl (-CH3) groups can be added to specific bases within a DNA double helix, and these can influence the overall structure of the surrounding chromatin.
In humans this primarily occurs on cytosine residues when they occur immediately 5′ (“upstream”) to a guanine residue, or what is usually referred to as a CpG dinucleotide. (The “p” stands for the intervening phosphate group). Clusters of these CpG dinucleotides, known as CpG islands, occur scattered throughout the human genome. Methylation of even a relatively small percentage of CpG dinucleotides in a region can lead to chromatin restructuring, which tends to condense or “close up” the nearby DNA sequence. In this condensed format, the DNA is less accessible to the enzymes required for promoter recognition and transcription, leading to what is referred to as silencing: in effect, suppressing or completely shutting off expression of genes in the surrounding area. Such methylation-driven silencing plays a role, for instance, in X-chromosome inactivation, the dosage compensation activity which occurs in female somatic cells whereby one X chromosome is shut down, giving male (XY) and female (XX) cells an effectively equal dosage of X-linked genes.
This is of course of interest in and of itself, but it has also become increasingly apparent that inappropriate CpG methylation-driven silencing is an important aspect of cellular deregulation in some cancers. In one example, inappropriate methylation of the BRCA1 gene has been found in association with ovarian cancer, and this methylation correlates with responsiveness to specific chemotherapy agents.1 In this and a widening spectrum of other cancer types, it’s therefore desirable to interrogate the methylation status of particular genes in determining optimal therapy regimens. A number of commercial systems are now available to address this question; while they differ in respect to their platform and final method of result readout, most rely on a common underlying molecular technology to make the actual determination of whether a given CpG is methylated. Our focus for this month’s article is on the mechanism of this core technology.
The essence of this method is in a chemical treatment of purified DNA with bisulphite ions (HSO3–). When performed under appropriate (and fairly narrow) time and concentration conditions, this has the convenient effect of converting normal, unmethylated cytosine residues to uracil residues, while leaving 5-methylcytosine (our methylated form of interest) unchanged.
The details of the reaction are somewhat beyond the scope of this article; essentially, normal cytosine has a C=C double bond (between carbons 5 and 6 of the ring) which is susceptible to addition of a sulphonate group from the bisulphite ion. The resulting sulphonated intermediate readily loses its exocyclic amine group by hydrolysis, replacing the –NH2 with =O, which is the difference between cytosine and uracil bases; the sulphonate group is then removed under alkaline conditions to leave uracil. In 5-methylcytosine as it exists in a methylated CpG dinucleotide, the methyl group on carbon 5 makes this double bond far less reactive toward bisulphite, so the base is largely left intact.
For our purposes, the effect of bisulphite treatment on a methylated versus unmethylated DNA sequence can be summarized in Figure 1.
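The same outcome can be sketched in a few lines of Python. This is only an idealised illustration: it assumes complete conversion of every unmethylated cytosine and perfect protection of every methylated one, and the function names and the example sequence are invented for the purpose.

```python
def cpg_positions(seq):
    """Return 0-based indices of every C that sits immediately 5' of a G."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def bisulphite_convert(seq, methylated_positions=frozenset()):
    """Return the strand after idealised bisulphite treatment.

    seq: DNA string (A, C, G, T), written 5'->3'.
    methylated_positions: 0-based indices of cytosines carrying a 5-methyl group.
    Assumption: conversion is complete and methylated cytosines never react.
    """
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("U")   # unmethylated cytosine is deaminated to uracil
        else:
            converted.append(base)  # 5-methylcytosine and all other bases unchanged
    return "".join(converted)


# Hypothetical 12-base template containing two CpG dinucleotides.
template = "ATCGTTCAGCGA"
sites = cpg_positions(template)

print(bisulphite_convert(template))              # no CpG methylated
print(bisulphite_convert(template, set(sites)))  # both CpGs methylated
```

Note that the cytosine outside a CpG context is converted in both cases; only the CpG cytosines differ between the two treated templates.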
The bisulphite treatment has therefore (at least on paper) neatly distinguished the two forms—methylated and unmethylated—of our target sequences of interest, in the form of differing DNA template molecules suitable for a range of downstream analytical methods capable of determining which template we have. Let’s consider what some of those downstream methods are.
Methyl-specific qPCR. In this approach, we would design a PCR primer which can only amplify from either the C (originally methylated) or the U (originally unmethylated) DNA. Most commonly this is done by placing the variant nucleotide at the final 3′ position of the primer, making this a “normal” allele-specific PCR reaction; a non-proofreading DNA polymerase can only extend from the primer to make a successful PCR amplicon if this critical 3′ nucleotide is paired to its template strand. Ideally we’d run this as a paired set of reactions: one each with the C-specific or U-specific primer, sharing a common primer for the other end of the PCR product, and detected by a real-time methodology. The purpose of the real-time method here is to allow for relative quantitation of the methylated versus unmethylated forms, because, as the reader might surmise, methylation of a particular genetic sequence might not occur equally across all cells, and the most useful measurement will be one taken from a population of cells and reported as % methylation. (Classical end-point PCR here would be expected to just yield qualitative positive signals for both the methylated and unmethylated sequence forms regardless of whether one form largely predominates or not; a rather uninformative answer.)
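One simple way the paired real-time results might be turned into a % methylation figure is sketched below. It assumes that the methylated-specific and unmethylated-specific reactions amplify with the same per-cycle efficiency and are read at the same threshold, which real assays would need to verify; the function name and the Ct values are hypothetical.

```python
def percent_methylated(ct_methylated, ct_unmethylated, efficiency=2.0):
    """Estimate % methylation from the Ct values of the paired reactions.

    Assumptions: both reactions share the same per-cycle amplification
    efficiency (2.0 means perfect doubling) and the same detection threshold,
    so starting template is proportional to efficiency ** (-Ct).
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    methylated = efficiency ** (-ct_methylated)
    unmethylated = efficiency ** (-ct_unmethylated)
    return 100.0 * methylated / (methylated + unmethylated)


# Hypothetical run: the methylated-specific reaction crosses threshold two
# cycles earlier, i.e. roughly four times more methylated template.
print(round(percent_methylated(25.0, 27.0), 1))
```

Under these assumptions, equal Ct values correspond to 50% methylation, and each cycle of difference shifts the ratio between the two forms by a factor of the efficiency.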
Direct sequencing. In this version of the approach, following bisulphite modification, we’d use PCR with primers flanking the methylation site of interest to generate sufficient amounts of template for sequencing by either Sanger methods or pyrosequencing. Our sole interest in this case is the identity (C or U) of a very specific nucleotide (or sometimes a few within a fairly small region), making either of these approaches very straightforward here. During the PCR replication, the base across from the nucleotide of interest will be replicated as either a G (pairing to C) or an A (pairing to U), and thus be readily distinguishable in the sequence result. Both Sanger sequencing and pyrosequencing are capable of reporting at least semi-quantitatively on the ratio between the C and U forms at the template nucleotide(s) of interest, providing the same quantitative value for methylation percentage as discussed above in the context of methyl-specific PCR. While more complex in terms of workflow and equipment than the methyl-specific PCR approach, direct sequencing can report on the methylation status of each of several closely associated CpG dinucleotides (any that fall within the tested amplicon), as opposed to the single site queried by methyl-specific PCR.
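The per-site readout can be illustrated with a minimal sketch. It assumes a set of individual reads of the amplicon (for example from cloned PCR products) that are already aligned to the unconverted reference without gaps; after PCR, a position that was U in the treated template reads as T on the sequenced strand. With a direct Sanger or pyrosequencing trace, the same ratio would instead be taken from the C and T signal intensities. The function name and the sequences are invented for illustration.

```python
def per_site_methylation(reference, reads):
    """Percent of reads retaining C at each CpG cytosine of the reference.

    reference: unconverted sequence of the amplicon, 5'->3'.
    reads: post-PCR sequences aligned to the reference without gaps
           (same length); unmethylated cytosines appear as T.
    Returns {0-based position: percent methylated}; sites with no C or T
    call in any read are left out.
    """
    reference = reference.upper()
    result = {}
    for pos in range(len(reference) - 1):
        if reference[pos:pos + 2] != "CG":
            continue  # only cytosines in a CpG context are scored
        calls = [read[pos].upper() for read in reads]
        retained = calls.count("C")   # protected, so originally methylated
        converted = calls.count("T")  # converted, so originally unmethylated
        if retained + converted:
            result[pos] = 100.0 * retained / (retained + converted)
    return result


# Hypothetical amplicon with CpG cytosines at positions 2 and 9.
reference = "ATCGTTCAGCGA"
reads = ["ATCGTTTAGTGA", "ATCGTTTAGCGA", "ATTGTTTAGTGA", "ATCGTTTAGTGA"]
print(per_site_methylation(reference, reads))
```

In this made-up data set the first CpG retains C in three of four reads and the second in one of four, so the two sites receive separate methylation percentages.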
Array capture. A third approach is to capture the post-treatment DNA on an array (in any of its possible forms, such as solid phase or bead-based) where there are array elements specific for the hybridization of either of the two possible sequence fragments (as above, C-containing indicating methylated original material, and U-containing indicating unmethylated original material). While there is only a single nucleotide difference between the two forms, careful selection of hybridization conditions and quantitative analysis of the hybridized signal for each of the two array elements provide data from which to assess the relative percentage of methylation present in the starting material.
Mass spectrometry. The post-treatment material can be amplified by flanking PCR primers much as in the sequencing-based approaches, but then applied to a mass spectrometer, which can readily distinguish the mass difference between the C-containing and U-containing sequence products, and their relative ratios. (The reader may be tempted to ask why one can’t then just run mass spec on the unmodified DNA, as the methyl group itself is an effective mass tag. The answer is that one could, if enough of a uniform DNA fragment containing the site of interest were available to feed into the mass spectrometer; however, generally this is not the case, so bisulphite treatment and PCR are needed to generate this uniform flanking fragment). While mass spectrometry can provide ratios of the methylated to unmethylated signals, unlike sequencing it does not provide this on a by-position basis for an amplicon containing more than one CpG site, and if there are multiple possible methylation sites within a test amplicon, the mass spec signal may yield a “ladder” of mass signals, each step separated by the mass difference of a single C-to-U change. This approach is easiest when performed on a single possible CpG site, as is the case in methyl-specific PCR.
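To see why a ladder gives an average but not a by-position answer, consider the sketch below. It assumes that each rung of the ladder corresponds to fragments with a given number of methylated sites and that signal intensity is proportional to fragment abundance; both are simplifications, and the function name and intensity values are hypothetical.

```python
def mean_methylation_from_ladder(intensities):
    """Average % methylation across all CpG sites in an amplicon.

    intensities[k] is the signal for fragments carrying k methylated sites,
    for k = 0..n, where n is the number of CpG sites in the amplicon.
    Assumption: signal intensity is proportional to fragment abundance.
    """
    n_sites = len(intensities) - 1
    total = sum(intensities)
    if n_sites < 1 or total <= 0:
        raise ValueError("need at least one CpG site and some signal")
    methylated_sites = sum(k * signal for k, signal in enumerate(intensities))
    return 100.0 * methylated_sites / (n_sites * total)


# Two hypothetical ladders for an amplicon with two CpG sites.
half_fully_methylated = [50, 0, 50]  # molecules are either 0/2 or 2/2 methylated
all_half_methylated = [0, 100, 0]    # every molecule has exactly one site methylated
print(mean_methylation_from_ladder(half_fully_methylated))
print(mean_methylation_from_ladder(all_half_methylated))
```

Both ladders give the same average even though the underlying molecules differ, and in the second case nothing in the ladder says which of the two sites carries the methyl group.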
Implementations of each of these methods, all hinging on bisulphite modification to separate methylcytosine from cytosine, are available and in use for research and, increasingly, clinical applications in oncology. Other less frequently encountered methods also based on bisulphite treatment exist, including high resolution melting (HRM). HRM analysis may distinguish the C-G pair-containing end product derived from methylated starting material from the U-A-containing end product derived from unmethylated starting material. This approach, however, has the challenge that a clear relative percentage of the two species is not provided. Still other methods omit bisulphite treatment entirely, such as single-stranded conformational analysis. For now, however, the majority of DNA methylation assays that clinicians will encounter when interacting with their local molecular translational research lab are based around this core approach.
1. Veeck J, Ropero S, Setien F. BRCA1 CpG island hypermethylation predicts sensitivity to poly(adenosine diphosphate)-ribose polymerase inhibitors. JCO. 2010;28(29):e563-e56