Back to Basics: Array diagnostics

Oct. 24, 2017

In this month’s column, we are going to continue with our “Back to Basics” theme by reviewing what underlies a common molecular diagnostics (MDx) laboratory method, microarray-based diagnostics. We will also take the opportunity to see how its use has changed in the few years since it was last covered in this space.

What’s an array, anyway?

First, let’s remind ourselves what an “array” is in this context. It’s most accurately described as a spatially distinguishable set of interrogatable probes for specific short nucleic acid targets. If that seems like a rather meaningless juxtaposition of words, you’ve come to the right place: read on. The most traditional format for a microarray is a small silica (glass) piece or “chip,” perhaps about the size of a small postage stamp, held in a defined orientation in some sort of carrier. This chip provides a piece of spatially referenced real estate, divided into a grid of rows and columns; within each referenced location, many identical copies of a user-defined nucleic acid oligonucleotide are tethered down at one end via a linker molecule so that they project up, rather like tiny hairs.

Each of these oligonucleotides is thus free to hybridize to its complementary target sequence, assuming something along the lines of a Southern blot is performed. That is, the chip surface is immersed in a suitable buffer at an appropriate annealing temperature for the hybridization reactions in question, and thermodynamics is allowed to assert its authority.

This drives hybridization between tethered probes and any in-solution nucleic acid strands that are their complement (or at least a close match). It is then followed by a few rounds of washing to remove weakly bound, nonspecific material, and the result is an array chip where any grid spot that had a matching nucleic acid molecule in solution has captured it and localized it to a unique, known grid address.
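As a rough illustration of that capture step, the short Python sketch below treats an array as a lookup from grid address to probe sequence, and reports which addresses would capture a given strand. Everything in it (the probe sequences, the 2 x 2 layout, the function names) is invented for illustration, and it only models perfect complementarity; real hybridization also tolerates close matches, which this toy ignores.

```python
# Toy model only: perfect-match capture, hypothetical probes and grid layout.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def capturing_addresses(array, target):
    """Return grid addresses whose probe is exactly complementary
    to some stretch of the target strand."""
    target = target.upper()
    return sorted(
        address
        for address, probe in array.items()
        if reverse_complement(probe) in target
    )


# Illustrative array: (row, column) -> tethered probe sequence
toy_array = {
    (0, 0): "ACGTTGCA",
    (0, 1): "GGATCCAT",
    (1, 0): "TTAGGCAT",
    (1, 1): "CCGTAAGC",
}

# A sample strand that carries the complement of the probe at (1, 0)
sample_strand = "GGG" + reverse_complement("TTAGGCAT") + "AAA"
hits = capturing_addresses(toy_array, sample_strand)
print(hits)
```

The point of the sketch is simply that the grid address, not the probe itself, is what the reader of the array ends up observing.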

Probes can have variety, too

Let’s pause for a moment there to consider some of the potentially useful variations we might do on the chip-bound oligo side. Above, we only referred to the spatially fixed items as oligonucleotides. The exact chemical nature of these oligonucleotides is up to us at time of chip manufacture, and while they are commonly made of “garden variety” DNA, we can employ tools such as degeneracy (that is, a mix of more than one nucleotide at a position in a probe sequence, allowing for perfect match to more than one sequence variant at that nucleotide position) or non-canonical bases such as inosine (again, allowing for controlled degeneracy in hybridization matching). Other useful tools might be the use of peptide nucleic acids (PNA) or locked nucleic acids (LNA) as probe components, as these provide for stronger (more specific) target binding than purely natural bases.
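To make the idea of degeneracy concrete, here is a minimal sketch that expands a degenerate probe into every specific sequence it represents, using the standard IUPAC ambiguity codes (R for A or G, Y for C or T, and so on). The example probe and the function name are made up for illustration.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(probe):
    """List every concrete sequence represented by a degenerate probe."""
    choices = [IUPAC[base] for base in probe.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical probe with two degenerate positions (R and Y)
variants = expand_degenerate("ACRTYG")
print(variants)  # ['ACATCG', 'ACATTG', 'ACGTCG', 'ACGTTG']
```

Note how quickly the count grows: each two-fold degenerate position doubles the number of sequences sharing a single spot.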

What can be spotted down as the captive probe at each grid point is open to a great deal of imagination. One factor that tends to limit wild flights of fancy is the fact that really short probes don’t work very well; mathematically, they just don’t have much sequence specificity, and they require awkwardly low temperatures for hybridization and washing. Really long probes also don’t work well; they have increased likelihood of binding to partial matches, and they can start to have steric hindrance or self-interactions, such as hairpin or homodimer formation, that make them poorly available to interact with sample in the liquid phase. In addition, if we expect to use the array at a single hybridization and wash temperature for all targets, then all probes should have annealing temperatures that match within a small window of variation (probably less than 1°C).
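Both constraints can be put into rough numbers. The sketch below estimates how often a specific sequence of a given length would turn up by chance in random sequence, and screens candidate probes for matching melting temperature. The melting temperature uses the Wallace rule (2°C per A or T, 4°C per G or C), which is a crude approximation for short oligonucleotides and is swapped in here purely for simplicity; real probe design would use nearest-neighbor thermodynamics. The genome size of three billion bases, the candidate probes, and the function names are all assumptions for illustration.

```python
def wallace_tm(probe):
    """Rough melting temperature in deg C by the Wallace rule:
    2 * (A + T) + 4 * (G + C). Only a crude guide for short oligos."""
    probe = probe.upper()
    at = probe.count("A") + probe.count("T")
    gc = probe.count("G") + probe.count("C")
    return 2 * at + 4 * gc


def expected_random_hits(probe_length, genome_size=3_000_000_000):
    """Expected chance occurrences of one specific sequence of this
    length in random sequence of genome_size bases (assumed size)."""
    return genome_size / 4 ** probe_length


def within_window(probes, target_tm, window=1.0):
    """Keep probes whose estimated Tm is within `window` deg C of target."""
    return [p for p in probes if abs(wallace_tm(p) - target_tm) <= window]


# Hypothetical candidate probes, all 16 bases long
candidates = ["ACGTACGTACGTACGT", "GGGCCCGGGCCCGGGC", "ATATATGCGCATATGC"]

for probe in candidates:
    print(probe, wallace_tm(probe))

matched = within_window(candidates, target_tm=48)
print(matched)

# Short probes match by chance many times over; longer ones essentially never
print(expected_random_hits(8), expected_random_hits(20))
```

Under these assumptions an 8-base probe would be expected to match tens of thousands of places by chance, while a 20-base probe would be expected to match far less than once.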

Spot detection

The next thing to contemplate is how to detect which array grid spots have bound to targets from the liquid sample they were immersed in. The most common methods here are photonic- (optical-) based, and are most easily achieved if we pretreat the liquid test sample so as to add some form of fluorescent label to all of the nucleic acids it contains. Using this method, our array readout methods are straightforward digital image capture of the array area, and spatial detection and differentiation of the glowing spots which indicate captured, labeled target material. An inherently helpful aspect of this approach is that optical readout resolving power permits very close spacing of individual array grid spots, or, put another way, very high spot density.
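In its simplest form, that readout amounts to deciding which grid addresses are bright enough to count as a captured target. The sketch below shows one naive way to do it: call any spot whose intensity is at least some multiple of background. The intensity values, the three-fold cutoff, and the function name are invented for illustration; real image analysis software does considerably more (spot finding, local background correction, normalization).

```python
def call_spots(intensities, background, min_fold=3.0):
    """Return (row, column) addresses whose signal is at least
    min_fold times the background level (hypothetical cutoff)."""
    return [
        (r, c)
        for r, row in enumerate(intensities)
        for c, value in enumerate(row)
        if value >= min_fold * background
    ]


# Made-up fluorescence intensities for a 2 x 3 corner of an array
grid = [
    [120, 2400, 110],
    [95, 130, 1800],
]

called = call_spots(grid, background=100)
print(called)  # [(0, 1), (1, 2)]
```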

Fluorescence detection is also amenable to limited multiplexing, meaning that we can differentially label multiple (usually, two) samples and detect them independently on a single array.

For these most traditional silica microarrays with fluorescent readout, the number of distinct grid spots (probes) per chip area is limited by mechanical aspects in the chip production process, not readout resolution. If we want to ask how many indexed spots or grid reference points we can fit on a microarray of this type, it gets a bit into how the array is made. The simplest method mechanically spots tiny droplets of the desired pre-made full length probes at their intended grid points, and these chemically adhere; in this method, the density is limited by the mechanical step size of the spotting or “printing” instrument (and by the need to place the tiny spots far enough apart that they don’t bleed into each other and intermix during printing). A second approach uses photolithography to define and chemically activate array grid spots for in-situ synthesis of desired oligonucleotide probes right on the silica surface; as this is optically driven rather than a purely mechanical approach, it’s at least theoretically capable of higher grid densities than direct spotting. In reality, the end user probably has little concern about which method was used; suffice it to say methods exist to reliably create two-dimensional silica chips with well over a million discrete spots or “features” present.

(As an aside, now that we have a feel for the number of features we could have on a microarray, it starts to become apparent that while we could introduce things like degeneracy within a single spot, it probably makes more sense to just have two or more spots as needed to represent each sequence variation uniquely; then we can actually identify which of the possible sequence forms is present, rather than lumping them together. It’s up to the array designer to decide, though, demonstrating the sort of flexibility one can have with microarray methods.)

If traditional microarrays are fixed oligonucleotide spots on silica wafers with spatial indexing and fluorescent detection of target capture, what are some of the variations on this? While space limitations restrict us from going into all of the other microarray formats and approaches possible, it’s worth mentioning at least one other common format. This is the fluid-phase bead array approach, where rather than attaching oligonucleotide probes to a flat silica surface, we attach them to differentiable microscopic beads. Different bead types can be told apart either by color code, or actual tiny monochrome barcode-like markings; each bead type is then coupled to a single probe.

These types of arrays are also generally read out by optical methods based on fluorescence, but tend to be limited to a few hundred features at most (it becomes hard to differentiate many more bead types than that). While that’s a disadvantage compared to 2D silica arrays for feature density, liquid phase hybridization kinetics can make bead type arrays faster than their competitors. It’s also possible to rapidly customize a bead-based array by adding or removing one bead type with its probe, while 2D silica arrays, once printed, are fixed. On the detection side, one variation is the use of electrochemical methods for spot readout rather than fluorescence. This approach is used in some clinical service array-based devices, but a caveat here is that the limited spatial resolution of detection by this method means these forms of 2D arrays have very low feature densities.

Common types of array assays

So we’ve reminded ourselves of what the common forms of a microarray are, and how they’re read out; what is it that we can do with them, and has that changed (or its practical utility changed) in the past few years? First, let’s summarize the list of some of the most common microarray applications:

Expression arrays. These work by collecting and labelling expressed mRNAs in a sample, and then hybridizing to an array with probes for various genes of interest. Probes can be specific for individual isoforms or splice variants; data obtained is not just presence or absence of particular mRNAs, but also relative abundance.

Array CGH. As covered in detail in the June 2014 installment of this column (“Array CGH: mechanisms and applications,” https://www.mlo-online.com/array-cgh-mechanism-and-applications.php) this technique in a nutshell differentially labels whole genome DNA from a “control” source and a “sample” source, then attempts to hybridize for markers evenly distributed across the genome. Competition for hybridization between sample and control means that duplications and deletions in the sample are readily detected by this method.

Resequencing arrays. These arrays represent selected, limited regions of the genome in a series of oligonucleotides which both “tile” (overlap in sequence coverage) and collectively represent possible sequence variations. By measuring which of these possible sequence versions hybridize to the sample, the sample sequence from the region of interest, such as the whole ~16 kb mitochondrial genome, is read out.

SNP arrays. These interrogate large numbers of (ideally) uniformly, randomly distributed single nucleotide polymorphisms (SNPs) across the genome. These are helpful in detecting issues such as loss of heterozygosity (LOH; for example uniparental disomy of a chromosome).

Use as a detection method for highly multiplexed PCR assays. Conventional real-time PCR systems can multiplex a handful of targets (possibly up to as many as six, although three or four are more frequently feasible), but imagine being able to set up a PCR reaction for the detection of possibly hundreds of targets at once. Microarrays and, in particular, smaller ones such as the liquid phase types described above, provide an excellent approach for detecting which of the possible reaction products are formed in such a test. Note that since this is an endpoint PCR detection, it provides qualitative data only, but such may be of use, for example, in infectious disease settings where any detection is diagnostic.
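To give a feel for the arithmetic behind one item on this list, array CGH, the sketch below compares the two differentially labeled signals at each probe as a log2 ratio of sample to control, and flags probes that suggest a gain or a loss. This is a simplified illustration rather than any particular vendor’s analysis: the intensities, probe names, and the 0.3 cutoff are all hypothetical, and real pipelines normalize the data and look for runs of adjacent probes instead of calling single spots.

```python
import math


def log2_ratio(sample_signal, control_signal):
    """log2 of sample over control intensity; 0 means equal copy number."""
    return math.log2(sample_signal / control_signal)


def call_copy_change(ratio, threshold=0.3):
    """Classify a log2 ratio using a hypothetical symmetric cutoff."""
    if ratio >= threshold:
        return "gain"
    if ratio <= -threshold:
        return "loss"
    return "normal"


# Made-up (sample, control) intensities for three probes
probes = {
    "probe_A": (1000, 1000),  # equal signal
    "probe_B": (1500, 1000),  # three copies vs. two: log2(1.5) is about 0.58
    "probe_C": (500, 1000),   # one copy vs. two: log2(0.5) is exactly -1
}

calls = {
    name: call_copy_change(log2_ratio(sample, control))
    for name, (sample, control) in probes.items()
}
print(calls)
```

Note the asymmetry: in a diploid genome, a single-copy loss gives a larger shift in log2 ratio than a single-copy gain.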

In general, this summary list of what we can do with microarrays hasn’t really changed in the past five years or so. Their practical utility in some contexts, however, has changed, primarily in those applications where arrays were (are) used to screen large amounts of genetic information such as whole genome expression studies or array CGH. When microarrays first started becoming popular in clinical applications, they represented the most cost-effective approach to genome-wide measurements of a range of selected targets. The biggest change in that over the past few years has been the steady declines in cost and technical difficulty for next generation sequencing (NGS), and the increasing accuracy and throughput of those methods.

For labs currently equipped with microarray instrumentation and with established operational workflows for sample processing and data interpretation, microarray methods will likely remain competitive for some years to come. For a lab just looking now to establish tools for genome-wide, high-throughput analyses, however, consideration of NGS as an alternate platform is warranted, as it may be more flexible or cost-effective, depending on intended application. As NGS systems continue to become cheaper and easier to use, they are likely to further become the method of choice for these sorts of studies. Until then, the molecular laboratorian is likely to see both methods in use and of practical utility.

John Brunstein, PhD, is a member of the MLO Editorial Advisory Board. He serves as President and Chief Science Officer for British Columbia-based PathoID, Inc., which provides consulting for development and validation of molecular assays.