Immune repertoire analysis is gaining interest as a clinical NGS application

By: John Brunstein   

When most laboratorians think of next generation sequencing (NGS) techniques in a clinical setting, it’s usually in the context of examining a patient’s genome for particular disease susceptibility markers, profiling of cancers, or detecting exogenous sequences such as pathogens.

This month’s installment of The Primer will focus on another application which has been steadily gaining interest over the past half-decade: sequencing of the variable recombinant portions of B and T cell genomes, in a process known generally as immune repertoire analysis. Readers may also have come across one form of this under the name of “spectratyping.”

The technique stems from looking at the patient’s adaptive immune system. In a much simplified explanation, this is the system whereby developing B and T cells undergo recombination at defined genetic regions known as V – (D) – J (Variable, Diversity, and Joining) regions to develop unique antibodies and T cell receptors (TCRs), respectively. Based on their unique peptide sequences, these can create specific binding affinity for “non-self” ligands such as those present on pathogens or on transplanted organs. While such recombination is essentially random in nature and thus constantly samples a wide range of “sequence space” or potential binding surfaces, those individual B and T cells whose recombinant receptors find a “non-self” match are positively selected for and undergo clonal expansion as the basis for adaptive immunity.

These V – (D) – J regions are relatively short, in the range of 500 bp each, and flanked by well conserved sequences. This combination makes the raw data collection particularly amenable to NGS techniques, which provide massive numbers of short parallel sequence reads. While this method has been applied to specific organ samples, it more generally employs readily collected peripheral whole blood from which the white cell fraction, containing the B and T cells, is separated and subjected to nucleic acid extraction. This extract is then PCR-amplified by a defined primer set directed against the conserved flanking regions. The design of this primer set is non-trivial, as it should be designed for, and experimentally validated as having, minimal bias; that is, it should not artificially increase the apparent frequency of some V – (D) – J recombinants at the expense of others, as that will skew the data analysis.

While absolute “impartiality” of immunorepertoire amplification primer sets is something of an impossibility, some published primer sets such as Biomed-2 have been widely validated, and use of a consistent set such as this allows for comparison of results between samples. Note that it’s also possible to perform this analysis using extracted mRNAs, as opposed to genomic DNA, as the target material; while gDNA-based methods are believed to have higher sensitivity, mRNA-based approaches can take advantage of additional techniques (beyond the scope of this article) to avoid PCR bias.

In either case, the resulting sequence data are analyzed through application of bioinformatics. The exact statistical and control approaches underlying the analysis are complex, and differ in process details between competing published and proprietary approaches (no fewer than 11 of which are named in reference 1 at the end of this article just in conjunction with the spectratyping approach to immune repertoire analysis). While differing in mathematical detail, all of the methods seek to provide what can be broken down to essentially two types of measurement: immune repertoire diversity, and the presence or abundance of particular clonally expanded B or T cell types. Each of these measurements or markers can provide a different sort of insight into the underlying patient health, so let’s consider them in turn.

Diversity

The diversity of a person’s immune repertoire is a measure of how many different B and T cell types are present at a “snapshot” in time when the sample was taken. The larger this diversity, the better the likelihood of a pathogen being effectively bound by a B or T cell, allowing for subsequent clonal expansion and a targeted immune response. Based on the numbers of possible V, D, and J sequences present in the human genome, pure mathematical models suggest that more than 1×10^11 different combinations are possible. The actual biology is a bit more restricted, however, with only certain types of V – (D) – J recombinations actually found to occur. As an example, for just T cell receptor (TCR) types, a study by Dare et al. (reference 2) suggested that young, healthy adults have between 40,000 and 100,000 simultaneously circulating TCR types at any given time. This number is observed to decrease with age, by as much as an order of magnitude. B cell diversity may be observed to exceed that of T cells (reference 3), possibly due to mechanisms of somatic hypermutation and affinity maturation whereby B cells can undergo additional genetic changes in response to a ligand binding stimulus.

As the reader might guess, a measure of this diversity in a patient is a measure of the robustness of his or her adaptive immune system. Considered simplistically, if the diversity of available B and T cell receptors is low, there’s less chance that an incoming previously unencountered antigen will be recognized. While finding low immune repertoire diversity doesn’t indicate what the underlying cause is, it provides a warning of apparent immune suppression which can be helpful in making a diagnosis. When the cause is known or suspected and clinical intervention is occurring, ongoing measurement of immune repertoire diversity can help to track the response to therapy. Examples where this has been clinically applied include measuring restoration of diversity following successful highly active antiretroviral therapy (HAART) in HIV patients, and assessment of organ graft tolerance versus rejection.
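
To make the idea of a diversity measurement concrete, here is a minimal Python sketch. It is not the method of any of the published or proprietary packages mentioned above. It assumes each sequence read has already been reduced to a clonotype label (the short strings below are made up for illustration), and it reports richness (the number of distinct clonotypes seen) alongside the Shannon index, which is one commonly used diversity statistic and is chosen here purely as an assumption.

```python
import math
from collections import Counter


def diversity_summary(clonotype_reads):
    """Summarize diversity from a list of clonotype labels, one per read.

    Illustrative only: real pipelines also correct for sequencing error
    and sampling depth, which this sketch ignores.
    """
    counts = Counter(clonotype_reads)
    total = sum(counts.values())
    richness = len(counts)  # number of distinct clonotypes observed
    # Shannon index: higher when many clonotypes share the reads evenly
    shannon = -sum((n / total) * math.log(n / total) for n in counts.values())
    # Evenness rescales Shannon to the 0..1 range (1 = perfectly even)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    return {"richness": richness, "shannon": shannon, "evenness": evenness}


# Hypothetical reads: three clonotypes, one of them dominant
reads = ["CASSLGQ", "CASSLGQ", "CASSPTG", "CASRDNE", "CASSLGQ", "CASSPTG"]
summary = diversity_summary(reads)
print(summary)
```

A falling richness or Shannon value across serial samples from the same patient, processed with the same primer set, would be the kind of trend the text describes as a warning of reduced repertoire diversity.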

Clonal expansion

As described above, clonal expansion in a B or T cell type occurs after it has a productive binding interaction with a non-self ligand. In the context of an NGS approach, particular B or T cell V – (D) – J sequences which are expanded will appear as multiple reads of the same sequence. Determining whether multiple identical sequence reads are stochastic (that is, different B or T cells just happened to have the same recombined sequence) or are actually due to a clonal expansion in response to stimulus is a question for statistics; one rule of thumb as proposed in reference 3 suggests a minimum of 5x the read frequency of the “average level” of the most common V – (D) – J sequences in “healthy” baseline controls as a metric. While definitions such as this are of necessity somewhat vague, they do illustrate that in cases of actual clonal expansion, the expanded sequence should occur at a quite obviously higher rate than other sequences in the study.
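
The rule of thumb above can be sketched in a few lines of Python. This is one possible reading of it, not the procedure from reference 3: the function names, the choice of the ten most common clonotypes as the “most common” set, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def clonotype_frequencies(reads):
    """Return each clonotype's share of the reads (reads must be non-empty)."""
    counts = Counter(reads)
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def baseline_top_level(control_samples, top_n=10):
    """Average frequency of the top_n clonotypes, averaged over healthy controls.

    top_n=10 is an arbitrary illustrative choice.
    """
    per_sample = []
    for reads in control_samples:
        top = sorted(clonotype_frequencies(reads).values(), reverse=True)[:top_n]
        per_sample.append(sum(top) / len(top))
    return sum(per_sample) / len(per_sample)


def flag_expanded(patient_reads, baseline_level, fold=5.0):
    """Return clonotypes whose frequency is at least fold x the baseline level."""
    freqs = clonotype_frequencies(patient_reads)
    return {seq: f for seq, f in freqs.items() if f >= fold * baseline_level}


# Toy controls: 20 clonotypes with 5 reads each, so every frequency is 0.05
control = [f"ctrl{i}" for i in range(20) for _ in range(5)]
baseline = baseline_top_level([control, control])

# Toy patient: one clonotype holds 60 of 100 reads, the rest are singletons
patient = ["expanded"] * 60 + [f"rare{i}" for i in range(40)]
flagged = flag_expanded(patient, baseline)
print(baseline, flagged)
```

In this toy case the threshold works out to about 0.25 of all reads, so only the dominant clonotype is flagged, which mirrors the point that a true expansion should stand out quite obviously from the background.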

Monitoring for clonal expansion through NGS has been applied tangentially to clinical use in the assessment of vaccine effectiveness, and in direct clinical use for monitoring lymphomas. In the latter case, identification of a single clonotype as associated with the cancer allows for quantitative measurement of its reduction in response to treatment, ideally to undetectable levels. This is known as Minimal Residual Disease (MRD) analysis, and NGS approaches to it have been successfully applied in chronic lymphocytic leukemia (CLL) and acute lymphocytic leukemia (ALL) with sensitivities capable of detecting as little as a single tumor cell per million nucleated cells in peripheral blood.
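
A short Python sketch shows why a sensitivity of one cell per million depends heavily on how many cells are actually sampled. It rests on simplifying assumptions that are not from the studies themselves: read counts are treated as proportional to cell counts, cells are sampled independently, and the function names are hypothetical.

```python
import math


def mrd_per_million(clone_reads, total_reads):
    """Tumor clonotype level expressed per million reads.

    Assumes reads are proportional to cells, which ignores amplification bias.
    """
    return 1e6 * clone_reads / total_reads


def detection_probability(true_frequency, cells_sampled):
    """Chance that at least one tumor cell is present among the sampled cells.

    Uses 1 - (1 - f)^N, written with log1p/expm1 to stay accurate for tiny f.
    """
    return -math.expm1(cells_sampled * math.log1p(-true_frequency))


level = mrd_per_million(clone_reads=12, total_reads=4_000_000)
p_one_million = detection_probability(1e-6, 1_000_000)
p_three_million = detection_probability(1e-6, 3_000_000)
print(level, round(p_one_million, 3), round(p_three_million, 3))
```

Under these assumptions, sampling one million cells gives only about a 63 percent chance that a clone present at one per million is in the tube at all, while three million cells raises that to roughly 95 percent. A negative MRD result is therefore only as strong as the input cell number behind it.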

The same technology has also been applied in tracking the levels of introduced T cells (adoptive T cell transfers), where a population of several T cell clones with demonstrated efficacy against a cancer is expanded ex vivo and then added back to the patient in an effort to halt tumor progression. In this case, decreases in the population of the desirable clones could signal the need for a fresh addition. A third emerging application of tracking clonal expansion is in the context of autoimmune diseases, where research is attempting to identify particular V – (D) – J recombinants associated with autoimmune diseases with the plan of monitoring their levels in response to therapeutic options. Conceivably, if particular sequences are found to associate with particular autoimmune conditions, this could pave the way for earlier (pre-symptomatic) diagnosis, or even for some form of targeted suppression of the problematic clones.

Confounding issues

Both the diversity measurement and assessment of individual clonal expansions in immune repertoire can be confounded by technological pitfalls and quirks of biology. On the diversity side, a major concern arises through the potential for NGS error rates to lead to an overestimation of diversity; that is, some of the variant sequences reported and used for calculation really just arise through PCR or sequence read errors. Increases in accuracy of NGS methods, as well as methodological approaches including incorporation of spiked control sequences monitored for “misread” rates, can help to minimize this source of error. On the clonal expansion side, one example of a potentially misleading result might be in the case of benign monoclonal gammopathy, a long-known and not uncommon “condition” of (as the name suggests) little clinical importance, where for unknown reasons one particular B cell clone becomes highly overrepresented. Suitable bioinformatics, together with consideration of the entire clinical picture, are the tools to employ in separating these confounding effects from the more useful information available through immune repertoire analysis.
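
As an illustration of the spiked control idea, the following Python sketch estimates “misread” rates from reads of a control whose true sequence is known. It is deliberately simplified: it assumes the spike-in reads have already been separated from patient reads, that errors are substitutions only (so every read has the same length as the control), and the sequences shown are invented.

```python
def hamming(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def spike_in_error_rates(known_sequence, spike_reads):
    """Estimate per-read and per-base misread rates from a spiked control.

    Assumes substitution errors only; insertions and deletions would need
    an alignment step that this sketch omits.
    """
    mismatches = [hamming(known_sequence, read) for read in spike_reads]
    reads_with_error = sum(1 for m in mismatches if m > 0)
    total_bases = len(known_sequence) * len(spike_reads)
    return {
        "per_read": reads_with_error / len(spike_reads),
        "per_base": sum(mismatches) / total_bases,
    }


# Hypothetical control: 8 perfect reads and 2 reads with one wrong base each
known = "ACGTACGTAC"
reads = [known] * 8 + ["ACGTACGTAA", "TCGTACGTAC"]
rates = spike_in_error_rates(known, reads)
print(rates)
```

A per-read rate like the 20 percent in this toy example would mean that a sizable share of apparently unique patient sequences could be artifacts, so the diversity estimate should be interpreted, or corrected, with that rate in mind.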

For now, costs, speeds, and limited availability of NGS services to clinical standards limit the widespread adoption of this as part of the diagnostic arsenal of most molecular laboratorians. As these barriers to wider adoption fall, it is to be expected that immune repertoire profiling should continue its move from the research and special application settings and start to become a more normal component of assessing health status and effectiveness of therapies influencing the immune system.


  1. Six A, Mariotti-Ferrandiz ME, Chaara W, et al. The past, present, and future of immune repertoire biology – the rise of next-generation repertoire analysis. Frontiers in Immunology. 2013;4:413. doi:10.3389/fimmu.2013.00413.
  2. Dare R, Sykes PJ, Morley AA, Brisco MJ. Effect of age on the repertoire of cytotoxic memory (CD8+CD45RO+) T cells in peripheral blood: The use of rearranged T cell receptor gamma genes as clonal markers. Journal of Immunological Methods. 2006;308(1–2):1–12. doi:10.1016/j.jim.2005.08.016. PMID 16325196.
  3. Weinberger J, Jimenez-Heredia R, Schaller S, et al. Immune repertoire profiling reveals that clonally expanded B and T cells infiltrating diseased human kidneys can also be tracked in blood. PLoS ONE. 2015;10(11):e0143125. doi:10.1371/journal.pone.0143125.



John Brunstein, PhD, is a member of the MLO Editorial Advisory Board. He serves as President and Chief Science Officer for British Columbia-based PathoID, Inc., which provides consulting for development and validation of molecular assays.
