DNA sequencing is the process of determining the order of nucleotides in a strand of DNA. Next-generation sequencing (NGS) represents the shift to newer technologies that have increased output to several million bases, enabling the assembly of an entire genome from one run. NGS is used for the sequencing of whole genomes, whole exomes, transcriptomes, and targeted gene regions.
Although commercially available NGS platforms use several different chemistries, the underlying workflow is similar for most platforms. NGS library preparation begins with the fragmentation of target DNA (genomic or cDNA). This is followed by ligation of oligonucleotide adapters to the DNA fragments. The ligated fragments are amplified by PCR; ideally, all sequences are represented many times in overlapping PCR products. After sequencing, the reads from the library are aligned and assembled.
Although the efficiency of NGS has increased, consistently obtaining good read depths remains a challenge. Digital PCR is a unique and valuable tool for improving the efficiency of the NGS workflow and for verifying sequencing data.
Digital PCR can increase the accuracy of library quantification, enable accurate balancing of libraries, and provide validation of NGS findings. Bio-Rad's Droplet Digital™ PCR (ddPCR™) System is particularly well suited to being integrated into the NGS workflow to increase sequencing efficiency.
In the ddPCR System, each PCR sample is partitioned into a large number of droplets. PCR amplification occurs simultaneously in each droplet. At the end of the run, each droplet is individually assessed for the presence (positive) or absence (negative) of a fluorescent signal. Because a positive droplet may have contained more than one copy of the target, the droplet counts are analyzed using Poisson statistics: the fraction of positive droplets among all droplets read yields absolute quantification of the initial number of copies of the target sequence.
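The Poisson step can be illustrated with a short calculation. The following is a minimal sketch, not the instrument software's implementation: the function names are hypothetical, and the default droplet volume of 0.85 nL is an assumption that should be replaced with the value specified for the system in use. If a fraction p of droplets is positive, the mean number of target copies per droplet is estimated as -ln(1 - p).

```python
import math


def copies_per_droplet(positive: int, total: int) -> float:
    """Estimate mean target copies per droplet from droplet counts.

    Poisson model: P(droplet is negative) = exp(-lambda), so
    lambda = -ln(1 - positive / total).
    """
    if total <= 0 or positive < 0 or positive >= total:
        # With no negative droplets the estimate is undefined (saturated run).
        raise ValueError("need 0 <= positive < total and total > 0")
    return -math.log(1.0 - positive / total)


def copies_per_microliter(positive: int, total: int,
                          droplet_volume_nl: float = 0.85) -> float:
    """Convert copies per droplet to copies per microliter of reaction.

    droplet_volume_nl is an assumed value; 1 nL = 1e-3 uL.
    """
    lam = copies_per_droplet(positive, total)
    return lam / (droplet_volume_nl * 1e-3)


if __name__ == "__main__":
    # Hypothetical run: 4,000 positive droplets out of 16,000 read.
    lam = copies_per_droplet(4000, 16000)
    print(f"copies per droplet: {lam:.4f}")
    print(f"copies per uL: {copies_per_microliter(4000, 16000):.1f}")
```

The estimate loses precision when nearly all droplets are positive or nearly all are negative, which is one reason libraries are usually diluted into a suitable range before measurement.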
After DNA fragmentation, PCR is used to enrich the fragments that have adapters ligated onto both ends. Bio-Rad's ddPCR technology can be used for this amplification step. A potential advantage of digital PCR is that sequences that are more difficult to amplify, such as GC-rich regions or longer fragments, may have better representation in a library after partitioned amplification in droplets.
In standard PCR, the entire reaction takes place in a single vessel. In the ddPCR System, by contrast, a difficult-to-amplify target faces less competition from high-abundance molecules within its droplet and can be amplified more easily. Difficult sequences are therefore amplified more efficiently by ddPCR, and their proportions in the library more closely approximate their relative representation in the initial sample.
Better representation of sequences that are harder to amplify (and may also be less efficiently sequenced) yields a greater number of parallel reads in these areas, which increases read depth and potentially reduces the number of ambiguous regions in the total sequence.
For an optimal sequencing run, accurate quantification is essential (White et al. 2009). Adding excess library to a sequencing run can result in data that cannot be resolved or, with some NGS technologies, in mixed signals. Insufficient library input can cause rare targets to be missed, a reduced number of reads, reduced ability to call SNPs, poor coverage in some areas, and gaps in the sequence. Furthermore, for clinical samples, adding insufficient library may impair the detection of rare events and potentially result in misdiagnosis. Inaccurate library quantification increases sequencing costs because ambiguous or missing data usually necessitate at least partial resequencing.
One method for determining the amount of library to use has been to perform titration runs, a costly and time-consuming effort. Assessment of libraries by the measurement of total DNA overestimates the quantity of DNA available for sequencing in the sample because DNA without adapters, primer dimers, etc., is included in the measurement.
qPCR using primers specific for the adapter sequence can be used for quantification, as only sequences that have adapters and thus will be sequenced should be included in the measurement. The use of digital PCR further increases accuracy by providing absolute quantification. Kits for specific applications with the QX200™ Droplet Digital PCR System are now available. For example, for NGS sample preparation, the ddPCR Library Quantification Kit for Illumina TruSeq and the ddPCR Library Quantification Kit for Ion Torrent were designed to increase the speed and accuracy of library quantification. By accurately quantifying the DNA library, the NGS end user can also pool equimolar concentrations of different libraries to get a single sample to load on the sequencer. This approach reduces sequencing costs and has proven to be more reliable than other existing methods on the market.
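The arithmetic behind equimolar pooling can be sketched as follows. This is an illustrative example only, not part of any kit protocol: the function names, library names, and target amount are hypothetical, and the sketch assumes the measured concentrations have already been corrected for any dilution made before quantification. It relies on two unit identities: 1 nM equals 1 fmol/µL, and 1 fmol equals about 6.022 × 10^8 molecules.

```python
from typing import Dict

AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_ul_to_nM(copies_per_ul: float) -> float:
    """Convert a copies/uL measurement to nanomolar.

    1 nM = 1 fmol/uL = 1e-15 mol/uL = AVOGADRO * 1e-15 copies/uL.
    """
    return copies_per_ul / (AVOGADRO * 1e-15)


def equimolar_pool_volumes(conc_nM: Dict[str, float],
                           fmol_per_library: float) -> Dict[str, float]:
    """Volume (uL) of each library that contributes the same molar amount.

    Because nM is numerically fmol/uL, volume = fmol / concentration.
    """
    volumes = {}
    for name, conc in conc_nM.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has non-positive concentration")
        volumes[name] = fmol_per_library / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical concentrations (nM) for three libraries.
    libraries = {"lib_A": 2.0, "lib_B": 4.0, "lib_C": 8.0}
    for name, vol in equimolar_pool_volumes(libraries, 10.0).items():
        print(f"{name}: {vol:.2f} uL")
```

The more concentrated a library is, the smaller the volume it contributes, so each library supplies the same number of sequenceable molecules to the pool.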
The type of validation required depends on the quality control requirements and the nature of the features to be verified. While NGS may uncover a genomic anomaly linked to a phenotype (mutation, CNV, rearrangement), validation by a different method is still required, especially to obtain finer quantification data than sequencing provides or to screen larger populations of patients or samples. The use of ddPCR for validation of NGS results has been the subject of multiple publications and has the potential to become the standard approach for genomics and transcriptomics research (Boettger et al. 2012, Miyake et al. 2013, Taylor et al. 2014).
It is likely that digital PCR will become fully integrated into many NGS workflows in the next few years. Library quantification, library generation, and results validation are only three of the areas in which digital PCR presents major advantages over current methodologies.
Boettger LM et al. (2012). Structural haplotypes and recent evolution of the human 17q21.31 region. Nat Genet 44, 881–885. PMID: 22751096
Miyake K et al. (2013). Comparison of genomic and epigenomic expression in monozygotic twins discordant for Rett syndrome. PLoS One 8, e66729. PMID: 23805272
Taylor SD et al. (2014). Targeted enrichment and high-resolution digital profiling of mitochondrial DNA deletions in human brain. Aging Cell 13, 29–38. PMID: 23911137
Tewhey R et al. (2009). Microdroplet-based PCR amplification for large scale targeted sequencing. Nat Biotechnol 27, 1025–1031. [Erratum: Nat Biotechnol (2010) 28, 178.] PMID: 19881494
White RA et al. (2009). Digital PCR provides sensitive and absolute calibration for high throughput sequencing. BMC Genomics 10, 116. [Erratum: BMC Genomics (2009) 10, 541.] PMID: 19298667